Automatic Note Detection
In September of 2012, we announced the launch of two new music transcription applications, PitchScope Navigator and
PitchScope Player, which both use our new Realtime Note Detection Engine.
PitchScope users no longer have to create Project Files and Detection Zones just to see an initial transcription or
visualization of note activity. With PitchScope Navigator, all that is necessary is to load an MP3 file containing an instrumental
solo and press the Play Button. As you hear the music, the Realtime Note Detection Engine is simultaneously detecting the
notes of the solo, animating those notes on the computer's display, and playing the detected notes through the Windows
built-in MIDI synthesizer. The detected notes are also stored to the current Notelist, which can later be saved
to the hard drive or printed out in piano-roll notation.
Since PitchScope users no longer have to take the extra time to create Detection Zones and Project Files, they can
start exploring the musical content of an MP3 song as soon as they load the file. Navigator users can also perform
Realtime Note Detection at a number of Slowed-down Speeds, which slow the music's playback without altering the
pitch of the original instruments.
As soon as a song is loaded into Navigator, its first musical phrase is automatically transcribed and the notes appear in the
DrivingView as colored rectangles.
Since the early days of the computer and its uses in telecommunications, audio research scientists have tried to create
software which could recognize the notes of musical instruments within a digital recording, and then convert those notes to
At first this seemed to be an easy exercise in Digital Signal Processing (DSP), but they soon learned that note detection in
polyphonic recordings (multiple instruments playing at once) would be a very difficult problem to solve -- a task so frustrating
that some audio engineers would describe it as a futile search for the "Holy Grail of digital music processing."
Some of the earliest research into digital music was done at the Massachusetts Institute of Technology (MIT) and at Stanford
University's Center for Computer Research in Music and Acoustics (CCRMA). CCRMA, founded by John Chowning, is a
multidisciplinary facility where composers and researchers work together using computer-based technology both as an
artistic medium and as a research tool. The endeavor of
automatic music transcription by a computer was sometimes labeled as "Machine Listening" by researchers at MIT and
Some of the notable personalities in digital music research in the 1970s and 1980s were Curtis Roads, James Moorer,
and Judy Brown. Curtis Roads co-founded the International Computer Music Association in 1980, and was also an editor for
the Computer Music Journal. The Computer Music Journal was started in 1977, and for a long time served as a chronicle of
international activity in the field of computer music and digital audio. James Moorer is an internationally known figure in digital
audio and computer music. In the mid-seventies he was co-director and co-founder of the Stanford Center for Computer
Research in Music and Acoustics. Judy Brown did early research on musical perception by machines and humans, including
pitch tracking, intonation, and sound textures. She continues to teach at MIT.
The 1980s became one of the most important decades for research in computer music and digital audio. With the spread
of inexpensive and powerful personal computers, research was no longer limited to large academic institutions, but was
now being pursued by an army of small companies, individual developers, and students.
One of the first steps in automatically transcribing music is to transform the digital signal (the data of the MP3 file) from the
time domain into the frequency domain, which shows the specific frequency activity at any given moment in time. Ironically, it
was the work of a mathematician born in the 18th century, Joseph Fourier, that gave modern-day audio engineers a method to
transform a digital audio file into a graph of its frequency content versus time.
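The time-to-frequency step described above can be illustrated with a small sketch. This is not PitchScope's engine; it is a
naive Discrete Fourier Transform written for clarity, applied to a single synthetic 440 Hz tone. The function names, the 8000 Hz
sample rate, and the 800-sample frame are assumptions chosen for illustration.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            # Correlate the signal with a complex sinusoid at bin frequency k.
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(acc))
    return magnitudes

def peak_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin."""
    magnitudes = dft_magnitudes(samples)
    strongest = max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])
    return strongest * sample_rate / len(samples)

# Assumed illustration values: 0.1 s of a pure 440 Hz sine sampled at 8000 Hz.
SAMPLE_RATE = 8000
FRAME_SIZE = 800  # gives a bin spacing of 8000 / 800 = 10 Hz
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(FRAME_SIZE)]

print(peak_frequency(tone, SAMPLE_RATE))
```

A practical system would use a Fast Fourier Transform rather than this quadratic-time loop, and would repeat the analysis
over successive short frames to obtain frequency content versus time.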
Some audio engineers initially thought that once the digital signal's data could be converted into a frequency-domain graph,
the notes of multiple instruments would be easy to deduce. True, it would be easy to determine the pitch of one
instrument's single note -- that is the technology of guitar tuners. But when detecting multiple instruments playing at once
(polyphonic music), the problem of note detection would become dramatically more complex.
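To show why the single-note case is the easy one, here is a minimal sketch of the last step a tuner performs: mapping one
measured frequency to the nearest note of the equal-tempered scale. It assumes the frequency has already been measured
and uses the common A4 = 440 Hz reference; the function name and return shape are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Map a frequency to (note name, MIDI note number, offset in cents)."""
    # MIDI note 69 is A4; each semitone is a factor of 2 ** (1 / 12) in frequency.
    midi_exact = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi_nearest = round(midi_exact)
    cents_off = 100 * (midi_exact - midi_nearest)
    name = NOTE_NAMES[midi_nearest % 12] + str(midi_nearest // 12 - 1)
    return name, midi_nearest, cents_off

# Low E string of a guitar, measured at roughly 82.41 Hz.
name, midi_number, cents = nearest_note(82.41)
print(name, midi_number, round(cents, 2))
```

With one note there is one fundamental to look up. With several instruments sounding together there is no single
frequency to feed into a function like this.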
Getting a graph of frequencies was only the first part of the problem, because a single note does not occur in just one
frequency zone at a time -- a single note is actually a composite object composed of a group of different frequencies called
harmonics. So one can imagine that if each note contributed 12 harmonics, a polyphonic recording consisting of 5 instruments
would create a dense maze of overlapping, comb-like frequency activity. The real question soon became: how could this
dense maze of muddled frequency activity be deconstructed to reveal the individual contributions of each musical instrument?
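The overlap problem can be made concrete with a short sketch. It assumes idealized notes whose harmonics are exact integer
multiples of the fundamental, takes 12 harmonics per note as in the text above, and treats two harmonics as colliding when
they lie within 5 Hz of each other. The tolerance and the function names are assumptions for illustration only.

```python
def harmonics(fundamental_hz, count=12):
    """Return the first `count` harmonics of an idealized note."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def overlapping_harmonics(f1_hz, f2_hz, count=12, tolerance_hz=5.0):
    """List (harmonic index of note 1, harmonic index of note 2) pairs that nearly coincide."""
    pairs = []
    for i, h1 in enumerate(harmonics(f1_hz, count), start=1):
        for j, h2 in enumerate(harmonics(f2_hz, count), start=1):
            if abs(h1 - h2) <= tolerance_hz:
                pairs.append((i, j))
    return pairs

# Two notes a perfect fifth apart: A2 (110 Hz) and E3 (about 164.81 Hz).
A2_HZ = 110.0
E3_HZ = 164.81
for i, j in overlapping_harmonics(A2_HZ, E3_HZ):
    print(f"harmonic {i} of A2 ({A2_HZ * i:.2f} Hz) ~ harmonic {j} of E3 ({E3_HZ * j:.2f} Hz)")
```

Even with only two notes, four pairs of harmonics collide (3 with 2, 6 with 4, 9 with 6, and 12 with 8), so energy near 330 Hz
cannot be credited to either note from the spectrum alone.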
In modern practice, analyzing digital music accurately draws on several fields: electrical engineering (spectrum
analysis, filtering, and audio transforms), psychoacoustics (sound perception), cognitive sciences (neuroscience and artificial
intelligence), acoustics (physics of sound production), and music (harmony, rhythm, and timbre).
Copyright © 2015 Creative Detectors. All rights reserved.
Automatic Music Transcription Just Got a Lot Easier.
A Brief History of Automatic Music Transcription.