Since the early days of the computer and its uses in telecommunications, audio research scientists have tried to create software which
could recognize the notes of musical instruments within a digital recording, and then convert those notes to musical notation.
At first this seemed to be an easy exercise in Digital Signal Processing (DSP), but they soon learned that note detection in polyphonic
recordings (multiple instruments playing at once) is a very difficult problem to solve -- a task so frustrating that some audio
engineers described it as a futile search for the "Holy Grail of digital music processing."
Some of the earliest research on digital music was done at the Massachusetts Institute of Technology (MIT) and at Stanford University's
Center for Computer Research in Music and Acoustics (CCRMA). CCRMA, founded by John Chowning, is a multi-discipline facility where
composers and researchers work together using computer-based technology both as an artistic medium and as a research tool. The
endeavor of automatic music transcription by a computer was sometimes labeled "Machine Listening" by researchers at MIT and
Stanford University.
Some of the notable personalities in digital music research in the 1970s and 1980s were Curtis Roads, James Moorer, and Judy
Brown. Curtis Roads co-founded the International Computer Music Association in 1980, and was also an editor for the Computer Music
Journal. The Computer Music Journal was started in 1977, and for a long time served as a chronicle of international activity in the field of
computer music and digital audio. James Moorer is an internationally known figure in digital audio and computer music. In the
mid-seventies he was Co-Director and Co-Founder of the Stanford Center for Computer Research in Music and Acoustics. Judy Brown
did early research on musical perception by machines and humans, including pitch tracking, intonation, and sound textures. She
continues to teach at MIT.
The 1980s became one of the most important decades for research in computer music and digital audio. With the spread of
inexpensive and powerful personal computers, research was no longer limited to only large academic institutions, but was now being
pursued by an army of small companies, individual developers, and students.
One of the first steps in automatically transcribing music is to transform the digital signal from the time domain (the audio samples
decoded from the MP3 file) into the frequency domain, which shows the specific frequency activity at any given moment in time.
Ironically, it was the early 19th-century work of the mathematician Joseph Fourier that gave modern-day audio engineers a method to
transform a digital audio file into a graph of its frequency content versus time.
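The transform described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular product: it assumes the recording has already been decoded to a plain list of samples, it uses a naive discrete Fourier transform where real software would use a fast Fourier transform, and the names `dft_magnitudes`, `spectrogram`, `SAMPLE_RATE`, and `FRAME_SIZE` are made up for the example.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Naive O(n^2) discrete Fourier transform of one frame.

    Returns the magnitude of each frequency bin from 0 up to n/2.
    """
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags

def spectrogram(samples, frame_size, hop):
    """Cut the signal into frames and transform each one.

    The result is a list of spectra, one per time step: frequency
    content versus time.
    """
    return [dft_magnitudes(samples[start:start + frame_size])
            for start in range(0, len(samples) - frame_size + 1, hop)]

# Hypothetical test signal: 400 samples of a 440 Hz sine followed by
# 400 samples of an 880 Hz sine, at an assumed 8000 Hz sample rate.
SAMPLE_RATE = 8000
FRAME_SIZE = 200  # each bin is SAMPLE_RATE / FRAME_SIZE = 40 Hz wide
tone = ([math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(400)]
        + [math.sin(2 * math.pi * 880 * t / SAMPLE_RATE) for t in range(400)])

spec = spectrogram(tone, FRAME_SIZE, FRAME_SIZE)

# Frequency of the strongest bin in each frame.
peak_hz = [max(range(len(f)), key=f.__getitem__) * SAMPLE_RATE / FRAME_SIZE
           for f in spec]
print(peak_hz)
```

Each entry of `peak_hz` is the dominant frequency during one frame, so the list traces how the pitch of this artificial signal changes over time.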
Some audio engineers initially thought that once the digital signal's data could be converted into a frequency-domain graph, the
notes of multiple instruments would be easy to deduce. True, it is easy to determine the pitch of a single note from a single
instrument -- that is the technology behind guitar tuners. But when detecting multiple instruments playing at once (polyphonic music), the
problem of note detection becomes dramatically more complex.
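To show why the single-note case is tractable, here is a rough sketch of one common approach, autocorrelation: slide the signal against itself and look for the shift at which it repeats. It is an illustration under stated assumptions, not a description of how any specific tuner works; the function name `estimate_pitch`, its parameters, and the 0.9 threshold are all choices made for this example.

```python
import math

def estimate_pitch(samples, sample_rate, fmin=50.0, fmax=1000.0):
    """Estimate the pitch of a single note by autocorrelation.

    Returns a frequency in Hz, or None if no clear period is found.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    window = len(samples) - max_lag
    if window <= 0:
        raise ValueError("buffer too short for the requested fmin")

    # Correlate a fixed-length window with shifted copies of the signal.
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        scores[lag] = sum(samples[i] * samples[i + lag]
                          for i in range(window))

    strongest = max(scores.values())
    if strongest <= 0:
        return None  # silence, or nothing periodic in range

    # The signal also repeats at 2x, 3x, ... the true period, so take the
    # shortest lag that is a local peak and nearly as strong as the best.
    for lag in range(min_lag + 1, max_lag):
        is_peak = (scores[lag] >= scores[lag - 1]
                   and scores[lag] >= scores[lag + 1])
        if is_peak and scores[lag] >= 0.9 * strongest:
            return sample_rate / lag
    return None

# Hypothetical note: a 200 Hz fundamental plus two weaker harmonics,
# sampled at an assumed 8000 Hz.
SAMPLE_RATE = 8000
note = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE)
        + 0.50 * math.sin(2 * math.pi * 400 * t / SAMPLE_RATE)
        + 0.25 * math.sin(2 * math.pi * 600 * t / SAMPLE_RATE)
        for t in range(480)]

pitch = estimate_pitch(note, SAMPLE_RATE)
print(pitch)
```

With only one note sounding, the waveform has a single repetition period and the search is straightforward. With several notes sounding at once there is no single period to find, which is where the difficulty described above begins.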
Getting a graph of frequencies was only the first part of the problem, because a single note does not occupy just one frequency at
a time -- a single note is actually a composite of a group of different frequencies called harmonics. So one can imagine
that if each note contributed 12 harmonics, a polyphonic recording consisting of 5 instruments would create a dense maze of
overlapping, comb-like frequency activity. The real question soon became: how could this dense maze of muddled frequency activity be
deconstructed to reveal the individual contributions of each musical instrument?
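The overlap can be made concrete with a small, idealized calculation. The sketch below assumes five hypothetical notes whose fundamentals are round numbers of hertz (a justly tuned chord built on 110 Hz) and treats every harmonic as an exact integer multiple of its fundamental; real instruments deviate from both assumptions. The names `chord` and `partial_owners` are invented for the example.

```python
from collections import defaultdict

HARMONICS_PER_NOTE = 12

# Hypothetical chord, one note per instrument, fundamentals in Hz.
chord = {"A2": 110, "A3": 220, "E4": 330, "A4": 440, "C#5": 550}

def partial_owners(notes, harmonics):
    """Map each harmonic frequency to the notes that produce it."""
    owners = defaultdict(list)
    for name, fundamental in notes.items():
        for k in range(1, harmonics + 1):
            owners[fundamental * k].append(name)
    return owners

owners = partial_owners(chord, HARMONICS_PER_NOTE)

# Every harmonic produced, counting duplicates.
total_partials = sum(len(names) for names in owners.values())

# Frequencies that more than one note contributes to.
shared = {hz: names for hz, names in owners.items() if len(names) > 1}

print(total_partials, "harmonics land on", len(owners), "distinct frequencies")
print("660 Hz is produced by:", owners[660])
```

In this idealized chord a single peak in the spectrum, such as the one at 660 Hz, can belong to several notes at once, so the peak alone does not say which instrument produced it.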
In practice, analyzing digital music accurately involves several fields: electrical engineering (spectrum analysis,
filtering, and audio transforms), psychoacoustics (sound perception), cognitive sciences (neuroscience and artificial intelligence),
acoustics (physics of sound production), and music (harmony, rhythm, and timbre).
Automatic Music Transcription Just Got a Lot Easier.
Copyright © 2017 Creative Detectors. All rights reserved.
In September of 2012, we announced the launch of two new music transcription applications, PitchScope Navigator and PitchScope
Player, which both use our new Realtime Note Detection Engine.
No longer do PitchScope users have to first create Project Files and Detection Zones just to see an initial transcription or visualization of
note activity. With PitchScope Navigator, all that is necessary is to load an MP3 file containing an instrumental solo and press the Play
Button. As you hear the music, the Realtime Note Detection Engine is simultaneously detecting the notes of the solo, animating those
notes on the computer's display, and playing those detected notes through Windows' built-in MIDI synthesizer. The detected notes are also
simultaneously stored to the current Notelist, which can later be saved to a hard drive or printed out in piano-roll notation.
Since PitchScope users no longer have to take the extra time to create Detection Zones and Project Files, they can immediately start
exploring the musical content of an MP3 song as soon as they load the file. Navigator users can also perform Realtime Note Detection at a
number of Slowed-down Speeds, which slow the music's play speed without altering the pitch of the original instruments.
As soon as a song is loaded to Navigator, its first musical phrase is automatically transcribed and its notes appear on its DrivingView as
A Brief History of Automatic Music Transcription