Real-time audio analysis tools for Pd and MSP
Miller S. Puckette, UCSD ([email protected])
Theodore Apel, CRCA, UCSD ([email protected])
David D. Zicarelli, Cycling74 (www.cycling74.com)
Abstract
Two "objects," which run under Max/MSP or Pd, do different kinds of real-time analysis of
musical sounds. Fiddle is a monophonic or polyphonic maximum-likelihood pitch detector similar
to Rabiner's, which can also be used to obtain a raw list of a signal's sinusoidal components. Bonk
does a bounded-Q analysis of an incoming sound to detect onsets of percussion instruments in a
way which outperforms the standard envelope following technique. The outputs of both objects
appear as Max-style control messages.
1 Tools for real-time audio analysis
The new real-time patchable software synthesizers
have finally brought audio signal processing out of the
ivory tower and into the homes of working computer
musicians. Now audio can be placed at the center
of real-time computer music production, and MIDI,
which for a decade was the backbone of the electronic
music studio, can be relegated to its appropriate role
as a low-bandwidth I/O solution for keyboards and
other input devices. Many other sources of control
"input" can be imagined than are provided by MIDI
devices. This paper, for example, explores two possibilities for deriving a control stream from an incoming
audio stream.
First, the sound might contain quasi-sinusoidal
"partials" and we might wish to know their frequencies and amplitudes. In the case that the audio stream comes from a monophonic or polyphonic
pitched instrument, we would like to be able to determine the pitch(es) and loudness(es) of the components. It's clear that we'll never have a perfect pitch
detector, but the fiddle object described here does
fairly well in some cases.
For the many sounds which don't lend themselves
to sinusoidal decomposition, we can still get useful
information from the overall spectral envelope. For
instance, rapid changes in the spectral envelope turn
out to be a much more reliable indicator of percussive
attacks than are changes in the overall power reported
by a classical envelope follower. The bonk object applies
a bounded-Q filterbank to an incoming sound and
can either output the raw analysis or detect onsets
which can then be compared to a collection of known
spectral templates in order to guess which of several
possible kinds of attack has occurred.
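To make the idea concrete, here is a minimal Python sketch of the two steps just described: measuring how sharply the per-band power grows from one analysis frame to the next, and matching a detected attack against stored templates. It is an illustration only and is not taken from the bonk source. The function names, the relative-growth measure, the `floor` constant, and the Euclidean template distance are all assumptions made for the example.

```python
import math

def band_growth(prev_bands, cur_bands, floor=1e-6):
    """Sum of positive relative increases in per-band power between two frames.

    Decreases are ignored, so a decaying sound contributes nothing.
    `floor` (an assumed constant) avoids division by zero in silent bands.
    """
    total = 0.0
    for p, c in zip(prev_bands, cur_bands):
        total += max(0.0, (c - p) / (p + floor))
    return total

def closest_template(bands, templates):
    """Return the name of the stored template nearest to `bands`.

    Both vectors are scaled to unit length first, so the match depends
    on spectral shape rather than on loudness.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    target = normalize(bands)
    best_name, best_dist = None, float("inf")
    for name, tmpl in templates.items():
        dist = sum((a - b) ** 2 for a, b in zip(target, normalize(tmpl)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name
```

For example, going from band powers `[1.0, 0.01]` to `[0.9, 0.5]` raises the total power only modestly, yet `band_growth` returns a large value because the second band has grown many times over. This is the sense in which a per-band measure can react to an attack that an overall power measure barely registers.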
The fiddle and bonk objects are low tech; the
algorithms would be easy to re-code in another language or for environments other than the ones considered here. Our main concern is to get predictable and
acceptable behavior using easy-to-understand techniques which won't place an unacceptable computational load on a late-model computer.
Some effort was taken to make fiddle and bonk
available on a variety of platforms. They run under Max/MSP (Macintosh) and Pd (Wintel, SGI, Linux),
and fiddle also runs under FTS (available on several platforms). Both are distributed with source
code; see http://manl04nfs.ucsd.edu/~mpuckett/
for details.
2 Analysis of discrete spectra
Two problems are of interest here: getting the frequencies and amplitudes of the constituent partials
of a sound, and then guessing the pitch. Our program follows the ideas of [Noll 69] and [Rabiner 78].
Whereas the earlier "pitch" object reported in [Puckette 95] departs substantially from these earlier approaches, the algorithm used here adheres more closely
to them.
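As a rough illustration of the second problem, guessing a pitch from a list of peaks, the following Python sketch scores each candidate fundamental by how much peak amplitude lies near its harmonics. It is not fiddle's likelihood function. The function name, the `tolerance` parameter, and the choice to divide each contribution by the harmonic number (so that subharmonics of the true pitch score lower) are assumptions made for the example.

```python
def guess_fundamental(peaks, candidates, tolerance=0.03):
    """Pick the candidate fundamental that best explains a list of peaks.

    peaks      -- list of (frequency_hz, amplitude) pairs
    candidates -- iterable of candidate fundamentals in Hz
    tolerance  -- allowed relative mistuning of a peak from a harmonic
                  (an assumed value, not taken from the paper)
    Returns the best candidate, or None if no peak matches any candidate.
    """
    best_f0, best_score = None, 0.0
    for f0 in candidates:
        score = 0.0
        for freq, amp in peaks:
            harmonic = round(freq / f0)
            if harmonic < 1:
                continue  # peak lies below the candidate fundamental
            if abs(freq - harmonic * f0) <= tolerance * freq:
                # Lower harmonics count for more (assumed weighting).
                score += amp / harmonic
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0
```

With peaks at 220, 440 and 660 Hz and candidates of 110, 220 and 440 Hz, the sketch selects 220 Hz: the 110 Hz candidate also matches every peak, but only at higher harmonic numbers, and the 440 Hz candidate matches only one peak.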
First we wish to get a list of peaks with their
frequencies and amplitudes. The incoming signal is