Real-time audio analysis tools for Pd and MSP Miller S. Puckette, UCSD ([email protected]) Theodore Apel, CRCA, UCSD ([email protected]) David D. Zicarelli, Cycling74 (www.cycling74.com) Abstract Two "objects," which run under Max/MSP or Pd, do different kinds of real-time analysis of musical sounds. Fiddle is a monophonic or polyphonic maximum-likelihood pitch detector similar to Rabiner's, which can also be used to obtain a raw list of a signal's sinusoidal components. Bonk does a bounded-Q analysis of an incoming sound to detect onsets of percussion instruments in a way which outperforms the standard envelope following technique. The outputs of both objects appear as Max-style control messages. 1 Tools for real-time audio analysis The new real-time patchable software synthesizers have finally brought audio signal processing out of the ivory tower and into the homes of working computer musicians. Now audio can be placed at the center of real-time computer music production, and MIDI, which for a decade was the backbone of the electronic music studio, can be relegated to its appropriate role as a low-bandwidth I/O solution for keyboards and other input devices. Many other sources of control "input" can be imagined than are provided by MIDI devices. This paper, for example, explores two possibilities for deriving a control stream from an incoming audio stream. First, the sound might contain quasi-sinusoidal "partials" and we might wish to know their frequencies and amplitudes. In the case that the audio stream comes from a monophonic or polyphonic pitched instrument, we would like to be able to determine the pitch(es) and loudness(es) of the components. It's clear that we'll never have a perfect pitch detector, but the fiddle object described here does fairly well in some cases. For the many sounds which don't lend themselves to sinusoidal decomposition, we can still get useful information from the overall spectral envelope. For instance, rapid changes in the spectral envelope turn out to be a much more reliable indicator of percussive attacks than are changes in the overall power reported by a classical envelope follower. The bonk object does a bounded-Q filterbank of an incoming sound and can either output the raw analysis or detect onsets which can then be compared to a collection of known spectral templates in order to guess which of several possible kinds of attack has occurred. The fiddle and bonk objects are low tech; the algorithms would be easy to re-code in another language or for other environments from the ones considered here. Our main concern is to get predictable and acceptable behavior using easy-to-understand techniques which won't place an unacceptable computational load on a late-model computer. Some effort was taken to make fiddle and bonk available on a variety of platforms. They run under Max/MSP (Macintosh), Pd (Wintel, SGI, Linux) and fiddle also runs under FTS (available on several platforms.) Both are distributed with source code; see http://manl04nfs.ucsd. edu/~mpuckett/ for details. 2 Analysis of discrete spectra Two problems are of interest here: getting the frequencies and amplitudes of the constituent partials of a sound, and then guessing the pitch. Our program follows the ideas of [Noll 69] and [Rabiner 78]. Whereas the earlier pitch" object reported in [Puckette 95] departs substantially from the earlier approaches, the algorithm used here adhere more closely to them. First we wish to get a list of peaks with their frequencies and amplitudes. The incoming signal is
Top of page Top of page