particular sound set. Figure 1 demonstrates how to generate a feature list composed of MFCCs, spectral centroid,
and spectral brightness. Subsets of mel-frequency cepstral
coefficients (MFCCs) are frequently used for economically
representing spectral envelope, while spectral centroid and
brightness provide information about the distribution of spectral energy in a signal. Each time the button in the upper
right region of the patch is clicked, a multi-feature analysis
snapshot composed of these features will be produced.
Figure 1. Generating a mixed feature list.
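Outside Pd, the idea of a mixed feature snapshot can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the objects: the function names are hypothetical, brightness is taken to be the share of spectral magnitude at or above a boundary frequency (1200 Hz is an assumed default), and MFCCs are left out to keep the sketch short.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Hann-window a frame and return DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    windowed = [
        x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    return [
        abs(sum(w * cmath.exp(-2j * math.pi * k * t / n)
                for t, w in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, n):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mags)
    if total == 0.0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_brightness(mags, sample_rate, n, boundary_hz=1200.0):
    """Share of spectral magnitude at or above boundary_hz (assumed default)."""
    total = sum(mags)
    if total == 0.0:
        return 0.0
    high = sum(m for k, m in enumerate(mags)
               if k * sample_rate / n >= boundary_hz)
    return high / total


def feature_snapshot(frame, sample_rate):
    """Concatenate several features of one frame into a single flat list."""
    n = len(frame)
    mags = magnitude_spectrum(frame)
    return [
        spectral_centroid(mags, sample_rate, n),
        spectral_brightness(mags, sample_rate, n),
    ]


# Usage: a 1 kHz sine sampled at 8 kHz, analysed as one 64-sample frame.
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * t / sample_rate) for t in range(64)]
snapshot = feature_snapshot(frame, sample_rate)
```

Calling `feature_snapshot` plays the role of clicking the button in the patch: nothing is computed until a snapshot is requested.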
Capturing the temporal evolution of audio features requires some additional logic. In Figure 2, a single feature list
is generated based on 5 successive analysis frames, spaced
50 milliseconds apart. The attack of a sound is reported
by bonk~, turning on a metro that fires once every 50
ms before turning off after almost a quarter second. Via
list prepend, the initial moments of the sound's temporally-evolving MFCCs are accumulated to form a single list. By
the time the fifth mel-frequency cepstrum measurement is
added, the complete feature list is allowed to pass through a
spigot for routing to timbreID, the classification object described below in section 3. Recording changes in MFCCs
(or any combination of features) over time provides detailed
information for the comparison of complex sounds.
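The accumulation logic can be approximated offline as follows. This is a minimal sketch, assuming the onset position is already known (the role bonk~ plays in the patch) and that the first frame is taken at the onset itself; `time_evolving_features` and `rms` are hypothetical names, and RMS stands in for the MFCC measurement to keep the example short.

```python
import math


def time_evolving_features(signal, onset, sample_rate, feature_fn,
                           frames=5, spacing_ms=50.0, window=1024):
    """Concatenate feature_fn outputs from `frames` windows after an onset.

    Windows start at onset, onset + 50 ms, onset + 100 ms, and so on.
    """
    hop = int(round(sample_rate * spacing_ms / 1000.0))
    last_end = onset + (frames - 1) * hop + window
    if onset < 0 or last_end > len(signal):
        raise ValueError("signal too short for the requested frames")
    features = []
    for i in range(frames):
        start = onset + i * hop
        # Each frame's features are appended in chronological order.
        features.extend(feature_fn(signal[start:start + window]))
    return features


def rms(frame):
    """Stand-in feature: root-mean-square amplitude as a one-element list."""
    return [math.sqrt(sum(x * x for x in frame) / len(frame))]


# Usage: 100 ms of silence followed by an exponentially decaying burst.
sample_rate = 1000
signal = [0.0] * 100 + [0.99 ** i for i in range(300)]
feature_list = time_evolving_features(signal, 100, sample_rate, rms, window=32)
```

The returned list has one entry per frame here; with a multi-valued feature such as MFCCs it would hold five consecutive coefficient sets in a single flat list.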
These patches illustrate some key differences from the
Pd implementation of libXtract, a well-developed multi-platform feature extraction library described in . Extracting
features in Pd using the libXtract~ wrapper requires subpatch blocking, Hann windowing, and an understanding of
libXtract's order of operations. For instance, to generate
MFCCs, it is necessary to generate magnitude spectrum with
a separate object, then chain its output to a separate MFCC
object. The advantage of libXtract's cascading architecture
is that the spectrum calculation occurs only once, yet two or
more features can be generated from the results.
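The trade-off between a cascading design and self-contained feature objects can be illustrated with a small counter. The sketch below uses neither library's API; `CountingSpectrum`, `centroid_bin`, and `peak_bin` are hypothetical helpers, and the unwindowed DFT is a simplification.

```python
import cmath
import math


class CountingSpectrum:
    """Magnitude-spectrum helper that counts how often it is computed."""

    def __init__(self):
        self.calls = 0

    def __call__(self, frame):
        self.calls += 1
        n = len(frame)
        return [
            abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)
        ]


def centroid_bin(mags):
    """Magnitude-weighted mean bin index."""
    total = sum(mags)
    return sum(k * m for k, m in enumerate(mags)) / total if total else 0.0


def peak_bin(mags):
    """Index of the strongest bin."""
    return max(range(len(mags)), key=mags.__getitem__)


def cascaded(frame, spectrum):
    """Cascading style: one spectrum feeds every feature."""
    mags = spectrum(frame)
    return centroid_bin(mags), peak_bin(mags)


def independent(frame, spectrum):
    """Self-contained style: each feature computes its own spectrum."""
    return centroid_bin(spectrum(frame)), peak_bin(spectrum(frame))


# Usage: a cosine centred on bin 4 of a 32-sample frame.
frame = [math.cos(2.0 * math.pi * 4 * t / 32) for t in range(32)]
shared, separate = CountingSpectrum(), CountingSpectrum()
cascaded_result = cascaded(frame, shared)
independent_result = independent(frame, separate)
```

Both styles return the same features; only the number of spectrum computations differs, which is the cost that self-contained objects accept in exchange for simpler patching.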
While timbreID objects are wasteful in this sense (each
object redundantly calculates its own spectrum), they are
more efficient with respect to downtime. As mentioned above,
features are not generated constantly, only when needed.
Further, from a user's perspective, timbreID objects require
less knowledge about analysis techniques, and strip away
layers of patching associated with blocking and windowing.
Figure 2. Generating a time-evolving feature list.
In order to have maximum control over algorithm details, all feature extraction and classification functions were
written by the author, and timbreID has no non-standard library dependencies.
3. THE CLASSIFICATION OBJECT
Features generated with the objects described in section 2
can be used directly as control information in real-time performance. In order to extend functionality, however, a multipurpose classification external is provided as well. This object, timbreID, functions as a storage and routing mechanism that can cluster and order the features it stores in memory, and classify new features relative to its database. Apart
from the examples package described in the following section, an in-depth help patch accompanies timbreID, demonstrating how to provide it with training features and classify
new sounds based on training. Figure 3 depicts the most
basic network required for this task.
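The train-then-classify workflow can be pictured with a small stand-in for the object. This is a sketch under assumptions rather than a description of the external's internals: the class `FeatureDatabase` and its methods are hypothetical, and nearest-neighbour matching by Euclidean distance is assumed because this section does not specify how new features are compared with the database.

```python
import math


class FeatureDatabase:
    """Hypothetical stand-in: stores training features, matches new ones."""

    def __init__(self):
        self.instances = []

    def _check(self, vec):
        if self.instances and len(vec) != len(self.instances[0]):
            raise ValueError("feature list length does not match database")

    def train(self, features):
        """Store one training feature list and return its instance index."""
        vec = list(features)
        self._check(vec)
        self.instances.append(vec)
        return len(self.instances) - 1

    def classify(self, features):
        """Return (index, distance) of the nearest stored instance.

        Euclidean distance is an assumption made for this sketch.
        """
        if not self.instances:
            raise ValueError("database is empty; train it first")
        vec = list(features)
        self._check(vec)
        distances = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(vec, inst)))
            for inst in self.instances
        ]
        best = min(range(len(distances)), key=distances.__getitem__)
        return best, distances[best]


# Usage with made-up [centroid in Hz, brightness] values for two sounds.
db = FeatureDatabase()
low_drum = db.train([120.0, 0.05])
cymbal = db.train([6000.0, 0.90])
match, distance = db.classify([5500.0, 0.80])
```

Here `train` corresponds to sending a feature list to the first inlet and `classify` to the second. Note that features on very different scales, as in these made-up values, let the larger one dominate an unnormalized distance.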
Training features go to the first inlet, and features intended for classification go to the second inlet. Suppose the
patch in Figure 3 is to be used for percussive instrument
classification. In order to train the system, each instrument