…sible within Pd using GEM for two- and three-dimensional plotting. In the provided example, the axes of the space can be assigned to any of a number of spectral features, including zero crossing rate, amplitude, frequency, or any of 47 Bark-frequency cepstral coefficients. By editing the analysis sub-patch, additional features can be included. Figure 6 shows speech grains plotted in a space where the values of the second and third BFCCs are mapped to the x and y axes respectively. RGB color can also be mapped to any available feature. Mousing over a point in the space plays back its corresponding grain, enabling exploration aimed at identifying regions of timbral similarity. The upper left region of Figure 6 contains a grouping of "sh" sounds, while the lower central region contains a cluster of "k" and "ch" grains. Other phonemes can be located as well. In order to explore dense regions of the plot, keyboard navigation can be enabled to zoom with respect to either axis (or both simultaneously) and to move up, down, left, or right in the space.

Figure 7. 2400 string grains mapped with respect to amplitude and fundamental frequency.

Figure 7 shows a plot of string sample grains mapped according to RMS amplitude and fundamental frequency. Because the frequencies in this particular sound file fall into discrete pitch classes, its grains are visibly stratified along the vertical dimension. Mapping is achieved by recovering features from timbreID's database with the "featurelist" message, which is sent with a database index indicating which instance to report. The feature list for the specified instance is then sent out of timbreID's fifth outlet and used to determine the instance's position in feature space.

5. CONCLUSION

This paper has introduced some important features of the timbreID analysis/classification toolkit for Pd, and demonstrated its adaptability to four unique tasks.
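The mapping step described above — retrieving a stored feature list and projecting two chosen features onto plot axes — can be sketched outside Pd as well. The following is a minimal Python sketch, not timbreID's actual implementation: it assumes the database is a list of per-grain feature vectors and normalizes the two selected features into the unit square, analogous to what the "featurelist" message enables in the plotting patch.

```python
# Hypothetical sketch of feature-space plotting: map two chosen features
# of each stored grain to normalized x/y coordinates in [0, 1].
def plot_coords(database, x_feature, y_feature):
    """`database` is a list of feature vectors (one per grain);
    `x_feature`/`y_feature` are indices into those vectors."""
    xs = [grain[x_feature] for grain in database]
    ys = [grain[y_feature] for grain in database]

    def normalize(values):
        lo, hi = min(values), max(values)
        span = hi - lo or 1.0  # guard against constant-valued features
        return [(v - lo) / span for v in values]

    return list(zip(normalize(xs), normalize(ys)))

# Three toy 4-dimensional feature vectors; plot features 1 and 2
# (e.g., the second and third BFCCs, as in Figure 6).
db = [[0.1, 2.0, 5.0, 0.3], [0.2, 4.0, 7.0, 0.1], [0.3, 3.0, 6.0, 0.2]]
print(plot_coords(db, 1, 2))  # → [(0.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
```

Normalizing per feature keeps the plot usable regardless of each feature's native range (amplitude, Hz, or cepstral magnitude).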
Pd external source code, binaries, and the example patches described above are all available for download at the author's website: www.williambrent.com. The remaining patches in the example package, a cepstrogram plotting interface and a percussion classification system that identifies instruments immediately upon attack, were not described here. The example patches are simple in some respects, and are intended as starting points that can be expanded upon by the user. Future development will focus on adding new features to the set of feature extraction objects, implementing a kD-tree for fast searching of large databases in order to make concatenative synthesis more efficient, and developing strategies for processing multiple-frame features of different lengths in order to compare sounds of various durations.
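To illustrate the kD-tree idea raised as future work: nearest-neighbor matching is the core operation of concatenative synthesis, and a kD-tree avoids scanning the entire database for every query. The sketch below is a generic textbook kD-tree in Python, not timbreID code; the 2-D "grains" and query point are invented for the example.

```python
# Generic kD-tree build and nearest-neighbor search over feature vectors.
import math

def build(points, depth=0):
    """Recursively partition points, cycling through axes by depth."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "left": build(points[:mid], depth + 1),
        "right": build(points[mid + 1:], depth + 1),
        "axis": axis,
    }

def nearest(node, target, best=None):
    """Return the stored point closest to `target` (Euclidean distance)."""
    if node is None:
        return best
    point, axis = node["point"], node["axis"]
    if best is None or math.dist(target, point) < math.dist(target, best):
        best = point
    diff = target[axis] - point[axis]
    close, away = ((node["left"], node["right"]) if diff <= 0
                   else (node["right"], node["left"]))
    best = nearest(close, target, best)
    # Descend the far side only if the splitting plane could hide a closer point.
    if abs(diff) < math.dist(target, best):
        best = nearest(away, target, best)
    return best

# Toy 2-D feature vectors standing in for analyzed grains.
grains = [[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [0.9, 0.4]]
tree = build(grains)
print(nearest(tree, [0.15, 0.12]))  # → [0.2, 0.1]
```

For a balanced tree, each query inspects roughly O(log n) nodes rather than all n database entries, which is the efficiency gain motivating the planned work.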