Simple resonators with shape control*

Davide Rocchesso
Dipartimento di Informatica, University of Verona
Strada Le Grazie 15, 37134 Verona, Italy
email:

Abstract

This paper surveys recent research efforts aimed at defining physics-based models that are simple and robust in both computational and perceptual terms. We are interested in models that preserve the physical features, such as shape or material, that are most relevant to auditory perception. Attention is paid to models of resonators, the main focus of this paper, and to excitation mechanisms.

1 Introduction

A few psychoacoustic investigations have shown that we are sensitive to the size and shape of objects when they are used as resonators. For example, Lakatos et al. (1997) and Kunkler-Peck and Turvey (2000) conducted experiments showing that we can distinguish between rectangular struck bars or plates of different heights and widths and that, more interestingly, a listener with no particular training can give a reliable estimate of these lengths. These studies indicate that we can form a geometric mental representation of physical objects using only their sonic fingerprint (i.e., their impulse response), as it is impressed onto the sounds produced by interaction with human beings or other objects.

The material characteristics of sounding objects can also be perceived and modeled. Wildes and Richards (1988) claimed that the perception of material relies on the damping characteristics of the vibrating object. Klatzky et al. (2000) recently confirmed this conjecture, using additive synthesis of damped sinusoids to compose the stimuli for subjective tests. In a modeling context, Djoharian (2000) worked on viscoelastic models to be used in the context of modal synthesis in order to "dress" a resonator model with a specific material flavor.
*This research has been supported by the European Commission under contract IST-2000-25287 ("SOb - the Sounding Object").

We have contributed to prior modeling studies of 3-D acoustic resonators (cavities) where shape control becomes an important ingredient for sound design (Rocchesso and Dutilleux 2001). For instance, a pseudo-physical model of an acoustic cavity is available, where the following characteristics can be varied parametrically: absorption and diffusion properties of the enclosure, size, and shape. In particular, it is possible to perform a sort of geometric morphing between a cube and a sphere just by changing the coefficients of a set of filters. This model is based on a feedback delay network where each delay line corresponds to a set of normal modes of the cavity.

Introducing shape control to a 3-D resonator model can be quite expensive unless we introduce drastic approximations at the design stage. Namely, a spherical resonator can be obtained from the reference case of a cubic resonator, simulated by a feedback delay network, by cascading allpass filters with the delay lines in such a way that the distribution of modes becomes inharmonic at low frequencies, with a spacing between modes that is directly related to the extremal points of Bessel functions. How accurate the design of these allpass filters has to be is an open issue that should be investigated by subjective testing. Empirically, we have found that we can oversimplify the resonator models without losing the sense of shape. This can be done by reducing the number of modal series considered and by focusing on the low-frequency part of the inharmonicity introduced by the roundness of the enclosure. The resulting models are a sort of caricature of the actual resonators, i.e., they trade realism and accuracy for magnification of spatial attributes.
Similarly, we have shown (Avanzini and Rocchesso 2001) that simplified resonators, when properly excited, retain naturalness and information about the material.

There are many applications that can benefit from these "cartoon" resonator models, especially in the field of auditory display. In fact, wherever data can be visualized by simple shapes of varying size, there is fertile ground for direct sonification by means of geometry-based resonator models. In these applications, visualization can be augmented by sonification and the display can keep its coherence across different modalities.

2 Simple resonators

The simplest resonator is the second-order damped oscillator. Despite its simplicity, it is a fundamental component in

many real-world mechanical, fluid-dynamic, or electric phenomena. Indeed, as we show in (Avanzini and Rocchesso 2001), when a mechanical second-order oscillator (mass-spring-damper system) is properly excited it can provide reliable information about the material it is made of. However, no shape information can be conveyed by a single-resonance system, as shapes are differentiated by different distributions of spectral peaks. Over the last year, we have been conducting studies aimed at improving our understanding of acoustic shape perception and at finding models of minimal complexity that retain shape information.

2.1 The perception of shape

We started from the closed-form distributions of resonances in cubic and spherical cavities. A rectangular resonator has a frequency response that is the superposition of harmonic combs, each having a fundamental frequency

$$f_{0,lmn} = \frac{c}{2} \sqrt{(l/X)^2 + (m/Y)^2 + (n/Z)^2}, \qquad (1)$$

where $c$ is the speed of sound, $l, m, n$ is a triple of positive integers with no common divisor, and $X, Y, Z$ are the edge lengths of the box (Morse and Ingard 1968). A spherical resonator has a frequency response that is the superposition of inharmonic combs, each having peaks at the extremal points of spherical Bessel functions. Namely, if $z_{ns}$ denotes the $s$th root of the derivative of the $n$th spherical Bessel function, the resonance frequencies are found at

$$f_{ns} = \frac{c}{2 \pi a} z_{ns}, \qquad (2)$$

where $a$ is the radius of the sphere (Moldover et al. 1986).

To test the hypothesis that shape information is carried by the distribution of resonances, we synthesised impulse responses by additive summation of modal contributions¹. We damped the modes according to prescribed absorbing characteristics of the enclosure, introduced some randomness in the mode amplitudes to resemble excitation and pickup at different points, and stopped the quadratic frequency-dependent increase in modal density at the point where the single modes are no longer discriminable.
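As a rough illustration of equations (1) and (2), the following Python sketch computes a box comb fundamental and a few spherical resonance frequencies. It is not the Matlab code used for the experiments. The function names, the speed of sound of 343 m/s, and the root-indexing convention (only strictly positive roots are counted, so `s = 1` is the first positive extremum) are assumptions made here for illustration; the indexing used in the cited literature may differ, for instance in whether the trivial root at zero is counted for `n = 0`. The upward recurrence used for the spherical Bessel functions is adequate only for small orders.

```python
import math

C_AIR = 343.0  # assumed speed of sound in air, m/s (not stated in the text)


def box_fundamental(l, m, n, X, Y, Z, c=C_AIR):
    """Fundamental of the harmonic comb (l, m, n) of an X-by-Y-by-Z box, eq. (1)."""
    return 0.5 * c * math.sqrt((l / X) ** 2 + (m / Y) ** 2 + (n / Z) ** 2)


def sph_jn(n, z):
    """Spherical Bessel function j_n(z), z > 0, by upward recurrence (small n only)."""
    j_prev = math.sin(z) / z
    if n == 0:
        return j_prev
    j_curr = math.sin(z) / z ** 2 - math.cos(z) / z
    for k in range(1, n):
        j_prev, j_curr = j_curr, (2 * k + 1) / z * j_curr - j_prev
    return j_curr


def sph_jn_prime(n, z):
    """Derivative of j_n: j_0' = -j_1, and j_n' = j_{n-1} - (n + 1) / z * j_n."""
    if n == 0:
        return -sph_jn(1, z)
    return sph_jn(n - 1, z) - (n + 1) / z * sph_jn(n, z)


def bessel_extremum(n, s, z_start=0.5, step=0.01):
    """s-th strictly positive root of j_n' (assumed indexing), by scan and bisection."""
    z, found = z_start, 0
    f_lo = sph_jn_prime(n, z)
    while True:
        z_hi = z + step
        f_hi = sph_jn_prime(n, z_hi)
        if f_lo * f_hi < 0:  # sign change: a root lies in (z, z_hi)
            found += 1
            if found == s:
                lo, hi = z, z_hi
                for _ in range(60):
                    mid = 0.5 * (lo + hi)
                    if sph_jn_prime(n, lo) * sph_jn_prime(n, mid) <= 0:
                        hi = mid
                    else:
                        lo = mid
                return 0.5 * (lo + hi)
        z, f_lo = z_hi, f_hi


def sphere_mode(n, s, a, c=C_AIR):
    """Resonance frequency of a sphere of radius a, eq. (2)."""
    return c * bessel_extremum(n, s) / (2 * math.pi * a)


if __name__ == "__main__":
    print(box_fundamental(1, 1, 1, 0.5, 0.5, 0.5))
    for n in range(5):
        print(n, round(bessel_extremum(n, 1), 4), round(sphere_mode(n, 1, 0.31), 1))
```

The radius of 0.31 m in the usage lines is only an example value, close to that of a sphere with the volume of a cube of edge 0.5 m.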
This gave us complex impulse responses that could be reliably used, after convolution with probing sounds, to test the ability of subjects to discriminate different shapes. Results of these experiments are reported in Rocchesso (2001), and they show that the average listener is capable of correctly classifying sounds played in different shapes, at least if the diameter is larger than 50cm. Admittedly, a short training session had to be given to subjects before running the tests, so one might argue that they were simply labeling sounds as they had been taught rather than recognizing shapes. This is true, but it is also true that what changes between the two categories is only the spatial signature of the resonators, which means that 3D shape control can be effectively used to elicit different sonic impressions. The short training allowed the subjects to learn how to separate the shape-related features from other acoustic features, such as pitch.

¹ The Matlab routines used are available from the SOb project website:

2.2 The problem of pitch

When comparing the impulse response of a spherical cavity with that of a cubic cavity, the most evident perceptual difference is that one sounds higher than the other. This means that there is a pitch relationship between the two shapes, even though the two responses may not have a well-defined musical pitch. If we want to use shape as a control parameter for resonators, it is important to decouple it from pitch control. Therefore, we asked several subjects to compare the responses of cubes and spheres in terms of pitch. The results indicated that subjects are maximally confused when the two shapes are roughly equalized in volume, which means that a good pitch equalization is obtained with equal volumes. These experiments, described in Rocchesso and Ottaviani (2001), were conducted with non-expert listeners, except for a few subjects who are trained musicians.
Figure 1 (left) reports the mean and standard deviation of the pitch judgement for cubic boxes of different sizes in comparison with a ball (2a = 1.0m), for 14 subjects. The right plot of the same figure shows the difference between positive and negative responses, the peak corresponding to the point of maximal confusion, which can be identified with the point of equal pitches. Remarkably, musicians had more difficulty in relating the pitches of two different shapes, as they seemed to adopt an analytic, rather than holistic, listening mode. Indeed, neither of the responses of the two shapes is harmonic or close to harmonic, and multiple pitches can be discovered by careful examination.

[Figure 1. Panel titles: "sphere vs. cube pitch comparison", "pitch proximity index".]
Figure 1: Mean and standard deviation of pitch comparison between a sphere (d = 100cm) and several cubes.
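Since the pitch match was found at roughly equal volumes, a cube can be sized against a given sphere (or vice versa) with elementary geometry. The short Python sketch below shows the arithmetic; the function names are hypothetical and chosen here for illustration only.

```python
import math


def equal_volume_cube_edge(sphere_diameter):
    """Edge of the cube whose volume equals that of a sphere of the given diameter.

    Sphere volume: (4/3) * pi * (d/2)**3 = pi * d**3 / 6, so edge = d * (pi/6)**(1/3).
    """
    return sphere_diameter * (math.pi / 6.0) ** (1.0 / 3.0)


def equal_volume_sphere_diameter(cube_edge):
    """Diameter of the sphere whose volume equals that of a cube of the given edge."""
    return cube_edge * (6.0 / math.pi) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Ball of Figure 1 (2a = 1.0 m) and the corresponding equal-volume cube edge.
    print(equal_volume_cube_edge(1.0))
    # Cube of edge 0.5 m and the corresponding equal-volume sphere diameter.
    print(equal_volume_sphere_diameter(0.5))
```

For the ball with 2a = 1.0 m, the equal-volume cube has an edge of about 0.81 m.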

2.3 Simplifying the models

The models used for listening tests were quite complex, being formed by summation of hundreds of damped sinusoids. One of our initial goals was to design simple models that retain the basic characteristics, such as the coarse shape, that one might want to control in a resonator. One possibility is to use the model presented in Rocchesso and Dutilleux (2001), which is based on a feedback delay network with allpass filters in the feedback loop. The allpass filter coefficients and the delay lengths can be designed so that the structure approximates, with different parameter settings, the distribution of resonances found in a cubic or in a spherical resonator. This approach is attractive, especially if we want to reproduce a rich reverberant character in the impulse response. In such a case, we really need high modal and echo densities, and a high-order feedback delay network is well suited for this purpose. However, the quality of reverberation, essentially because of its statistical nature, is not directly related to the cavity shape. Therefore, if we are just interested in preserving the shape signature, it is likely that a simpler model could do the job.

It is intuitively expected that the low-frequency resonances carry most of the shape-related information. But which modes are indispensable, and where should we stop adding modes that do not improve the shape impression? We tried using the correlogram to answer this question, as this representation is based on a model of the peripheral hearing system. The correlogram is a representation of sound as a function of time, frequency, and periodicity (Slaney and Lyon 1993). Each sound frame is passed through a cochlear model and split into a number of cochlear channels, each representing a certain frequency band. Cochlear channels are nonlinearly spaced according to the critical bands. The signal in each band is autocorrelated to highlight its periodicities.
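The autocorrelation step can be illustrated on a single band. The Python sketch below is a deliberately reduced example, not the correlogram of Slaney and Lyon: it omits the cochlear filterbank entirely and autocorrelates one damped sinusoid, standing in for the output of one channel. All names and parameter values (sample rate, frequency, decay time) are assumptions chosen for illustration.

```python
import math


def damped_sine(freq_hz, decay_s, sample_rate, n_samples):
    """Exponentially damped sinusoid, a stand-in for one cochlear channel output."""
    return [
        math.exp(-i / (decay_s * sample_rate))
        * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
        for i in range(n_samples)
    ]


def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag, normalized so that r[0] == 1."""
    n = len(x)
    r0 = sum(v * v for v in x)
    return [
        sum(x[i] * x[i + k] for i in range(n - k)) / r0
        for k in range(max_lag + 1)
    ]


def dominant_lag(r, min_lag):
    """Lag (>= min_lag) at which the autocorrelation is largest."""
    return max(range(min_lag, len(r)), key=lambda k: r[k])


if __name__ == "__main__":
    # 160 Hz at 8 kHz sampling has a period of 50 samples.
    x = damped_sine(160.0, 0.1, 8000, 2000)
    r = autocorrelation(x, 120)
    print(dominant_lag(r, 25))
```

In a full correlogram, one such autocorrelation row would be computed per channel and per frame, so that a periodicity shared across bands appears as a vertical alignment of peaks at the same lag.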
The autocorrelation magnitude is expressed in grey levels in a 2-D plot. Figure 2 depicts the correlogram of the impulse response of a cube (edge length equal to 0.5m) and that of the impulse response of a sphere having the same volume. Superimposed on the figure are some patterns that emerge from the analysis and that, we conjecture, are the signature of the particular shape. A close analysis of these representations shows that the curved part of the pattern outlined in Fig. 2 can be reproduced, for the sphere, just by retaining the modal frequencies $f_{11}$, $f_{22}$, $f_{02}$, and $f_{42}$ defined by (2). Fig. 3 shows the correlogram of the response obtained just by summing those low-frequency modes, for a sphere having diameter 2a = 0.25m.

[Figure 2. Panel titles: "Correlogram Sphere - frame 1/37", "Correlogram Cube - frame 1/30"; horizontal axis: Lag [samples].]
Figure 2: Correlograms for the cube (edge 0.5m) and for the sphere having the same volume.

[Figure 3. Panel title: "Correlogram Sphere - frame 1/19"; horizontal axis: Lag [samples].]
Figure 3: Correlogram for a simplified model of spherical resonator that retains only four resonances.

3 Controlling the models

Certain transformations have proved to be much more effective in the visual domain than in the auditory domain. One such transformation is morphing, that is, gradually transforming one shape into another. Auditory morphing, that is, the gradual transformation of a sound source, has been a testbed for sound modeling techniques for many years, but the construction of convincing morphs has proved to be a difficult task due to the peculiar behavior of our auditory perception. With shape-controlled resonators, we can at least approach narrowly scoped morphing operations, such as changing the degree of roundness of an object, and easily match them with their visual counterpart.
The problem here is that we have closed-form resonance distributions for simple shapes, such as spheres or cubes, but we do not know how the resonances move when interpolating between them. To understand that, we used waveguide mesh models of 3D cavities and we simulated several shapes, intermediate between the sphere and the cube (Fontana and Rocchesso 2001). Inspired by similar work done by computer graphics researchers, we used the superellipsoid, whose parameters can be used to control the degree of roundness, and we derived curves for varying the position of significant resonances accordingly. To derive control curves of this kind, the waveguide mesh proved to be a reliable tool. In particular, the triangular meshes in 2D and 3D have uniform dispersion characteristics, and that translates into preservation of the distribution of resonances

with, at most, some high-frequency warping. Moreover, nonlinearities, lumped loads, and boundary conditions can be inserted in the mesh quite intuitively. However, the waveguide mesh paradigm is simple and effective when wave propagation is close to ideal, namely for flexible membranes and air-filled cavities. For propagation in non-ideal media, such as stiff plates, it does not seem to give any significant advantage over finite difference methods (Bilbao 2001).

In the transition between a sphere and a cube, when the superellipsoid is used at intermediate stages, the most significant low-frequency resonances can be tracked by quasi-linear trajectories along a logarithmic frequency scale (Fontana and Rocchesso 2001). If, in this morphing operation, the number and intensity of resonances are kept fixed, there is a significant decrease in brightness going towards the cube. Brightness is the most robust perceptual attribute of timbre and, therefore, such a change in brightness is likely to obscure the change in timbral features due to the change of shape. By contrast, when full-band responses from the mesh simulation or from extensive additive synthesis are used, this brightness effect is much less pronounced, as one would expect with real cavities. There are several ways to keep the overall brightness roughly constant for constant-volume shape morphing. Fig. 4 shows an oversimplified realization where only the four main resonances are retained, linear interpolation of resonance frequencies is used at intermediate stages, and brightness is preserved through broadband, shape-dependent filtering.

Figure 4: Sonogram of morphing a cube into a sphere. Excitation signal is white noise. The extremal shapes have the same volume, and linear interpolation of resonance frequencies is used at intermediate stages.

4 Current and Future Research

We have set up a theoretical and experimental framework for designing and understanding simple resonator models and excitation mechanisms, and we are using this framework for our research, especially in the EU-funded project "the Sounding Object". We can readily develop "cartoon" models for 3-D enclosures such as rectangular boxes that evolve into ellipsoids, or cylinders that evolve into tubes having a square section. In 2-D, we can simulate membranes with varying roundness of the rim, or plates of different shape and material. These resonators can be coupled with feed-forward excitations or actual feedback mechanisms such as those found in friction or fluid-dynamic valves. To this end, the work done with the impact model (Avanzini and Rocchesso 2001) represents a useful reference case that is being extended to continuous-excitation mechanisms.

References

Avanzini, F. and D. Rocchesso (2001). Controlling material properties in physical models of sounding objects. In Proc. Int. Comp. Music Conf., La Habana, Cuba.

Bilbao, S. (2001). Wave and Scattering Methods for the Numerical Integration of Partial Differential Equations. Ph.D. thesis, Stanford University. Available from http: // bilbao.

Djoharian, P. (2000). Shape and material design in physical modeling sound synthesis. In Proc. Int. Comp. Music Conf., Berlin, pp. 38-45.

Fontana, F. and D. Rocchesso (2001). Acoustic cues from shapes between spheres and cubes. In Proc. Int. Comp. Music Conf., La Habana, Cuba.

Klatzky, R. L., D. K. Pai, and E. P. Krotkov (2000). Perception of material from contact sounds. Presence 9(4), 399-410.

Kunkler-Peck, A. J. and M. T. Turvey (2000). Hearing shape. J. Exp. Psych. Human. 26(1), 279-294.

Lakatos, S., S. McAdams, and R. Causse (1997). The representation of auditory source characteristics: simple geometric form. Percept. Psychophys. 59(8), 1180-1190.

Moldover, M. R., J. B. Mehl, and M. Greenspan (1986, February). Gas-filled spherical resonators: Theory and experiment. J. Acoust. Soc. Am. 79(2), 253-272.

Morse, P. M. and K. U. Ingard (1968). Theoretical Acoustics. New York: McGraw-Hill. Reprinted in 1986, Princeton Univ. Press, Princeton, NJ.

Rocchesso, D. (2001). Acoustic cues for 3-D shape information. In Proc. Int. Conf. Auditory Display, Espoo, Finland.

Rocchesso, D. and P. Dutilleux (2001). Generalization of a 3-D resonator model for the simulation of spherical enclosures. Appl. Signal Proc. 2001(1), 15-26.

Rocchesso, D. and L. Ottaviani (2001). Can one hear the volume of a shape? In Proc. IEEE Workshop Appl. Sig. Proc. Audio and Ac., Mohonk, NY.

Slaney, M. and R. F. Lyon (1993). On the importance of time - a temporal representation of sound. In M. Cooke, S. Beet, and M. Crawford (Eds.), Visual Representations of Speech Signals, pp. 409-429. Sussex, UK: J. Wiley and Sons.

Wildes, R. P. and W. A. Richards (1988). Recovering material properties from sound. In W. A. Richards (Ed.), Natural Computation, pp. 357-363. Cambridge, MA: MIT Press.