Acoustic Quanta

Alan PURVIS +   Alan.Purvis@
Douglas J. E. NUNN +   D.J.E.Nunn
Peter D. MANNING #   P.D.Manning@
Durham Music Technology*
WWW page:

+ School of Engineering, University of Durham, South Road, DURHAM DH1 3LE, UK
# Department of Music, University of Durham, Palace Green, DURHAM DH1 3RL, UK

Abstract

This paper presents a novel approach to sound analysis and resynthesis based on Gabor wavelets. These wavelets are mathematically simple and computationally modest, and are well suited to musical applications. This paper outlines the principles and equations behind quanta. It then gives examples of their applications in analysis and synthesis, and presents results from implementation on the PC.

1. Introduction

A landmark paper by Dennis Gabor examined how quantum theory could be applied to acoustic signals.[Gabor 1947] It discussed how any signal may be built up from elementary signals that are characterised by a few parameters. These signals are wavelets consisting of a complex sinusoid multiplied by a Gaussian envelope. Since then many of his ideas have been absorbed by the comparatively new field of wavelets. Wavelets are a good alternative to the short-time Fourier transform (STFT) as they allow a wide range of scales with an approximately constant Q factor.

Gabor wavelets, or quanta to use Gabor's term, can be used at the analysis, transformation, and synthesis stages. In analysis, they offer an alternative to block-based coding, but can be derived from the STFT or multirate STFT. It has been shown that the Gabor expansion allows complete reconstruction.[Bastiaans 1985] Arfib used Gabor wavelets to successfully carry out time-stretching, a seemingly simple task but difficult in practice.[Arfib 1991] Many common audio transformations can be expressed compactly as simple linear operations between groups of quanta.
Synthesis using quanta is in many respects similar to granular synthesis, except that each quantum has a single frequency and is (conceptually) infinitely long. The quanta can be calculated efficiently using a computationally inexpensive recursive algorithm.

*Durham Music Technology is a collaboration between the School of Engineering and the Department of Music at Durham University.
Nunn et al. 52 ICMC Proceedings 1996

2. Definitions

In this paper, quanta are defined as:

$$Q(t_0, f_0, \alpha, m) = m \, e^{j 2 \pi f_0 t} \, e^{-\alpha (t - t_0)^2}$$

Here, $t_0$ and $f_0$ (real) are the positions in time and frequency, and $m$ is the complex magnitude. (It is assumed that either quanta occur in conjugate pairs or that we are dealing with the analytical signal.) The width in time, $\Delta t$, is given by $\sqrt{\pi/\alpha}$, and the width in frequency, $\Delta f$, by $\sqrt{\alpha/\pi}$. The parameter $\alpha$ has the dimensions of s$^{-2}$ or Hz$^2$, and will be referred to either as alfa (to reduce confusion with Gabor's correctly-spelled alpha, which is $\sqrt{\alpha}$), or as the density of the quantum. A high alfa represents a short broadband signal; a low alfa represents a long pure signal. For the signal to have finite energy, alfa must be positive. Figure 1 shows a typical quantum.

Figure 1: Waveform of a single quantum
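As a rough illustration of the definition, the following Python sketch evaluates a single quantum at an arbitrary time and computes its two widths. It assumes the waveform $m \, e^{j 2 \pi f_0 t} \, e^{-\alpha (t - t_0)^2}$, with the carrier phase referenced to absolute time; that phase convention is an assumption, and the function names `quantum` and `widths` are illustrative rather than taken from the paper.

```python
import cmath
import math


def quantum(t, t0, f0, alfa, m):
    """Value at time t (seconds) of the assumed waveform
    m * exp(j*2*pi*f0*t) * exp(-alfa*(t - t0)**2)."""
    envelope = math.exp(-alfa * (t - t0) ** 2)   # Gaussian envelope centred on t0
    carrier = cmath.exp(2j * math.pi * f0 * t)   # complex sinusoid at f0 Hz
    return m * carrier * envelope


def widths(alfa):
    """Return (width in time, width in frequency) for a density alfa > 0."""
    if alfa <= 0:
        raise ValueError("alfa must be positive")
    return math.sqrt(math.pi / alfa), math.sqrt(alfa / math.pi)


if __name__ == "__main__":
    dt, df = widths(1000.0)                      # a fairly short, broadband quantum
    print("width in time (s):", dt)
    print("width in frequency (Hz):", df)
    print("time-bandwidth product:", dt * df)    # equals 1 up to rounding
    print("peak magnitude:", abs(quantum(0.5, 0.5, 440.0, 1000.0, 1.0)))
```

Raising alfa shortens the width in time and widens the width in frequency by the same factor, which is the trade-off between short broadband and long pure quanta described above.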

The parameters are stored as 32-bit floats, so one quantum requires 160 bits of memory (five values: $t_0$, $f_0$, $\alpha$, and the real and imaginary parts of $m$). A signal that is finite in one domain must be infinite in the other. Gabor wavelets are infinite in both domains, but in one sense are the most compact: they have a time-bandwidth product of unity. This property gives the advantages of mathematical simplicity and symmetry, but the disadvantage that each wavelet covers the entire time-frequency plane. Most other wavelets are finite in the time domain in order to give a fast integral transform, but this is at the expense of less ideal frequency localisation.[Daubechies 1988]

3. Basic Operations

The simple mathematical form of the Gabor wavelet allows many straightforward transformations. If $Q_0 = Q(t_0, f_0, \alpha_0, m_0)$ and $Q_1 = Q(t_1, f_1, \alpha_1, m_1)$ then their product $Q_0 \cdot Q_1 = Q_1 \cdot Q_0 = Q(t_k, f_k, \alpha_k, m_k)$, where

$$\begin{aligned}
m_k &= m_0 \, m_1 \, e^{z} \\
z &= -(\alpha_0 \alpha_1 / \alpha_k)\,(t_0 - t_1)^2 \\
\alpha_k &= \alpha_0 + \alpha_1 \\
f_k &= f_0 + f_1 \\
t_k &= (\alpha_0 t_0 + \alpha_1 t_1) / \alpha_k
\end{aligned}$$

The convolution of two quanta corresponds to multiplication in the spectral domain. The Fourier transform of $Q(t, f, \alpha, m)$ is $Q(f, t, \pi^2/\alpha, m)$. With $Q_0$ and $Q_1$ defined as before, their convolution $Q_0 * Q_1 = Q_1 * Q_0 = Q(t_k, f_k, \alpha_k, m_k)$, where

$$\begin{aligned}
m_k &= m_0 \, m_1 \, e^{z} \\
z &= -\pi^2 (f_0 - f_1)^2 / (\alpha_0 + \alpha_1) \\
\alpha_k &= \alpha_0 \alpha_1 / (\alpha_0 + \alpha_1) \\
f_k &= (f_0 \alpha_1 + f_1 \alpha_0) / (\alpha_0 + \alpha_1) \\
t_k &= t_0 + t_1
\end{aligned}$$

Transformations in the time and frequency domains are equally easy. In fact, we can consider multiplication and convolution as a single operation with the focus on either the time or the frequency axis. Impulses are a special case: $m\,\delta(t)$ is equivalent to $Q(t, 0, \infty, m \cdot \infty)$ but is actually denoted by $Q(t, 0, \infty, m)$. Impulses are commonly used and must be handled robustly by the low-level routines.

4. Higher-level structures

In most cases it is expected that quanta will be grouped together, and that they might have one or more parameters in common. In order to minimise memory usage, this was made a fundamental consideration in designing higher-level structures for quanta.

The second-lowest unit is called an atom. It represents an arbitrary number of quanta, and its species determines the topology of the arrays. There are sixteen species, corresponding to whether there are multiple times, frequencies, densities, and/or magnitudes. A species-0 atom is a single quantum. Species 8 has multiple times but only one frequency, density, and magnitude, and could describe a rhythm. A chord could be a group of quanta with the same times and densities but different frequencies and magnitudes, which is species 5. These examples are consistent with a four-bit code in which multiple times, frequencies, densities, and magnitudes carry the weights 8, 4, 2, and 1. The notation for quanta is extended using braces such that, for example, $Q(\{t_0, t_1, t_2, t_3\}, f, \alpha, m)$ represents a species-8 atom of four quanta with different times but the same $f$, $\alpha$, and $m$.

Note that while there may well be interesting analogies between acoustic quanta and quantum physics, the term 'atom' is chosen purely out of the need for a term, rather than any direct physical analogy.

The third-lowest unit is a molecule, which is a group of atoms, possibly of different species. The addition, multiplication, or convolution of molecules is simply defined as the sum of the results of the operations between their atoms.

The range of musical transformations that can be implemented easily is best illustrated by some examples. If one atom holds a weighted set of frequencies $Q(0, \{f_0, f_1, f_2, \ldots\}, 0, \{m_0, m_1, m_2, \ldots\})$, we can apply a control envelope to them by simply multiplying the atom by another one corresponding to the control envelope. Since these operations are carried out with the 'tokenised' quanta rather than the actual audio, complex effects can be specified simply, although the resultant number of quanta may be large.

If we have formed a set of quanta corresponding to a note, then the convolution with $Q(10, 0, \infty, 1)$ gives the same note delayed by 10 seconds. Similarly, convolution with $Q(\{0, 0.1, 0.2, 0.3\}, 0, \infty, \{1, 0.3, 0.1, 0.03\})$ gives a simple echo. Filtering can be carried out by convolving the input quanta with $Q(0, f_{centre}, \alpha_{large}, 1)$. To form more complex filters, we use a set of Gaussians that sum to the desired response in the frequency domain.

5. Synthesis

In music synthesis, the creation of music from an aggregate of shorter acoustic events would be classed as granular synthesis. The approach outlined differs from conventional granular
synthesis, in that quanta do not have a start or end, and only contain a single frequency.

Ultimately, quanta must be converted to equally spaced samples, and this can be done recursively.[Jones et al. 1987, Kaiser 1987] As only the real part is required, only 4 real multiplies and one addition need be performed per sample. This leads to fast synthesis, even on standard PC hardware. The wavelet need only be calculated until it falls below the minimum signal level. With 16-bit integer coding being both commonplace and readily supported, it is most convenient to operate with 16-bit ints. If the magnitude is much less than the maximum, as we would frequently expect, then less calculation is required.

Two composition interfaces were implemented. The first uses the same language as the synthesis engine, in this case C. This means that we can script a composition using the full range of C control structures and algorithms. However, this method has the drawbacks that it is less intuitive and that compilation causes a sizeable delay between conceiving a musical concept and hearing it.

6. Graphical Interface

In an attempt to form a more intuitive interface, a GUI was developed. Figure 2 is a screen shot.

Figure 2: The graphical interface

The mouse is used both to operate the menuing system and to 'draw' quanta on the screen. For each quantum, four parameters must be specified (ignoring the imaginary part of magnitude). The time and frequency depend on the (x,y) position when the mouse is clicked, and the density and magnitude depend on where it is released. The system allows (arbitrarily) 21 molecules to be manipulated; both linear operations (e.g. multiplication, convolution) and non-linear operations (e.g. time-stretching, transposition) are selected from the menu.

The graphical approach provides an intuitive way to compose but lacks the generality of a procedural language. The ideal may lie between these two forms, possibly along the lines of the Max interface.

7. Discussion

Gabor wavelets, or quanta, offer much potential for synthesis. As well as being inexpensive to compute, they lend themselves to an attractive interpretation as elemental sonic entities. Another important advantage is that higher-level entities, such as melodies, timbres, scales, envelopes, filters, and reverberation, can also be expressed using the same paradigm. This contrasts with many synthesis methods where the 'score' and 'orchestra' are specified in completely different ways.

However, the greatest promise in a quanta-based approach is that it appears to be as well suited to analysis as synthesis. Synthesis-by-analysis holds potential for computer musicians, computational musicologists, and music computationalists alike.

References

[Arfib 1991] Arfib, D., Analysis, transformation, and resynthesis of musical sounds with the help of a time-frequency representation, in De Poli, G. (Ed.), Representations of Musical Signals, Cambridge, Massachusetts, 1991.
[Bastiaans 1985] Bastiaans, M.J., On the sliding-window representation in digital signal processing, IEEE Trans. ASSP, 33(4), 1985.
[Daubechies 1988] Daubechies, I., Orthonormal bases of compactly supported wavelets, Communications in Pure and Applied Mathematics, 41, 1988.
[Gabor 1947] Gabor, D., Acoustical quanta and the theory of hearing, Nature, 159(4044), 1947.
[Jones et al. 1987] Jones, D.L. and Parks, T.W., On computing equally spaced samples of a complex Gaussian function, IEEE Trans. ASSP, 35(10), 1987.
[Kaiser 1987] Kaiser, J.F., On the fast generation of equally spaced values of the Gaussian function A exp(-at*t), IEEE Trans. ASSP, 35(10), 1987.
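The operations of Section 3 act on the four parameters of each quantum rather than on audio samples. The Python sketch below shows one way this could look. It assumes the waveform $m \, e^{j 2 \pi f t} \, e^{-\alpha (t - t_0)^2}$ with the carrier phase referenced to absolute time, and it takes the exponent $z$ of the product rule to be negative. The names `Quantum`, `sample`, `multiply`, and `convolve` are hypothetical and do not describe the authors' C implementation. The product rule can be checked numerically against sampled waveforms; the convolution rule is applied to the parameters as stated in Section 3 and is not checked against sampled audio here.

```python
import cmath
import math
from typing import NamedTuple


class Quantum(NamedTuple):
    """Hypothetical container for Q(t, f, alfa, m); not the authors' data layout."""
    t: float      # position in time, seconds
    f: float      # position in frequency, Hz
    alfa: float   # density, s^-2 (math.inf marks an impulse)
    m: complex    # complex magnitude


def sample(q, time):
    """Assumed waveform: m * exp(j*2*pi*f*time) * exp(-alfa*(time - t)**2)."""
    carrier = cmath.exp(2j * math.pi * q.f * time)
    return q.m * carrier * math.exp(-q.alfa * (time - q.t) ** 2)


def multiply(q0, q1):
    """Product rule of Section 3, for two quanta with finite positive densities."""
    ak = q0.alfa + q1.alfa
    z = -(q0.alfa * q1.alfa / ak) * (q0.t - q1.t) ** 2
    tk = (q0.alfa * q0.t + q1.alfa * q1.t) / ak
    return Quantum(tk, q0.f + q1.f, ak, q0.m * q1.m * math.exp(z))


def convolve(q0, q1):
    """Convolution rule of Section 3; an impulse (alfa = inf) only shifts and scales."""
    if math.isinf(q1.alfa):
        ak, fk, z = q0.alfa, q0.f, 0.0
    elif math.isinf(q0.alfa):
        ak, fk, z = q1.alfa, q1.f, 0.0
    else:
        s = q0.alfa + q1.alfa
        ak = q0.alfa * q1.alfa / s
        fk = (q0.f * q1.alfa + q1.f * q0.alfa) / s
        z = -(math.pi ** 2) * (q0.f - q1.f) ** 2 / s
    return Quantum(q0.t + q1.t, fk, ak, q0.m * q1.m * math.exp(z))


if __name__ == "__main__":
    note = Quantum(1.0, 440.0, 50.0, 1.0)
    # Delay by 10 seconds: convolution with Q(10, 0, inf, 1).
    print(convolve(note, Quantum(10.0, 0.0, math.inf, 1.0)))
    # Simple echo: one delayed, attenuated copy per impulse in the echo atom.
    delays, gains = (0.0, 0.1, 0.2, 0.3), (1.0, 0.3, 0.1, 0.03)
    for d, g in zip(delays, gains):
        print(convolve(note, Quantum(d, 0.0, math.inf, g)))
```

Treating impulses as a separate branch avoids evaluating expressions such as inf/inf, which is one simple way of handling them robustly at the lowest level.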
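Section 5 notes that equally spaced samples of a quantum can be computed recursively and that calculation can stop once the wavelet falls below the minimum signal level. The Python sketch below shows one simple recursion of this kind. It relies on the fact that, for a Gaussian, the ratio of successive samples itself changes by a constant real factor from one sample to the next. It is not necessarily the formulation of the cited papers, and it makes no attempt to reach the operation count quoted in the text. The waveform convention (carrier referenced to absolute time) and the names `render_recursive`, `render_direct`, and `half_length` are assumptions made for illustration.

```python
import cmath
import math


def render_direct(t0, f0, alfa, m, fs, n_samples):
    """Reference version: evaluate the real part sample by sample."""
    out = []
    for n in range(n_samples):
        t = n / fs
        y = m * cmath.exp(2j * math.pi * f0 * t) * math.exp(-alfa * (t - t0) ** 2)
        out.append(y.real)
    return out


def render_recursive(t0, f0, alfa, m, fs, n_samples):
    """Same samples using two running products instead of exp() per sample."""
    T = 1.0 / fs
    y = complex(m) * math.exp(-alfa * t0 ** 2)          # sample at n = 0
    # Ratio of sample 1 to sample 0 (carrier rotation times envelope ratio).
    r = cmath.exp(2j * math.pi * f0 * T) * math.exp(-alfa * (T * T - 2.0 * t0 * T))
    q = math.exp(-2.0 * alfa * T * T)                   # constant ratio of ratios
    out = []
    for _ in range(n_samples):
        out.append(y.real)
        y *= r      # advance to the next sample
        r *= q      # update the sample-to-sample ratio
    return out


def half_length(alfa, magnitude, floor):
    """Time from the centre until the envelope drops below `floor`, in seconds."""
    if magnitude <= floor:
        return 0.0
    return math.sqrt(math.log(magnitude / floor) / alfa)


if __name__ == "__main__":
    rec = render_recursive(0.0625, 440.0, 2000.0, 0.8, 8000.0, 1000)
    ref = render_direct(0.0625, 440.0, 2000.0, 0.8, 8000.0, 1000)
    print("largest deviation:", max(abs(a - b) for a, b in zip(rec, ref)))
    # With full scale 1.0, one 16-bit step is about 1/32768.
    print("half length at full scale:", half_length(2000.0, 1.0, 1.0 / 32768))
    print("half length at 1% of full scale:", half_length(2000.0, 0.01, 1.0 / 32768))
```

Rounding errors accumulate in the running products, so a practical renderer would probably restart the recursion from directly computed values at intervals. The last two lines illustrate why a quantum with a small magnitude needs fewer samples than one at full scale.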