Higher order Ambisonic systems for the spatialisation of sound

Malham, D. G., [email protected]
Music Technology Group, Department of Music
The University of York
York YO10 5DD
UK

Abstract: Ambisonic systems have been in use for the diffusion of electroacoustic music for a decade or more. Using four channels of audio to feed a larger number of loudspeakers, they provide the composer with significant advantages over conventionally panned multispeaker arrays. However, the current first order Ambisonic systems can be significantly improved upon by going to higher order systems. Recent advances in multichannel computer audio systems have made it economically viable to upgrade to second order. This step requires a maximum of ten channels of audio. Existing theoretical studies of second order Ambisonics have concentrated on two dimensional, horizontal plane only systems. This paper covers the design and implementation of fully three dimensional systems. The differences between partial second order systems and full second order systems are examined. Loudspeaker system requirements are indicated, as are techniques for dealing with the problems of combining second order Ambisonics with other approaches.

1 Spatialisation systems for electroacoustic music

In computer games, in cinema, television and multimedia productions, and in audio recording, the commercial use of sound spatialisation more complex than simple stereo is becoming more and more prevalent.
However, the desire of some composers, particularly those of electroacoustic and computer music, to develop ways in which the spatial aspects of musical performance can be enhanced and exploited as part of their musical language long predates these commercial applications. Since the very earliest days of the genre, when Pierre Schaeffer, Jacques Poullin and Pierre Henry were working in Paris [1][2], and, of course, even before that [3], electroacoustic composers have sought technological solutions to sound spatialisation which would also satisfy their artistic criteria. Some have taken the view that the technology itself should be part of the musical experience, which has led to the development of loudspeaker orchestras such as BEAST or the Acousmonium, which are performance instruments in their own right. Others have sought to remove the delivery technology from the musical equation as far as possible, wanting the loudspeaker array to be a simple, transparent medium through which the sound can flow.

To approach this ideal requires the loudspeaker system to be eliminated from the perception of the audience, which in turn implies that the loudspeakers need to be treated as a system for producing a bulk soundfield in space, rather than as individual sources of sound which may be perceivable as such. For reasons which have been extensively discussed elsewhere [4][5][6], systems based on an extension of stereo, such as quad, cinema style surround and other ad hoc amplitude panned systems, cannot be regarded as generally suitable for this, though they may be so in particular cases. Such special cases may, of course, be regarded as a form of the 'system as instrument' approach mentioned above.

At present, there are only three technologies available or under development which can even partially achieve the aim of reducing the perceptual significance of the system itself.
They are:

* Wave field synthesis [7]
* Holophony [8]
* Ambisonics

although a fourth, hyper dense transducer array technology, may become practicable in two or three decades as costs drop, much as those of personal computers have. In a hyper dense array, the audience is surrounded by sufficient transducers to ensure that the human sonic perceptual field, which has a maximum angular resolution of 1° at best [9], is oversampled sufficiently to meet the Nyquist criterion. The number required can be approximated as follows, provided $\alpha$, the angular separation in radians, is small enough. The transducers can be considered as occupying a circular area on the surface of the sphere, generated by the intersection of the conical angle $\alpha$ with the surface. The radius $r$ of such a circle is $\alpha/2$ and, if the angle is small enough, the area $A$ of the circle approximates to

$$A \approx \pi r^2 = \frac{\pi \alpha^2}{4}$$

As the surface area of a unit sphere is $4\pi$ and the packing efficiency of circles on a surface is $\pi/4$ for square packing, the number of transducers $N$ that can be packed in this fashion is approximated by

$$N \approx \frac{4\pi \cdot (\pi/4)}{\pi \alpha^2 / 4} = \frac{4\pi}{\alpha^2}$$

Note that there are a number of approximations in this, not least in that there are more efficient packing schemes, but for small values of the conic angle it gives at least an indication of the order of magnitude of the required number of transducers, although it does tend to underestimate somewhat. If the system is oversampled by a factor of 2, the number of transducers required is over 160,000. For now, though, this remains an impracticable system because of the costs and the technological difficulties of providing sufficiently flexible and artistically useful compositional and control systems.

-484 - ICMC Proceedings 1999

Of the other three systems named above, two are based more or less directly on the Huygens Principle, which states that a propagating wavefront may be regarded as consisting of a large (ideally infinite) number of secondary sources. Unfortunately, these systems also require very large numbers of transducers and complex control systems if they are to function up to the limits of the human hearing system. For instance, if the wavefront is to be adequately sampled up to 20 kHz, the transducers must be separated by no more than the half wavelength distance of 8.5 mm.

We are therefore left with the Ambisonic system. This is not, in the strictest sense, a bulk soundfield system; rather, it tries to get the soundfield exactly right at one central location while allowing a gradual increase in the level of errors as the listener moves away from the centre [10]. If carefully implemented, it is well known to work effectively even over large areas [11][12][13]. Recently, combined Ambisonic and Holophonic [14] or Ambisonic and Wavefield synthesis [15] approaches have been put forward as offering the best features of both, although still without the ease of implementation and control that Ambisonics can offer.
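The transducer count and the half wavelength spacing quoted earlier in this section can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original derivation: it assumes each transducer occupies a circle of radius $\alpha/2$ on a unit sphere (so neighbouring transducers are separated by the angle $\alpha$), that oversampling the 1° resolution by a factor of 2 corresponds to $\alpha$ = 0.5°, and that the speed of sound is 340 m/s. The function names are hypothetical.

```python
import math


def transducer_count(separation_deg: float) -> float:
    """Estimate how many transducers cover a unit sphere.

    Assumption: each transducer occupies a circle of radius alpha/2
    (small-angle approximation) and the circles are square packed.
    """
    alpha = math.radians(separation_deg)       # angular separation in radians
    circle_area = math.pi * (alpha / 2) ** 2   # A ~ pi * alpha^2 / 4
    sphere_area = 4 * math.pi                  # surface area of a unit sphere
    packing_efficiency = math.pi / 4           # square packing of circles
    return sphere_area * packing_efficiency / circle_area


def max_spacing_m(f_max_hz: float, speed_of_sound: float = 340.0) -> float:
    """Half wavelength at f_max_hz; 340 m/s is an assumed speed of sound."""
    return speed_of_sound / f_max_hz / 2


if __name__ == "__main__":
    # 1 degree resolution oversampled by a factor of 2 -> 0.5 degree separation
    print("transducers:", round(transducer_count(0.5)))
    print("spacing at 20 kHz (mm):", max_spacing_m(20_000) * 1000)
```

Under these assumptions the count comes out a little above the 160,000 quoted in the text, and the spacing agrees with the 8.5 mm figure.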
2 First and second order Ambisonic systems

Ambisonic directional encoding is based on the well known ability of spherical harmonics to efficiently define a function on the surface of a sphere [16], which has resulted in their extensive use in problems in physics and chemistry. Michael Gerzon first suggested using them for the description of soundfields in a paper in 1973 [17], in which he defined systems up to third order. This was presented in cartesian coordinates rather than the polar coordinates which have been standard for all Ambisonic systems since the formal introduction of the terminology in 1975 [18]. The four first order components were defined as

$$X = \cos\theta \cos\phi \cdot S \quad \text{(front-back)}$$
$$Y = \sin\theta \cos\phi \cdot S \quad \text{(left-right)}$$
$$Z = \sin\phi \cdot S \quad \text{(up-down)}$$
$$W = 0.707 \cdot S \quad \text{(pressure signal)}$$

where $S$ is the sound, $\theta$ is its anti-clockwise angle from centre front and $\phi$ is its elevation. In 1995, Bamford [19] proposed two additional signals, U and V, to take it to second order in the horizontal plane only. These can be modified slightly for 3-D work, leaving the vertical plane as first order, thus forming an order 1.5 system:

$$U = \cos 2\theta \cos\phi \cdot S$$
$$V = \sin 2\theta \cos\phi \cdot S$$

A partial system of order 1.5 concentrates the extra resolution in the horizontal plane, where our directional hearing is most acute. As it uses only six channels of audio output (or seven, see later), it can be implemented with the current generation of eight channel audio cards and recorders. A further three signals and modifications to U and V are required for full second order.

As mentioned above, Gerzon's original 1973 paper [17] defined the spherical harmonics up to third order, which requires a total of 16 channels. In that paper he defines them in terms of cartesian coordinates $(x, y, z)$, where $x$ is the front-back axis, $y$ is the left-right axis and $z$ is the up-down axis, whereas his later published work deals only with first order systems, with the definitions given in polar $(r, \theta, \phi)$ coordinates.
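The first order and order 1.5 channel gains given above can be sketched in code. This is a minimal illustration rather than a reference implementation: the function name, the dictionary layout and the choice of degrees for the input angles are assumptions made for the example, while the gains themselves follow the equations for W, X, Y, Z, U and V.

```python
import math


def encode_order_1_5(sample: float, azimuth_deg: float, elevation_deg: float) -> dict:
    """Encode one mono sample S into the six order 1.5 channels.

    azimuth_deg is measured anti-clockwise from centre front and
    elevation_deg upwards from the horizontal plane (hypothetical
    argument names; the paper calls these angles theta and phi).
    """
    theta = math.radians(azimuth_deg)
    phi = math.radians(elevation_deg)
    return {
        "W": 0.707 * sample,                                # pressure signal
        "X": math.cos(theta) * math.cos(phi) * sample,      # front-back
        "Y": math.sin(theta) * math.cos(phi) * sample,      # left-right
        "Z": math.sin(phi) * sample,                        # up-down
        "U": math.cos(2 * theta) * math.cos(phi) * sample,  # order 1.5 extra
        "V": math.sin(2 * theta) * math.cos(phi) * sample,  # order 1.5 extra
    }


if __name__ == "__main__":
    # A source 45 degrees to the left of centre front, in the horizontal plane
    for name, value in encode_order_1_5(1.0, 45.0, 0.0).items():
        print(name, round(value, 3))
```

Dropping the U and V entries leaves the plain first order encoding, which is one reason a dictionary keyed by channel name was chosen for the sketch.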
This has left us with neither a defined terminology for higher order systems nor a spherical harmonic formulation for the higher order channels in a form which is consistent with current practice for the first order ones. Unfortunately, each of the groups of workers using spherical harmonics - physicists, chemists, mathematicians - has its own style and none seems to match current Ambisonic practice. Accordingly, using the notation in Kaplan [16], with all multipliers which remain constant over the surface of the sphere normalised to 1 and all similarly constant signals subsumed in the W signal, we here define the second order functions as follows:

$$R = \sin 2\phi$$
$$S = \cos\theta \cos 2\phi$$
$$T = \sin\theta \cos 2\phi$$
$$U = \cos 2\theta - \cos 2\theta \sin 2\phi$$
$$V = \sin 2\theta - \sin 2\theta \sin 2\phi$$

The polar patterns are much more complicated than the simple omnidirectional pattern of the zeroth order component or the figure of eight of the first order channels (Figure 1), even for the R channel.

[Figure 1: Spherical harmonics. Panels: zeroth order (W), first order (Z), second order.]

When second order components are included in the system, there is an increase in the Source Directivity Response (SDR) of the signal fed to each speaker in the array compared to the situation where only first order components are involved (Figure 2). This considerably enhances the area over which localisation based on energy works effectively (as opposed to wavefront reconstruction, which is only relevant near the centre of the array). For horizontal only speaker arrays this has been confirmed by the elegant statistical analysis recently published by Daniel et al [20] and by preliminary informal listening tests.

[Figure 2: SDRs for first and second order decoding. Legend: first order components only; first and second order.]

The algorithms involved in manipulating the spatial characteristics of sounds directionally encoded to second order precision are not significantly more difficult than for the first order case, and compositional tools developed for first order systems can be relatively easily adapted for second order use, with the unfortunate exception of Csound, which is currently limited to four output channels in the public version 3.46 [21], although this can be circumvented by making multiple passes and then editing the resulting soundfiles together.

3 Practical aspects of second order systems

There are a number of challenges which must be faced when using second order Ambisonic systems. Firstly, and perhaps least importantly, because of the more abrupt amplitude profiling, second order systems will generate a lower maximum sound pressure than first order systems over the same loudspeaker array. Secondly, the minimum number of speakers is higher: six is an absolute minimum for horizontal only work, as compared to four for first order, and twelve as opposed to eight for full 3-D, though eight and twenty respectively are, in fact, better figures to take as the minima for concert hall work.

The major challenge is, however, the use of mixed second and first order source materials. This comes about because the only commercially available microphone suitable for directly capturing sound Ambisonically is the Soundfield microphone, which is limited to first order components. The signal matrix necessary to feed the correct portions of each of the spherical harmonic components to each speaker - the decoder - is substantially different for signals with only first order components and for those with second order components as well. Whilst this can be accommodated by weighting the components appropriately at the directional coding and mixing stage, the weighting is dependent on the type of decoder (first order only, or first and second) which will be used. This is unfortunate, since one of the main attractions of Ambisonics is that the composer does not need to have any knowledge of the nature of the speaker array whilst producing the piece, differences between arrays being handled by the decoder.

The solution we have settled on is to provide two versions of the W (or pressure) signal rather than one: W', with signal component levels appropriate to the full first/second order replay situation, and W, for first order only replay. The precise details will be published later after further listening tests, but we can state that, for an optimum ratio of front to back on the SDR, signals with second order components will have W components around 39% lower than those with first order components only.

4 Conclusions

Fully three dimensional Ambisonic sound spatialisation systems of order 1, 1.5 and 2 have been discussed and the equations for the relevant channel gains have been presented. Some practical problems have been indicated and solutions have been presented. This system offers significantly enhanced spatialisation technologies for composers of electroacoustic music, with new creative possibilities opening up to anyone with the appropriate number of audio channels available on their computer systems. In order to make these possibilities easily accessible, it is strongly urged that all those developing or
maintaining computer based tools for the manipulation of audio make the support of unlimited numbers of audio channels an urgent priority within their systems.

5 References

[1] Schaeffer, P. (1951) "Journal d'Orphée" in Bayle, F. ed.: Pierre Schaeffer l'oeuvre musicale, France 1990, INA.GRM and Librairie SEGUIER.
[2] Poullin, Jacques "The application of recording techniques to the production of new musical materials and forms. Application to 'musique concrète'" National Research Council of Canada Technical Translation TT-646. Originally appeared in French in L'Onde Électrique, 34 (324): 282-291, 1954.
[3] Varèse, Edgar "Varèse envisions 'Space' Symphonies" New York Times, 6th December, 1936. Quoted in "Edgar Varèse" by Fernand Ouellette, Calder and Boyars, London 1973, p. 84.
[4] Malham, D.G. 'Ambisonics - A technique for low cost, high precision, three dimensional sound diffusion', ICMC Glasgow 1990 Proceedings, pp. 118-120.
[5] Malham, D.G. and Myatt, A. "3-D Sound Spatialization using Ambisonic Techniques" Computer Music Journal, 19:4, pp. 58-70, Winter 1995.
[6] Malham, D.G. "Homogeneous and nonhomogeneous surround sound systems" Proceedings, AES UK 'Audio: the second century' Conference, pp. 25-34, London, 7-8 June 1999.
[7] Boone, M.M., Verheijen, E.N.G. and Jansen, G. 1996 "Virtual Reality by Sound Reproduction Based on Wave Field Synthesis" 100th Convention of the Audio Engineering Society, preprint 4145, Copenhagen, May 1996.
[8] Nicol, R. and Emerit, M. 1998 "Reproducing 3D-Sound for Video Conferencing: A Comparison Between Holophony and Ambisonic" Proceedings of the First COST-G6 Workshop on Digital Audio Effects (DAFX98), Barcelona, November 1998, pp. 17-20. http://www.iua.upf.es/dafx98/
[9] Strybel, T.Z., Manligas, C.L. and Perrott, D.R. (1992) "Minimum Audible Movement Angle as a Function of the Azimuth and Elevation of the Source" Human Factors, Vol. 34, pp. 267-275, quoted in Begault 1994, page 50.
[10] Bamford, J.S. and Vanderkooy, J. "Ambisonic sound for us" Preprint No. 4138 presented at the 99th AES Convention, New York, 1995.
[11] Malham, D.G. and Orton, R. 'Progress in the Application of Ambisonic Three Dimensional Sound Diffusion Technology to Computer Music' ICMC Montreal 1991 Proceedings, pp. 467-470.
[12] Malham, D.G. "Experience with Large Area 3-D Ambisonic sound systems" In Proceedings of the Institute of Acoustics, 14(5): 209-216. St. Albans: Institute of Acoustics. November 1992.
[13] Vennonen, K. 1994. "A Practical System for Three-Dimensional Sound Projection" In Proceedings of the Symposium on Computer Animation and Computer Music, Synaesthetica '94, Australian Centre for the Arts and Technology, Canberra, Australia.
[14] Nicol, R. and Emerit, M. 1999 "3D-Sound Reproduction over an Extensive Listening Area: a Hybrid Method Derived from Holophony and Ambisonic" The Audio Engineering Society 16th International Conference on Spatial Sound Reproduction, Helsinki 1999, preprint no. s66819.
[15] Horbach, U. and Boone, M.M. 1999 "Future Transmission and Rendering Formats for Multichannel Sound" The Audio Engineering Society 16th International Conference on Spatial Sound Reproduction, Helsinki 1999, preprint no. s59711.
[16] Kaplan, W. "Advanced mathematics for engineers" Addison-Wesley, Reading 1981, pp. 710-714.
[17] Gerzon, M.A. "Periphony: With-height Sound Reproduction" Journal of the Audio Engineering Society, Vol. 21 No. 1, Jan/Feb 1973, pp. 2-10.
[18] Gerzon, M.A. 'Ambisonics Part two: Studio Techniques' Studio Sound, August 1975, pp. 24-30.
[19] Bamford, J.S. "An Analysis of Ambisonic Sound Systems of First and Second Order" MSc in Physics Thesis submitted to the University of Waterloo, Waterloo, Ontario, Canada 1995.
[20] Daniel, Jérôme, Rault, Jean-Bernard and Polack, Jean-Dominique 1998 'Ambisonics Encoding of Other Audio Formats for Multiple Listening Conditions', preprint no. 4795, 105th Audio Engineering Society Convention, Sept. 1998. (Corrected version available by contacting the authors at Centre Commun d'Études de Télédiffusion et Télécommunications, Cesson-Sévigné, France)
[21] Vercoe, Barry "The Csound Manual" version 3.48, Media Lab, M.I.T. Edited by Jean Piché, Université de Montréal.