ICMC, page 222

Sine Circuitu
10,000 high quality sine waves without detours

Cor Jansen
APPLICA
Koolemans Beynenstraat 54
NL-6521 EW Nijmegen
The Netherlands

Abstract

Additive (Fourier) synthesis is one of the most powerful synthesis models, since it allows the creation and subtle control of a rich variety of timbres. Current systems for this kind of synthesis have two major drawbacks: high cost and problems with the management of the large number of control parameters needed. An additive synthesis system without these drawbacks is presented, capable of generating up to ten thousand high quality sine waves in real time. The management problem is solved by allowing the user to design a sound by interconnecting modules in a graphic editor. These modules consist of low-level signal processing operations such as adders and multipliers, as well as high-level audio treatments such as modulators, filters, reverberators and pitch shifters. All modules define operations on frequency spectra. By working in the frequency domain, the user can define control parameters which have an intuitive perceptual effect. The user interactively manipulates the sound with graphical controls on the screen, or with external control devices. The problem of high cost is eliminated by using state-of-the-art technology. A number of transputers are used to perform the operations of the modules in real time, and the resulting spectra are generated using special hardware.

Introduction

The main purpose of today's synthesis methods seems to be the creation of sound using as few parameters as possible, while still allowing a large variation in timbre. However, the more a parameter influences timbre, the more it reveals the characteristics of the synthesis model. Worse, in some methods parameter changes have no intuitive perceptual effect. Additive synthesis does not have these drawbacks.
Adding a large number of amplitude- and frequency-controlled sinusoids theoretically allows the creation of any sound imaginable, but the large number of control parameters is known to be one of its major problems.

Additive synthesis has the advantage that a corresponding analysis method is known, which translates audio signals (time domain) into a set of sinusoidal components (frequency domain). These components can be manipulated before being fed into the additive synthesis system. Manipulations in the frequency domain often require less computational power than the corresponding manipulations in the time domain. An obvious example is filtering. Very complex, dynamically changing filters can be implemented easily in the frequency domain, whereas in the time domain this is difficult or even impossible.

The user of the Sine Circuitu need not be aware of the details of the additive synthesis method, since familiar modules such as filters, envelope generators, waveform generators, delay lines and modulators can be used.

Requirements

Theoretically, a periodic (harmonic) sound with a bandwidth of 16 kHz and a fundamental frequency of 55 Hz requires up to 290 sinusoids. Normally the amplitude of most of the higher harmonics will be below the perceptual threshold, and the actual number of sinusoids can be made much smaller. According to Samson, we need 20 to 30 sinusoids to generate the sound of one string of a bowed instrument [Samson 1985]. Moorer uses at most 21 sinusoids to represent cello, clarinet and trumpet tones [Moorer 1977].

If we wish to generate noise using additive synthesis, the actual number of sinusoids required depends on the bandwidth and the duration of the noise. According to Raadgever, we can fill the desired bandwidth with equally spaced sinusoids at 16 Hz intervals to avoid interference that manifests itself as audible tones [Raadgever 1989]. For very short noise bursts this may be satisfactory. For longer durations the distance between two adjacent sinusoids should be much smaller because of the audible repetition rate of the resulting noise. Randomizing, or slightly modulating, the frequency or amplitude of the sinusoids might overcome this problem.

High quality sound, comparable to the quality of compact disc technology, implies a sample rate of 44.1 kHz or higher and a dynamic range of about 96 dB or more. To get such a dynamic range we need to represent the output signal in at least 16 bits, although 16 bits for each sinusoid is not needed. A listening test at CCRMA showed that a 4K by 12 bit sine table was perceived to be just as distortion-free as a 64K by 16 bit table [Snell 1977]. A table of 4K by 12 bits corresponds to a signal to noise ratio of 60.4 dB [Moore 1977]. However, when a large number of sinusoids are added within a small frequency range, the accumulated noise is not masked by the sinusoids because of the different spectral characteristics. Therefore we use a larger sine table, to get a better signal to noise ratio for each sinusoid.

According to Rakowski it is possible, under ideal listening conditions, to notice a pitch change of between 0.03 Hz and 0.08 Hz around 160 Hz [Rakowski 1971]. Moreover, if two sinusoids with frequencies close to each other interfere, they will introduce a beat. The frequency of this beat (the difference in frequency of the two sinusoids) must be controllable in inaudible steps. Therefore a resolution finer than 0.03 Hz is recommended.

It is commonly quoted that changes in sounds faster than about 1 ms are indistinguishable. However, when using an update rate of 1 kHz, theoretically one should filter the envelope to get rid of the high frequency components. Using an envelope update rate lower than the sample rate without filtering can cause an annoying "chirp" [Strawn and Roads 1985].
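The counts quoted in this section can be checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, the 16 kHz noise bandwidth is an assumption borrowed from the harmonic example, and the dynamic range uses the usual approximation of 20*log10(2^bits) for linear quantization.

```python
import math

def harmonics_in_band(bandwidth_hz, fundamental_hz):
    """Number of harmonics of the fundamental at or below the bandwidth."""
    return int(bandwidth_hz // fundamental_hz)

def sinusoids_for_noise(bandwidth_hz, spacing_hz):
    """Number of equally spaced sinusoids needed to fill a band."""
    return int(bandwidth_hz // spacing_hz)

def dynamic_range_db(bits):
    """Approximate dynamic range of linear quantization with the given word size."""
    return 20.0 * math.log10(2.0 ** bits)

# 16 kHz bandwidth with a 55 Hz fundamental (values from the text)
print(harmonics_in_band(16000, 55))      # 290
# Assumed 16 kHz noise band filled at 16 Hz intervals
print(sinusoids_for_noise(16000, 16))    # 1000
# 16-bit output word
print(round(dynamic_range_db(16), 1))    # 96.3
```

Under these assumptions, a single band of noise already needs about three times as many sinusoids as the richest harmonic tone, which is one reason a generator with thousands of sinusoids is of interest.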
An alternative to expensive low-pass filtering of the envelope is linear interpolation between pairs of envelope break points.

Hardware design

The Sine Circuitu is modular. It consists of a number of double extended Eurocard sized modules, communicating with each other via a special bus (SCBUS); see the figure below. A minimal system consists of two modules: the controller (SCCON) and the sine generator (SCGEN). The controller module handles the communication. It contains MIDI, RS232, RS422 and SCSI interfaces, and digital optical audio inputs and outputs which can operate at 32 kHz, 44.1 kHz or 48 kHz.

[Figure: control board on the SCBUS, with RS422 (AppleTalk), RS232 (general purpose), MIDI (4 in, 1 out), AES/EBU optical audio outputs (4 channels) and AES/EBU optical audio inputs (2 channels).]

In addition, the controller module holds up to 16 transputers to calculate the spectra in real time. It sends the information for the generation of the sinusoids via the SCBUS to the sine generator modules. A sine generator module is capable of generating 625 sinusoids with linearly interpolated frequency and amplitude envelopes at a 44.1 kHz sample rate. The envelopes are controlled by an on-board transputer. The output of this module is sent back, in the form of packets holding a variable number of sinusoids, to the controller module, which has the necessary I/O hardware (e.g. digital audio output). A system can contain up to 17 modules on the backplane, for instance 1 controller and 16 sine generator modules, yielding 625*16 = 10000 sinusoids at a 44.1 kHz sample rate.

Sine-generator

The basic principle of the sine-generator is rather straightforward (see figure below). It shows some similarity with the Digital Oscillator design of John Snell [Snell 1977].

[Figure: block diagram of the sine-generator, connected to the SCBUS.]

By using special pipeline techniques, integrated circuits of the newest technology and surface-mounted devices (SMD), it is possible to generate and accumulate one sinusoid every 35 ns, with a maximum of 1024 sinusoids. In these 35 ns, a number of computations take place in parallel: a new amplitude-envelope value, a new frequency-envelope value and a new index for the sine-wave table are calculated, a new sine-wave value is read, a sine-wave value is multiplied by the amplitude, and the resulting sinusoid is accumulated.

The output of the sine-generator consists of 1 to 128 packets, each consisting of 1 or more sinusoids. Each of the sinusoids can be allocated to one of the packets, which enables us to dynamically allocate 'free' sinusoids to the packets where they are needed most. The output consists of 24-bit signals. Internally the sinusoids are accumulated with 40-bit accuracy to minimize accumulation of roundoff and quantization errors, and to avoid overflow errors if a signal temporarily cannot be represented within the 24-bit outputs. The amplitudes and changes in amplitude are represented by 24-bit and 20-bit values respectively, to make both very slow and very fast envelope slopes possible, although 16 bits are used for the multiplication with the sine wave. The frequency is represented by 28-bit values, giving a frequency resolution of 0.00016 Hz. The envelope break point resolution of the amplitude and frequency envelopes is about 0.7 ms. The size of the sine-wave table is 32K words of 16 bits, giving a signal to noise ratio of about 78.4 dB [Moore 1977] for individual sine waves.

Software design

The basic idea is to allow the user to manipulate sound by graphically interconnecting modules which operate on frequency spectra, and by operating control buttons, sliders, etc. [Desain 1986]. Just a small set of basic modules will be needed. With this set, more complex modules can be created and connected in a hierarchical structure. The transputers on the controller module perform the operations of these modules in real time and send the resulting spectra (frequency and amplitude envelopes) to the sine-generator(s). A scheme of automatic process migration ensures an even balance of the computational load across the transputers. An analysis method will be provided as an input module to enable manipulation of recorded sounds.

Status and availability

The hardware as described above has been developed and built, with the exception of the SCSI interface. Some low-level software has been written to make the system function. At this moment it is possible to generate sound by downloading frequency and amplitude envelopes. The transputer software that manipulates the frequency spectra, as well as the graphic editor, is under development. An analysis tool based on the phase vocoder is under development as well, and will be available next year.
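Generating sound from downloaded frequency and amplitude envelopes amounts to running a bank of table-lookup oscillators. The Python sketch below imitates the steps listed for the sine-generator (envelope update, phase accumulation, table lookup, multiplication by the amplitude, accumulation) in floating point. It is an illustration only, not the hardware's arithmetic: all names are hypothetical, reading the 28-bit frequency value as a per-sample phase increment is an assumption, and the envelopes are interpolated linearly between one pair of break points.

```python
import math

SAMPLE_RATE = 44100                 # Hz, from the text
PHASE_BITS = 28                     # assumed width of the phase accumulator
TABLE_SIZE = 32 * 1024              # 32K-word sine table, from the text
PHASE_MASK = (1 << PHASE_BITS) - 1
TABLE_SHIFT = PHASE_BITS - 15       # the top 15 phase bits index the table

SINE_TABLE = [math.sin(2.0 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def frequency_resolution():
    """Smallest frequency step of a phase accumulator of PHASE_BITS bits."""
    return SAMPLE_RATE / (1 << PHASE_BITS)

def render_segment(partials, n_samples):
    """Sum all partials over one envelope segment.

    Each partial is a tuple (f0, f1, a0, a1): frequency in Hz and amplitude
    at the start and at the end of the segment.
    """
    out = [0.0] * n_samples
    for f0, f1, a0, a1 in partials:
        phase = 0
        for n in range(n_samples):
            t = n / n_samples
            freq = f0 + (f1 - f0) * t          # interpolated frequency envelope
            amp = a0 + (a1 - a0) * t           # interpolated amplitude envelope
            out[n] += amp * SINE_TABLE[phase >> TABLE_SHIFT]
            increment = int(round(freq / SAMPLE_RATE * (1 << PHASE_BITS)))
            phase = (phase + increment) & PHASE_MASK
    return out

# Two partials over a 32-sample segment; the second one fades out.
segment = render_segment([(440.0, 440.0, 0.5, 0.5), (880.0, 880.0, 0.25, 0.0)], 32)
```

Under the phase-increment assumption, `frequency_resolution()` gives about 0.00016 Hz at 44.1 kHz, in line with the figure quoted for the 28-bit frequency values. As a side check, 625 sinusoids at 35 ns each take about 21.9 µs, which fits inside one 44.1 kHz sample period of about 22.7 µs.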
Conclusion

This paper presents a way of generating thousands of sine waves for additive synthesis in real time. We think that additive synthesis can only fully mature when its two major disadvantages are eliminated: cost and controllability. We hope that the Sine Circuitu will contribute to that.

Acknowledgements

Much of this paper was worked out in conjunction with Peter Desain. Especially in the software development, Peter's ideas have been a great help.

References

Desain, P. 1986. Graphical Programming in Computer Music, a Proposal. Proceedings International Computer Music Conference 1986.
Moore, F. R. 1977. Table Lookup Noise for Sinusoidal Digital Oscillators. Computer Music Journal 1(2):26-29.
Moorer, J. A. 1977. Signal Processing Aspects of Computer Music: A Survey. Proceedings of the IEEE 65:1108-1137.
Raadgever, J. 1989. Institute of Acoustical Perception, Technical University Delft. Personal communication.
Rakowski, A. 1971. Pitch discrimination at the threshold of hearing. Proceedings of the Seventh International Congress on Acoustics, vol. 3.
Samson, P. R. 1985. Architectural issues in the design of the Systems Concepts digital synthesizer. In: Strawn, J. (ed.) Digital Audio Engineering: An Anthology. Los Altos, Calif.: Kaufmann.
Snell, J. 1977. Design of a Digital Oscillator That Will Generate up to 256 Low-Distortion Sine Waves in Real Time. Computer Music Journal 1(2):4-25.
Strawn, J. and Roads, C. 1985. Overview of part II. In: Strawn, J. and Roads, C. (eds) Foundations of Computer Music: 191-205.