Optimisations of the FOF Algorithm for VLSI Implementation

J. R. Spanier (J.R.Spanier@durham.ac.uk), S. Johnson (Simon.Johnson@durham.ac.uk), A. Purvis (Alan.Purvis@durham.ac.uk)
Durham Music Technology Group, School of Engineering, University of Durham, Durham DH1 3LE, UK

ICMC Proceedings 1996

Abstract

In this paper we investigate the FOF wave function approach to parallel formant synthesis, with the aim of implementing the algorithm in silicon. We present a hybrid wavetable/filter structure, which provides the benefits of the analytical expression for the FOF generator and the computational efficiencies of filter topologies. Finally, the communication protocol between the FOF synthesis engine and a host processor is addressed.

1 Introduction

The Forme d'Onde Formantique (FOF) algorithm is a source-filter synthesis model capable of emulating the singing voice [Rodet, 1984]. The source is simply an impulse train and the filter response is a modified decaying sinusoid (a formant). In this paper we investigate the FOF wave function approach to parallel formant synthesis, with the aim of implementing the algorithm in silicon. The analytical expression for the formant wavefunction is the following:

$$
s(t) =
\begin{cases}
\tfrac{1}{2}\bigl(1 - \cos(\beta t)\bigr)\, e^{-\alpha t} \sin(\omega t + \phi), & 0 \le t \le \pi/\beta \\
e^{-\alpha t} \sin(\omega t + \phi), & t > \pi/\beta
\end{cases}
\qquad (1)
$$

where $\alpha = \pi \times \text{bandwidth}$ and $\omega = 2\pi f_c$; $\beta$ sets the attack duration $\pi/\beta$ and $\phi$ is the initial phase.

The two most common implementations of FOF are wavetable-based, as found in CHANT and CSOUND, and filter-based, as found in CHANT and the Samson Box. The problem with the wavetable-based approach, as applied to VLSI implementation, is the unbounded nature of its computation and storage requirements. A single-chip silicon engine requires all computation to be completed within one sampling period, and all storage should be on chip. The filter-based approach is more amenable to silicon, but suffers from the simplification of the FOF wavefunction to a simple second-order filter. Overcoming this difficulty requires a more complex filter with properties similar to the FOF wavefunction.

2 FOF Optimisations
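As a point of reference for what follows, the analytical wavefunction of equation (1) can be evaluated directly. The sketch below is a minimal Python illustration and not the authors' implementation. It assumes the standard FOF form with a raised-cosine attack lasting $\pi/\beta$ seconds followed by a plain decaying sinusoid, and a value of zero before the impulse; the function names, argument names and example values are hypothetical.

```python
import math


def fof_sample(t, centre_freq, bandwidth, beta, phase=0.0):
    """Evaluate the FOF wavefunction of equation (1) at time t (seconds).

    centre_freq and bandwidth are in Hz; beta (rad/s) sets the attack,
    which lasts pi / beta seconds. All argument names are illustrative.
    """
    if t < 0.0:
        return 0.0  # assumption: the grain is silent before its impulse
    alpha = math.pi * bandwidth          # alpha = pi * bandwidth
    omega = 2.0 * math.pi * centre_freq  # omega = 2 * pi * fc
    value = math.exp(-alpha * t) * math.sin(omega * t + phase)
    if t <= math.pi / beta:
        # Raised-cosine attack, applied only during the first pi/beta seconds.
        value *= 0.5 * (1.0 - math.cos(beta * t))
    return value


def fof_grain(centre_freq, bandwidth, beta, sample_rate, n_samples, phase=0.0):
    """Sample one FOF grain at a fixed rate (a wavetable-style rendering)."""
    return [
        fof_sample(n / sample_rate, centre_freq, bandwidth, beta, phase)
        for n in range(n_samples)
    ]


if __name__ == "__main__":
    # Assumed example values: 600 Hz formant, 80 Hz bandwidth, 3 ms attack.
    grain = fof_grain(600.0, 80.0, math.pi / 0.003, 44100.0, 2048)
    print(max(abs(s) for s in grain))
```

Rendering a grain this way makes the storage problem visible: the number of samples needed grows as the bandwidth shrinks, which is the unbounded requirement that motivates a filter-based structure.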
We now describe a hybrid architecture which makes use of the beneficial aspects of the two approaches and avoids the problems described. In this section we discuss the optimisations necessary for VLSI implementation in terms of the source (the impulse generator) and the filter (the formant).

2.1 The Impulse Generator

The periodic impulses used in FOF are driven by a pitch parameter. Each formant region has its own centre frequency, but this frequency is modulated via the pitch. The algorithm requires accurate control of the impulse generator: if an impulse falls within a sample period, the wavefunction must be advanced by the time difference between the impulse firing and the next sample instant. Failure to take this into account results in poor-quality synthesis. We must therefore generate the impulses at a high sampling rate and downsample the output to the required audio sampling rate. This technique is required in most speech synthesis algorithms [Klatt, 1980].

In this application, we create a phase accumulator having a single-bit output. This bit gates a register which holds the amplitude of the impulse: if the bit is one the amplitude is output, otherwise the output is zero. Using this simple scheme, octaviation [Clarke et al., 1988] and simple filter scaling can be achieved. The output is then passed through a decimator.

2.2 Filter Bank Structures

There are two ways of describing a formant as a filter:

• Implement the filter with an oscillatory component.
• Use the filter as an amplitude shaper and then heterodyne up to the formant frequency.

Similar methods exist for speech synthesisers [Linggard, 1985]. The latter provides manipulation of the formant frequency with no filter instabilities, whereas the former is computationally efficient.

One technique that has been investigated approximates the FOF waveshape [Phillips, 1995] by the function $t^2 e^{-\alpha t} \sin(\omega_0 t)$. This is achieved by convolving two rectangular windows, two sinusoids and a decaying sinusoid together. The resulting output is shown in figure 1. In the time domain the waveform appears to be a good approximation to the FOF structure. However, on further analysis we found that the frequency response is not smooth and that the oscillation is directly coupled to the window length. Both of these characteristics are undesirable for a FOF filter.

Figure 1: Impulse response describing $t^2 e^{-\alpha t} \sin(\omega_0 t)$ (horizontal axis: sample index)

Figure 2: Pole-zero plot of FOF filter (horizontal axis: real part)

One approach to preserving the spectral properties of the wavefunction is to z-transform the analytical expression. Further, we must guarantee that the filter will either have a truncated response [Depalle et al., 1992] or decay to zero. The latter response is desirable for a filter-based architecture. Truncation can be achieved by cancelling the filter response after a predetermined time using pole-zero cancellation techniques. One approach is to use tail-canceling IIR filters [Wang and Smith, 1994], but for this application the automatic approach to tail-canceling IIR filter calculation was unsuitable. An empirical form was found to satisfy the requirements. Equation 2 describes an exponential decay within a Hanning window.
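The pole-zero cancellation idea used for truncation can be illustrated with the simplest possible case, a single exponential decay cut off after a fixed number of samples. The sketch below is not the empirical filter of equation 2 and is not taken from the paper; it uses the textbook identity $H(z) = (1 - a^{N} z^{-N}) / (1 - a z^{-1})$, whose impulse response is $a^{n}$ for $0 \le n < N$ and zero afterwards. The function name and the example values are hypothetical.

```python
def truncated_decay_filter(x, pole, length):
    """Filter x with H(z) = (1 - pole**length * z**-length) / (1 - pole * z**-1).

    The impulse response is pole**n for 0 <= n < length and zero afterwards:
    the delayed, scaled feed-forward term cancels the tail of the one-pole decay.
    """
    tail_gain = pole ** length
    y = []
    previous = 0.0
    for n, sample in enumerate(x):
        # Input delayed by `length` samples (zero before the delay line fills).
        delayed = x[n - length] if n >= length else 0.0
        previous = pole * previous + sample - tail_gain * delayed
        y.append(previous)
    return y


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 39
    response = truncated_decay_filter(impulse, pole=0.9, length=20)
    print(response[:3], response[20:23])
```

The cancellation is exact only in exact arithmetic. With finite wordlengths a small residual remains after the nominal end of the response, which is one reason the scaling and arithmetic of such filters need care in an integer implementation.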
[Equation (2): the transfer function $H(z)$ of the envelope filter. The expression is illegible in the source text.]

Figure 2 shows the pole-zero plot of this filter for a window size of 20 samples and a bandwidth of 1 kHz. This filter topology meets the FOF requirements except when the bandwidth is very small, in which case a proper FOF envelope is required. Long decay times are useful in generating bell-like tones. A more complex filter has been designed; it is very similar to equation 2 but requires two extra multiplies and delay elements.

All the filters discussed require some scaling, especially if implemented in integer arithmetic, to overcome errors caused by the wordlength and arithmetic used. This is where the impulse generator amplitude selector becomes beneficial.

2.3 Formant Sine Generator

In the previous sections we have discussed the envelope components of a FOF generator; we now address the oscillatory part. In the proposed algorithm we multiply the output of the FOF envelope, generated by equation 2, by the output of a sinusoidal generator. A simple approach to the sinusoidal generator would be a table-lookup design, enabling independent control of the formant frequencies. In silicon, however, this would require a relatively large ROM to store the sine waveform. A better approach is to compute the sine function using the CORDIC algorithm [Hu, 1992], fed by a phase accumulator. This topology can be multiplexed, uses a small ROM and is completely stable, unlike filter-based oscillators. The sinusoid generator can also provide the amplitude envelope, either by utilising another multiplier or by summing two sinusoids before they are fed into the (heterodyning) multiplier.
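The arrangement of Section 2.3, a phase accumulator feeding a CORDIC sine computation, can be sketched as follows. This is a floating-point illustration under stated assumptions rather than the authors' design: the 32-bit accumulator width, the 16 iterations, the quadrant folding and all names are hypothetical choices. In hardware the multiplications by $2^{-i}$ would be shifts, and the arctangent table is the small ROM referred to in the text.

```python
import math

N_ITER = 16                      # assumed number of CORDIC iterations
ACC_BITS = 32                    # assumed phase accumulator width
ACC_MASK = (1 << ACC_BITS) - 1

# Small ROM: one arctangent entry per iteration.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]

# Pre-scale by the inverse CORDIC gain so the result has unit amplitude.
GAIN = 1.0
for i in range(N_ITER):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin(theta):
    """sin(theta) for theta in [-pi/2, pi/2] using rotation-mode CORDIC."""
    x, y, z = GAIN, 0.0, theta
    for i in range(N_ITER):
        d = 1.0 if z >= 0.0 else -1.0
        # Micro-rotation; both updates use the values from the previous step.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ATAN_TABLE[i]
    return y


def phase_increment(freq_hz, sample_rate):
    """Accumulator increment per sample for the requested frequency."""
    return int(round(freq_hz / sample_rate * (1 << ACC_BITS))) & ACC_MASK


def oscillator(freq_hz, sample_rate, n_samples):
    """Generate n_samples of a sinusoid from an integer phase accumulator."""
    increment = phase_increment(freq_hz, sample_rate)
    phase = 0
    out = []
    for _ in range(n_samples):
        theta = 2.0 * math.pi * phase / (1 << ACC_BITS)   # in [0, 2*pi)
        # Fold the full circle into the CORDIC convergence range.
        if theta > 1.5 * math.pi:
            angle = theta - 2.0 * math.pi
        elif theta > 0.5 * math.pi:
            angle = math.pi - theta
        else:
            angle = theta
        out.append(cordic_sin(angle))
        phase = (phase + increment) & ACC_MASK            # wraps naturally
    return out
```

Because the oscillator state is a single wrapping integer, there is no recursive signal path that could become unstable, which matches the stability argument made for this topology in the text.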

3 Internal Control Subsystem

It is well known that altering filter coefficients whilst the filter is still active can cause instability. The approach taken in our work is to implement at least 4 filter/oscillator combinations per FOF oscillator. Each of these combinations represents an equivalent FOF wavefunction, as found in CSOUND. Previous FOF implementations, most notably the Samson Box [Rodet et al., 1984], have been based on second-order bandpass sections, which require coefficient interpolation and raise the problem of de-phasing the filters against one another. Our approach is to freeze the coefficients at initialisation and update them at a slow rate. Our algorithm allows control over the skirt-width and initial phase of the formant oscillator, thus overcoming the problem of zeros in the spectrum.

Each filter-oscillator combination has a sample counter which tells the system whether the resource is available. The counter is implemented as a decrementing counter, which is reset to its initial value (the sample length of the filter) by the arrival of a new impulse and signals the allocation controller when it has been decremented to zero.

The allocation control subsystem, which controls the FOF sub-bank oscillators, is driven by a linked list. Provided there are adequate resources available, the chosen sub-bank is activated on arrival of an impulse. The list is updated to point to the next free resource and any new parameters are loaded into it. If all resources are used up, the system blocks all parameter updates except for amplitude and formant frequency. As soon as a resource's counter has been decremented to zero, the resource releases itself from the linked list and becomes available. The allocation unit is implemented on chip and exists for each formant. The block diagram of the FOF architecture is shown in figure 3.

[Figure 3 block diagram. Legible labels: Impulse Generator, Sub-banks 1-3, Sinusoidal Oscillator, FOF Filter Topology, Diagnostic Register]
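The allocation scheme described in Section 3 can be sketched in software as follows. This is a hedged illustration, not the on-chip design: a `deque` stands in for the linked list of free resources, and the class name, method names and parameter keys (`amplitude`, `formant_freq`) are hypothetical.

```python
from collections import deque

# Hypothetical keys for the two updates that are never blocked.
ALWAYS_ALLOWED = {"amplitude", "formant_freq"}


class SubBankAllocator:
    """Tracks which filter/oscillator sub-banks of one formant are free."""

    def __init__(self, n_banks=4):
        self.free = deque(range(n_banks))   # stand-in for the linked list
        self.counters = [0] * n_banks       # samples remaining per sub-bank
        self.params = [None] * n_banks      # parameters frozen at activation

    def on_impulse(self, grain_length, params):
        """Start a grain on the next free sub-bank; return its index or None."""
        if grain_length < 1:
            raise ValueError("grain_length must be at least one sample")
        if not self.free:
            return None                      # no resource available
        bank = self.free.popleft()
        self.counters[bank] = grain_length   # counter reset by the impulse
        self.params[bank] = dict(params)     # coefficients frozen here
        return bank

    def accepted_updates(self, changes):
        """Return the subset of a host update that is accepted right now."""
        if self.free:
            return dict(changes)
        # All sub-banks busy: only amplitude and formant frequency pass.
        return {k: v for k, v in changes.items() if k in ALWAYS_ALLOWED}

    def tick(self):
        """Advance one sample; release sub-banks whose counter reaches zero."""
        for bank, remaining in enumerate(self.counters):
            if remaining > 0:
                self.counters[bank] = remaining - 1
                if remaining == 1:
                    self.free.append(bank)
```

For example, with two sub-banks and a grain length of three samples, two impulses in quick succession occupy both sub-banks, a third is refused, and after three calls to `tick()` the first sub-bank is offered again.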
4 External Control Interface

The FOF engine discussed requires a simple interface to a host microprocessor. To this end, a memory-mapped scheme is proposed. The memory map contains the impulse generator increment, offset and octaviation controls, which are common to the voice. All parameters for each FOF formant are also accessible. Consequently, the host interface hides the sub-bank oscillator structures from the user. It is also useful to have some feedback from the internal controller, and thus an output register is provided. The host input registers are double-buffered, to allow the DSP engine to take precedence over the data.

5 Conclusion

In this paper we have described a wavetable-like FOF VLSI engine which is bounded in memory and computation. We have overcome the problems associated with filter-based FOF algorithms by using a more complex filter structure. The control aspects of the algorithm for VLSI implementation have been addressed. The next stage of this work is to simulate the algorithm on a TMS320C40 DSP chip to test it, and then to progress to silicon design, testing and implementation.

References

[Clarke et al., 1988] Clarke, J., Manning, P., Berry, R., and Purvis, A. (1988). VOCEL: New implementations of the FOF synthesis method. In Proceedings of the Computer Music Conference, pages 333-348, Cologne, W-Germany.
[Depalle et al., 1992] Depalle, P., Matignon, D., and Stroppa, M. (1992). Source-filter formulation and analytic control of the skirtwidth of CHANT formant-wave-functions. In Proceedings of the International Computer Music Conference, pages 372-373, San Jose, California, USA.
[Hu, 1992] Hu, Y. (1992). CORDIC-based VLSI architectures for digital signal processing. IEEE Signal Processing Magazine, pages 16-35.
[Klatt, 1980] Klatt, D. (1980). Software for a cascade/parallel formant synthesizer. Journal of the Acoustical Society of America, 67(3):971-995.
[Linggard, 1985] Linggard, R. (1985). Electronic Synthesis of Speech. Cambridge University Press.
[Phillips, 1995] Phillips, D. (1995). Matlab FOF script and e-mail correspondence.
[Rodet, 1984] Rodet, X. (1984). Time domain formant-wavefunction synthesis. Computer Music Journal, 8(3):9-14.
[Rodet et al., 1984] Rodet, X., Potard, Y., and Barrière, J.-B. (1984). The CHANT project: From the synthesis of the singing voice to synthesis in general. Computer Music Journal, 8(3):15-31.
[Wang and Smith, 1994] Wang, A.-C. and Smith, J. (1994). On fast FIR filters implemented as tail-canceling IIR filters. TR Stan-M90, CCRMA, Dept. of Music, Stanford University, Stanford, CA 94305-8180. Obtained from ftp://ccrma-ftp.stanford.edu.

Figure 3: FOF Formant Block (legible block labels also include Linked List and FOF Filter-Based Wavefunction)