Page  372 ï~~Control of Frequency and Decay in Oscillating Filters Using Multirate Techniques Frode Holm Department of Computer Science Center for Computer Music Research and Composition University of California Santa Barbara, CA 93106 holmcs. ucsb. edu Abstract Algorithms and theory for controlling fiequency, bandwidth and decay-rate in the Karplus-Strong string model are presented. The methods rely on resampling the delay-line and the loop-gain filter instead of the more commonly used filters for fractional-delay and decay-stretching/shortening. This leads to systems with markedly different characteristics from those where a fixed rate is used. The main focus of this paper is the new parameterization that the multirate view leads to. 1. Introduction Traditionally, control of frequency in oscillating filter structures, like e.g. in the Karplus-Strong (KS) algorithm [Karplus and Strong, 1983], has been accomplished through a combination of changing the length of the delay-line and the use of an FIR or hR tuning filter for obtaining fractional delays [Jaffe and Smith, 1983], [Karjalainen and Lane, 1991]. Various tradeoffs, usually linear phase vs. flat frequency response, are involved in choosing between FIR and HR filters for this purpose. Comparatively little attention has been given to the technique of using sample-rate conversion for achieving the same end, although a good example can be found in [Vercoe, 1988]. On the surface, the main difference between the two approaches is that in the multirate case the delaylength is kept constant, with frequency controlled through familiar table lookup techniques. An interpolation step in the form of an FIR filter is usually required in this case as well to obtain sample-values at non-integer positions. The associated tradeoffs now concern signal-to-noise ratios and aliasing rather than phase and spectrum issues, since this step does not modify the signal in the delay-line. These topics will not be discussed further in this paper (for an in-depth study of aliasing the reader is referred to [Vaidyanathan, 1993]). What is perhaps not so well known is that with the mutlirate method one gets an additional parameter for "f"re" - the length of the delay-line can now be used to control bandwidth and decayrate without introducing additional filters. This paper outlines how to construct practical algorithms from these considerations. In the following (except section 2), linear interpolation is always assumed for the rateconversion. For more accurate results, a method such as that proposed in [Smith and Gossett, 1984] should be considered. 2. Frequency domain representation It can be shown, [Oppenheim and Schafer, 1989] and [Crochiere and Rabiner, 1983], that if the output sample-rate is kept constant, the effect of changing the sample-rate of a signal by a rational factor f = MIL, is to scale its spectrum by this factor (assuming aliasing constraints are met). Thus, if we have a signal, represented by y(n), and resample it through y(nf) (assuming a perfect interpolation), the frequencies in y will be shifted accordingly. This is illustrated in fig.l. The gain factor c depends on the interpolation step used with XL (sample-rate increase). XL (e) _cX(eJ Â~L), Y(ejcÂ~) U XM(e""Â~) = XL (eJWIM) = ce lf ) Of course, this fact is well-known from the table lookup oscillator and commercial samplers, but what is important to note here is that this reasoning also applies to whole systems, including recursive ones [Holm, 1994]. If we take the classic K-S algorithm as an example and "resample" the Sound SynthesisTechniques 372 ICMC Proceedings 1994

Page  373 ï~~averaging filter, its frequency response, the first quadrant of a cosine, will be scaled as indicated in fig 3. -Ir I-., R F= f Hz, p+lt2 F(p+1/2) R (2) f. f1 i 1 / \ f f This is the same formula as the one used for calculating the frequency increment in a table lookup oscillator. In this picture, F is now independent of both p and R, since any changes in these quantities will be compensated for by f. Assuming a constant output sampling rate R, we now need to know what the effect of changing p has. To answer this, it is sufficient to consider the scaling of the frequency response of the K-S averaging filter byf. This filter falls off to 0 at the Nyquist rate, 13=R/2. By resampling, this frequency becomes 2 n -n f<1 (tD2fT I. i /..... 1 -2n -19 P 7 2n Fig.1. Frequency representation of sample-rate conversion by a factor off. f3 is cutofffreq. of yO. 3. Karplus-Strong revisited A straightforward way to implement K-S using sample-rate change, is to use table lookup to get output samples from the delay-line and for each pass through the delay, i.e. for each fundamental period, ensure that all the samples in the delay-line have been filtered. This can either be done concurrently with the lookup operation or as a separate step. The former is illustrated in fig.2 and an example of the latter can be found in the 'pluck' unit generator in CSOUND [Vercoe, 1988]. p 4n 4 y(O) y fn]}p) y.f(n- l)J ytfnJ) 1 1,Filter these samples at time n. Fig.2 Filtering in lock-step with lookup, f>1. In the original K-S algorithm the fundamental frequency, F. is given by (with sampling rate R and delay length p): R F(p+l/2) 2 (3) Fig.3. Change offilter response by resampling. Freq. response of averaging filter is/fcos(rtFlR)f. Freq. response of resampled filter is /cos(r.FRf)[. f>I p p Fig.4. Filter response after 5 periods. It now becomes clear that by varying p, first of all, the intial bandwidth of the resulting signal is affected. A larger value of p means a wider filter response applied to the signal used to fill the delay buffer at start-up time, usually white noise. Furthermore, as this filter is applied over and over in the course of oscillation, higher frequencies take a longer time to decay the larger p is (and vica versa). This is readily verified by contemplating fig.3 & 4 and eq. (3). It may be a fortuitous circumstance that this parameter has a behaviour that corresponds so well with the physical reality observed in strings and then, maybe not (physical modelling gums: take note!). The harder the pluck R F,= Hz p+l1/2 (1) With resampling by a factor of f, we get ICMC Proceedings 1994 373 Sound SynthesisTechniques

Page  374 ï~~(by increasing p), the more initial high-frequency energy and the longer the decay of the harmonics - the more gentle the pluck (lower p) becomes, the more the opposite is true. In [Moore 1990] the decay characteristics for the basic K-S algorithm is tabulated. To obtain the decay values in the multirate view, the same reasoning as that detailed in [Jaffe and Smith, 1983] can be applied, bearing in mind the required variable changes. The attentuation factor a(t) is given by:.,tFp a(t) = cos( )P+112 Rf (4) Scos( n/p)tF Values for the time it takes the fundamental to decay -40dB is listed in table I for various values of p and F. pOF 50 100 500 1000 30 17.6 8.8 1.8 0.9 50 48.0 24.0 4.8 2.4 100 189.4 94.7 18.9 9.5 200 752.1 376.1 75.2 37.6 Table 1.40 dB decay in sec. for fundamental F. Since changing p determines bandwidth and decay-rate in (3) given F, the same must also be true for the fundamental frequency F if p is kept constant. In a practical implementation p must consequently be adjusted with pitch accordingly to obtain consistent timbres. 4. Variations on a theme. By extending the reasoning and methods suggested above it is possible to obtain more degrees of freedom with only a negligible increase in computational costs. In particular, it is straightforward to separate F completely from the control of the decay-rate (but not the initial bandwidth). This can be done by introducing an additional parameter, G Hz, which specifies how many times per second the filter must complete a loop. The rate at which the low-pass filter is applied, and consequently the speed 01 the decay, is now an independent quantity. In pseudo-C such an algorithm could look something like the following (g is the table-increment associated with G). Please note that the quoted statements encapsulates a whole range of operations concerning modulus table-access, interpolation, possible anti-aliasing filters etc. that would have to be addressed in a real implementation. for (n=O; n < Duration; n++) { 1/ Compute filter at rate g. // See fig.2. fl() = flooro. for (i=fl ((n-i) *g); i<fl (n*g); i++) "y[i] = 0.5*(y[i-p]+y[i-p-l])"; // Compute output at rate f "output y[n*f]"; } There is only one slight complication to this scheme, which is due to the fact that the averaging filter delays the signal period by a 1/2 sample each time through the loop. This must be compensated for in calculating the frequency increment f in order to avoid a "doppler" effect. For the two-point averaging filter, eq. (2) must be recast to: Gl1 f=F(p+----)IR F 2 (5) i.e. the delay in the buffer due to the filter is now determined by the relative speeds of the two operations, expressed by the ratio GIF. To use this algorithm in a practical setting, one would first choose p, given a fundamental F, to set the desired initial bandwidth, then choose an appropriate value for G to set the decay-rate according to (4) with G replacing F. One point that some might object to, is the fact that one period of the output waveform now could contain a mixture of 2 or even more "generations" of the filter. For realistic settings of the various parameters involved, this does not seem to have any objectionable audible consequences (that I can hear), but some caution might be in order. 5. Efficiency issues In the original K-S implementation, the computational cost is fixed, given a sampling rate R. With multirate techniques this is no longer true, as the loop-filter in this case operates independently ofR. Since an operation comparable to linear interpolation at R times per sec. is involved for frequency control in both cases, efficiency for the two methods, without decay control, will be identical when R=F(p+112). Depending on how bright one needs the string to Sound SynthesisTechniques 374 ICMC Proceedings 1994

Page  375 ï~~sound, the cost with multirate methods will be either less (low bandwidth) or more (high bandwidth). If we add in decay-control, the balance shifts somewhat in favor of the multirate case since the only added cost here is the loss of some minor optimization possiblities, whereas with original KS, the loop-filter becomes twice as expensive [Jaffe and Smith, 1983]. Now the two methods will be approximately equal in cost when 2R=G(p+112). Again, depending on the desired decay-rate value, the multirate method may be more or less efficient. 6. Extensions It remains to be seen how useful multirate techniques can be when considering larger and more complex systems, such as those found in complete instrument or vocal tract simulations. The parts of such systems that control stationary formants will most likely not benefit much from resampling since, as we have seen, this would move the spectral shape of the affected filters. Still, successful use of such methods in the physical modelling field have been reported, [Wu, 1987] and [Wright and Owens 1993], and it is likely that more examples and theory will be forthcoming in the near future. 7. Summary Methods for controlling frequency, bandwidth and decay-rate in the Karplus-Strong string synthesis model using multirate algorithms were presented. It was shown how this leads to a different parameterization of the model and how this makes filters for tuning and decay-shortening/stretching unnecessary, thus eliminating problems like nonlinear phase and amplitude response that is not flat. Instead issues of table-lookup noise and aliasing have to be addressed. Two possible algorithms based on these considerations were outlined. It was also argued that, on average, the two methods have the same computational cost, the exact measure depending on actual parametersettings which determine number of operations per sec. in the multirate case. One method is not necessarily '%etter" than another, except in situations where application constraints may favor one over the other. References [Crochiere and Rabiner, 1983] Ronald E. Crochiere and Lawrence R. Rabiner. Multirate Digital Signal Processing. Prentice Hall, Englewood Cliffs, N.J., 1983. [Holm, 1994] Frode Holm. Ph.D. thesis in progress. Dept. of Computer Science UCSB, 1994. [Jaffe and Smith, 1983] David Jaffe and Julius O. Smith. "Extensions of the Karplus-Strong Plucked String Algorithm". Computer Music Journal, 7(2) pp. 43-55, 1983. The Music Machine, Roads, C.,ed., MIT Press, 1989. [Karjalainen and Laine, 1991] Matti Karjalainen and Unto K.Laine. "A Model for Real-time Sound Synthesis of Guitar on a Floating-point Signal Processor". Proc. Int. Conf. Acoustics, Speech & Signal Processing (ICASSP '91) pp 3653-3656, 1991. [Karplus and Strong, 1983] Kevin Karplus and Alex Strong. "Digital Synthesis of PluckedString and Drum Timbres". Computer Music Journal, 7(2) pp. 43-55, 1983. The Music Machine, Roads, C.,ed., MIT Press, 1989. [Moore, 1990] F. Richard Moore. Elements of Computer Music. Prentice Hall, Englewood Cliffs, NJ., 1990. [Oppenheim and Schafer, 1989] Alan V. Oppenheim and Ronald W. Schafer. Discretetime Signal Processing. Prentice Hall, Englewood Cliffs, N.J., 1989. [Smith and Gossett, 1984] Julius O. Smith and P. Gossett. "A Flexible Sampling-Rate Conversion Method". Proc. Int. Conf. Acoustics, Speech & Signal Processing (ICASSP '84) pp 19.4.1-19.4.2, 1984. [Vercoe, 1988] Barry Vercoe. CSOUND manual pp.28 and 'ugens4.c' source code. Media Laboratory, M.I.T., 1988. [Vaidyanathan, 1993] P. P. Vaidyanathan. Multirate Systems and Filter Banks. Prentice Hall, Englewood Cliffs, NJ., 1993. [Wright and Owens, 1993] G.T.H. Wright and FJ.Owens. "An optimized Multirate Sampling Technique for the Dynamic Variation of Vocal Tract Length in the Kelly-Lochbaum Speech Synthesis Model". IE:E Trans. on Speech and Audio Processing, pp.109-113, 1993. [Wu, 1987] H.Y.Wu, P. Badin, Y.M. Cheng and B. Guerin. "Vocal Tract Simulation: Implementation of Continuous Variations of the Length in a Kelly-lochbaum Model Effects of Area Function Spatial Sampling". Proc. nt. Conf. Acoustics, Speech & Signal Processing (ICASSP '87) pp 1.4.1-1.4.4, 1987. ICMC Proceedings 1994 375 Sound SynthesisTechniques

Page  376 ï~~Connections between Feedback Delay Networks and Waveguide Networks for Digital Reverberation Julius 0. Smith Center for Computer Research in Music and Acoustics (CCRMA) Stanford University Email: Davide Rocchesso Centro di Sonologia Computazionale Dipartimento di Elettronica e Informatica University degli Studi di Padova Email: Abstract The order N feedback delay network (FDN) has been proposed for digital reverberation. Also proposed with similar advantages is the digital waveguide network (DWN). This paper notes that the FDN is isomorphic to a (normalized) waveguide network consisting of one (parallel) scattering junction joining N reflectively terminated branches. This correspondence gives rise to new generalizations in both cases. The feedback delay network (FDN), depicted in Fig. 1, has been proposed for digital reverberation applications (see the companion paper [Rocchesso and Smith, 1994] for references). These structures are characterized by a set of delay lines connected in a feedback loop through a "feedback matrix." a2,1 a2,2 a2,3 a]3,1 a3,2 a3,3 D t eg t s(n) h DW ma i ey t b _ms s$ n [c - Z d t Figure 1: Order 3 Feedback delay network. Digital waveguide networks (DWN) have also been proposed as a starting point for digital reverberator development [Smith, 1985]. Like FDNs, DWNs make it easy to construct high-order lossless systems. Ordinarily, the lossless prototype reverb erat or is judged for the quality of white noise it generates in response to an impulse signal. For smooth reverberation, the white noise should sound uniform in every respect. Subsequent introduction of lowpass filters into the prototype network (e.g., applied to junction pressure) serves to set the desired reverberation time vs. frequency. Since FDNs and DWNs appear to present very different approaches for constructing lossless prototypes, it is natural to ask what connections may exist between them, and whether there may be unique advantages of one over the other. Figure 2 illustrates an N-branch DWN which is structurally equivalent to an N-th order FDN. The waves traveling into the junction are associated with the FDN delay line outputs si(n), and the length of each waveguide is half the length of the corresponding FDN delay line mi (since a traveling wave must traverse the branch twice to complete a round trip from the junction to the termination and back). When mi is odd, we may replace the reflecting termination by a unit-sample delay, or we may define the branch medium such that the speed of propagation is slightly faster in one direction. s1(, 7 S) AN(n+mN) a,(n) SN(n) Figure 2: Waveguide network consisting of a single scattering junction to which N branches are connected. The far end of each branch is terminated with a perfect, non-inverting reflection, indicated by a black dot. As discussed in greater detail in the companion paper, the delay-line inputs (outgoing traveling waves) are computed by multiplying the delayline outputs (incoming traveling waves) by the Nby-N feedback matrix A = [ai,J: N si(n + mi) = E ai,jsj(n) j=1 The above notation coincides with that used in the companion paper. By defining p+ = st(n), Sound SynthesisTechniques 376 ICMC Proceedings 1994

Page  377 ï~~p= = si(n + mi), and A = [a.,j], we obtain the more usual DWN notation p_- =Ap+ where p+ is the vector of incoming traveling-wave samples arriving at the junction at time n, pis the vector of outgoing traveling-wave samples leaving the junction at time n, and A is the scattering matrix associated with the junction. To obtain lossless FDNs, prior work has focused on unitary feedback matrices. A matrix A is said to be unitary if A*A = I, where '.' denotes transposition and complex conjugation. Since every lossless scattering junction provides a lossless FDN matrix, are all of these matrices unitary? The answer is immediately no: Unitary scattering matrices arise only in the case of normalized waves, e.g., pressure waves which are multiplied by the square-root of the wave admittance of the waveguide in which they travel [Smith, 1987]. In such cases, the scattering matrix can be expressed as a Householder reflection A = 2T/1III2-I, where _T = [ /'T,..., v'r, and 1i is the wave admittance in the ith waveguide branch. Unnormalized scattering junctions can be expressed in the form of an "oblique" Householder reflection A = 21pT/ (_T, I_) - I, where 1_T [1,...,1] and rT = [r1,...,FN]. Thus, p+ is reflected about 1 and scaled based on its "shadow" along [. From these forms, we see that all junctions of N physical waveguides require only O(N) computations and thus do not span all lossless scattering matrices without further generalization. What are all lossless scattering matrices? From basic physical principles, a scattering matrix is lossless if and only if the total active complex power is scattering-invariant, i.e., if p+*rp+ = p-*rpA*rA = r (1) where I' is a Hermitian, positive-definite matrix which can be interpreted as a generalized junction admittance. For unitary A, we have I' = I. In the case of N traveling pressure waves scattering at a "parallel" junction, we obtain r = diag(I1,..., PN). Unless all branch admittances are identical, the scattering matrix is never unitary. In general, the Cholesky factorization r' = U*U gives an upper triangular matrix U which converts A to a unitary matrix via similarity transformation: A*rA = r =~ A*U*UA = U*U =. A = I, where A = UAU-1. Hence, the eigenvalues of every lossless scattering matrix lie on the unit circle. When U is diagonal, a physical waveguide interpretation always exists with U = diag([). A generalized waveguide interpretation exists for all U via "power equivalent junctions" [Smith, 19871 in which U acts as an ideal transformer (in the classical network theory sense) on the the vector of all N waveguide variables. It readily follows from similarity to A that A admits N linearly independent eigenvectors. Conversely, assume IAI = 1 for each eigenvalue of A, and that there exists a matrix T of linearly independent eigenvectors of A. Then the matrix T diagonalizes A to give T-'AT = D =Â~, T*A*T-* = D*, where D = diag(A1,...,AN). Multiplying, we obtain T*A*T-*T-lAT = D*D = I = A*T-*T-'A = T-*T-1. Thus, Eq. (1) is satisfied for r = T-*T-l which is Hermitian and positive definite. We may summarize as follows: Theorem: A scattering matrix (FDN feedback matrix) A is lossless if and only if its eigenvalues lie on the unit circle and its eigenvectors are linearly independent. Thus, lossless scattering matrices may be fully parametrized as A = T-1DT, where D is any unit-modulus diagonal matrix, and T is any invertible matrix. It can be quickly verified that all scattering matrices arising from the intersection of N physical waveguides possess one eigenvalue equal to 1 (corresponding to all incoming waves being equal) and N - 1 eigenvalues equal to -1 (corresponding to equal incoming waves on N - I branches, and a large opposite wave on the remaining branch which pulls the junction pressure to zero). Since only a subset of all N-by-N unitary matrices is given by a physical junction of N digital waveguides, (e.g., consider permutation matrices), the FDN point of view yields lossless systems outside the scope of single-junction waveguide networks. On the other hand, since only normalized waveguide junctions exhibit unitary scattering matrices, the DWN approach gives rise to new classes of lossless FDNs. Moreover, by considering more than one scattering junction, the DWN approach suggests a far larger class of lossless network topologies. References [Smith, 1985] J.O. Smith. A New Approach to Digital Reverberation using Closed Waveguide Networks, ICMC-85, Vancouver. CORMA STAN-M-31. [Smith, 1987] J.O. Smith. Music Applications of Digital Waveguides, Report STAN-M-39, CORMA, 1987. (Contains [Smith, ICMC-85J.) [Rocchesso and Smith, 1994] D. Rocchesso and J.O. Smith (companion paper). Circulant Feedback Delay Networks for Sound Synthesis and Processing, ICMC94, Aarhus, Denmark. ICMC Proceedings 1994 377 Sound Synthesis Techniques