Parameter estimation for dual-polarization plucked string models

Axel Nackaerts, Bart De Moor, Rudy Lauwereins
Department Elektrotechniek-ESAT, Katholieke Universiteit Leuven
email: Axel.Nackaerts@esat.kuleuven.ac.be

Abstract

This paper describes an automated parameter extraction method for synthesis of plucked strings using digital waveguides. The method focuses on the determination of the parameters for the two orthogonal polarizations. We determine the parameters for the two polarization modes using a sub-band Hankel Singular Value Decomposition (sHSVD) algorithm.

1 Introduction

The goal of this paper is to present a method for the accurate determination of the parameters for a digital waveguide model of a plucked string instrument (see Smith (1992) and Karjalainen et al. (1998) for an overview of waveguide models applied to plucked string instruments). The plucked-string waveguide model is flexible enough to generate high-quality sound. The estimation of the parameters, however, is not an easy task, especially if one considers the dual-polarization case. Envelope-based methods (Karjalainen et al. 1998; Bank 2000) are able to determine the parameters of the two polarizations but become inaccurate when the frequency difference is small, the damping is high, or the amplitude difference between the two polarizations is large. To solve this, we use a modified Hankel Singular Value Decomposition to determine the system poles. The first part of the paper describes the algorithm itself; the second part presents results for synthetic and real guitar signals.

2 Parameter estimation

We are looking for a method to determine the parameters of a given waveguide model such that its output closely matches a recorded tone. A simple dual-polarization waveguide model is shown in figure 1. The end reflections $\mathcal{H}_{L1}$ and $\mathcal{H}_{L2}$ are frequency-dependent. The delay line lengths $L_1$ and $L_2$ are equal or almost equal.
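To make the structure concrete, the following minimal Python sketch models each polarization as one recursive delay loop and sums the two outputs. It is an illustration under strong simplifying assumptions, not the implementation used in this paper: the delays are integers, and a single frequency-independent gain stands in for each frequency-dependent end reflection. The names `string_loop` and `dual_polarization` are hypothetical.

```python
import numpy as np

def string_loop(excitation, delay, reflection, n_samples):
    """One polarization: y[n] = x[n] + reflection * y[n - delay]."""
    x = np.zeros(n_samples)
    m = min(len(excitation), n_samples)
    x[:m] = excitation[:m]
    y = np.zeros(n_samples)
    for n in range(n_samples):
        # Assumption: the end reflection is a plain gain, not a filter.
        feedback = reflection * y[n - delay] if n >= delay else 0.0
        y[n] = x[n] + feedback
    return y

def dual_polarization(excitation, delay_1, delay_2, r_1, r_2, n_samples):
    """Sum of two uncoupled loops, one per polarization."""
    return (string_loop(excitation, delay_1, r_1, n_samples)
            + string_loop(excitation, delay_2, r_2, n_samples))

impulse = np.array([1.0])
# Almost equal delay lengths give two slightly detuned, beating modes.
y = dual_polarization(impulse, 100, 101, 0.985, 0.995, 1000)
```

With an impulse as excitation, each loop returns a pulse train whose amplitude is multiplied by the reflection gain once per round trip, which is the decay that the parameter estimation has to recover.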
In the actual implementation, the end reflections are linear filters consisting of an interpolator (to implement the fractional part of the lengths $L_1$ and $L_2$), an allpass filter (to implement the non-harmonic behavior of the partials) and a lowpass filter.

Figure 1: The structure of the dual-polarization waveguide without coupling.

If the lengths $L_1$ and $L_2$ are kept constant, the output signal has the form

$$y(t) = \sum_{i=1}^{N} A_i e^{-a_i t} \sin(\omega_i t + \phi_i) \tag{1}$$

where $t$ represents time, $N$ the number of partials, $a_i$ the damping, $A_i$ the initial amplitude, $\omega_i$ the angular frequency and $\phi_i$ the initial phase of each partial. This is an approximation that takes into account neither the tension modulation non-linearity (Välimäki, Tolonen, and Karjalainen 1999) nor string coupling (Weinreich 1977; Tolonen, Välimäki, and Karjalainen 1998), but it is still capable of high-quality synthesis. We now consider a sampled version of this signal (sampling frequency $F_s$), written as

$$s[n] = \sum_{i=1}^{N} c_i V_i^{n} \tag{2}$$

in which $c_i \in \mathbb{C}$ are the complex amplitudes (carrying the initial amplitude and phase), $V_i = e^{-a_i + j\omega_i} \in \mathbb{C}$ are the poles (with $a_i$ and $\omega_i$ expressed per sample) and $n \in \mathbb{N}$ is the discrete time. One technique commonly used for the identification of exponentially damped sinusoids is based on the Hankel Singular Value Decomposition (HSVD) (Van Huffel et al. 1994). This method is analytically exact if the signal conforms to equation (1). We start by building a Hankel matrix from the signal of length $Q$:

$$H = \begin{bmatrix} s(1) & s(2) & \cdots & s(Q/2) \\ s(2) & s(3) & \cdots & s(Q/2+1) \\ \vdots & \vdots & & \vdots \\ s(Q/2+1) & s(Q/2+2) & \cdots & s(Q) \end{bmatrix} \tag{3}$$
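The construction of equation (3) can be sketched in a few lines of Python. This is only an illustration on a noise-free test signal with two arbitrarily chosen poles; the helper names `damped_exponentials` and `hankel_matrix` are hypothetical, and indices are zero-based in the code whereas the equation counts from one.

```python
import numpy as np

def damped_exponentials(c, poles, n_samples):
    """s[n] = sum_i c_i * V_i**n, as in equation (2)."""
    n = np.arange(n_samples)
    c = np.asarray(c, dtype=complex)[:, None]
    poles = np.asarray(poles, dtype=complex)[:, None]
    return (c * poles ** n).sum(axis=0)

def hankel_matrix(s):
    """(Q/2 + 1) x (Q/2) Hankel matrix of equation (3); Q must be even."""
    s = np.asarray(s)
    q = len(s)
    if q % 2:
        raise ValueError("signal length Q must be even")
    rows = np.arange(q // 2 + 1)[:, None]
    cols = np.arange(q // 2)[None, :]
    return s[rows + cols]          # entry (r, c) holds s[r + c]

# Two closely spaced, lightly damped poles (illustrative values only).
poles = np.exp(np.array([-0.010 + 0.50j, -0.012 + 0.52j]))
s = damped_exponentials([1.0, 0.5j], poles, 64)
H = hankel_matrix(s)
rank = np.linalg.matrix_rank(H, tol=1e-8)
```

For a noise-free signal the numerical rank of `H` equals the number of exponentials, which is why truncating the SVD to the dominant singular values isolates the signal poles.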

We then calculate the Singular Value Decomposition (SVD) of this matrix:

$$H_{(Q/2+1)\times Q/2} = U_{(Q/2+1)\times N}\,\Sigma_{N\times N}\,V^{T}_{N\times Q/2} \tag{4}$$

for a signal with $N < Q/2$ partials. Note that when noise is present, the matrix $\Sigma$ will be of size $Q/2 \times Q/2$. If we only want $M < N$ poles, we select the $M$ largest singular values and the corresponding singular vectors. This is equivalent to calculating a rank-$M$ approximation of the signal. We now construct a matrix $Z'$:

$$Z' = (U^{\uparrow})^{-1} U^{\downarrow} \tag{5}$$

where $U^{\uparrow}$ denotes the upper $Q/2$ rows of $U$ and $U^{\downarrow}$ the lower $Q/2$ rows. Since $U^{\uparrow}$ is not square, the inverse is taken in the least-squares (pseudo-inverse) sense. It is well known that the eigenvalues of $Z'$ give estimates for the poles of the system (Van Huffel et al. 1994):

$$\mathrm{eig}(Z') = (\hat{V}_1, \hat{V}_2, \ldots, \hat{V}_N). \tag{6}$$

The estimates for $\omega_i$ and $a_i$ of the sampled signal $s$ follow:

$$\hat{\omega}_i = \Im(\log(\hat{V}_i)), \tag{7}$$

$$\hat{a}_i = -\Re(\log(\hat{V}_i)). \tag{8}$$

To accurately determine the poles in the presence of noise, a long signal ($Q > 500$) is needed. Due to the size of $H$ this may lead to computational problems in the calculation of the SVD. These problems can be overcome by using a sub-band scheme that analyzes several critically sampled signals (with a reduced number of partials) instead of the original signal, as this replaces one large $H$ by several smaller ones (Halder and Kailath 1997). Applying a short-time Fourier Transform (STFT) of length $L$ to signal $s$ yields an $L$-band representation

$$S_m[k] = \sum_{n=0}^{L-1} e^{-j2\pi mn/L}\, s[n+kL] = \sum_{i=1}^{N} \sum_{n=0}^{L-1} e^{-j2\pi mn/L}\, c_i V_i^{\,n+kL} \tag{9}$$

where $m = 0, 1, \ldots, L-1$ and $k$ is the number of the time frame. Each sub-band can thus be written as a sum of exponentially damped sinusoids (with poles $V_i^{L}$), and therefore conforms to equation (2). This implies that we can perform the HSVD algorithm on each sub-band of the STFT (one sub-band is the sequence of complex samples formed by selecting one bin of the STFT over all the time frames). This is illustrated by figure 2. For each sub-band, the sampling frequency is $F_s^{(m)} = F_s/L$. We obtain an estimate for the poles ($\hat{V}_i^{(m)}$) in that sub-band.
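The steps of equations (3) to (9) can be sketched in Python as below. This is a minimal illustration, not the authors' code: it assumes a noise-free complex test signal, non-overlapping rectangular STFT frames, and illustrative values for the sampling rate, frame length and partial parameters. The function names `hsvd_poles` and `shsvd_bin` are hypothetical. The last lines of `shsvd_bin` map the sub-band poles back to frequency (here in Hz rather than rad/s) and damping on the original time scale, following equations (10) and (11).

```python
import numpy as np

def hsvd_poles(x, n_poles):
    """Estimate n_poles signal poles of x via equations (3) to (6)."""
    x = np.asarray(x)
    q = len(x) - len(x) % 2                       # use an even length Q
    idx = np.arange(q // 2 + 1)[:, None] + np.arange(q // 2)[None, :]
    U, _, _ = np.linalg.svd(x[idx], full_matrices=False)
    U = U[:, :n_poles]                            # rank-M signal subspace
    # Solve U_up @ Z = U_down in the least-squares (pseudo-inverse) sense.
    Z, *_ = np.linalg.lstsq(U[:-1], U[1:], rcond=None)
    return np.linalg.eigvals(Z)

def shsvd_bin(s, fs, L, m, n_poles=2):
    """Frequencies (Hz) and dampings (1/s) of the partials in STFT bin m."""
    K = len(s) // L
    frames = np.asarray(s)[:K * L].reshape(K, L)  # non-overlapping frames
    sub_band = np.fft.fft(frames, axis=1)[:, m]   # S_m[k], equation (9)
    log_poles = np.log(hsvd_poles(sub_band, n_poles))
    fs_sub = fs / L                               # sub-band sampling rate
    freq = log_poles.imag * fs_sub / (2 * np.pi) + m * fs / L   # eq. (10)
    damping = -log_poles.real * fs_sub                           # eq. (11)
    order = np.argsort(freq)
    return freq[order], damping[order]

# Assumed test case: two polarizations of one partial, 0.7 Hz apart.
fs, L = 8000.0, 64
t = np.arange(200 * L) / fs
true_freq = np.array([1000.4, 1001.1])
true_damping = np.array([1.0, 2.5])
s = sum(A * np.exp((-a + 2j * np.pi * f) * t)
        for A, a, f in zip([1.0, 0.6], true_damping, true_freq))
freq, damping = shsvd_bin(s, fs, L, m=8)
```

Here the full signal has 12800 samples, but the SVD is computed on a 101 x 100 matrix built from the 200 frames of bin 8, which is the complexity reduction the sub-band scheme aims for. With a real recording and noise, the recovered values would only be approximate.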
The damping and frequency of the original pole $V_i$ of signal $y$ relate to those of the sub-band estimate $\hat{V}_i^{(m)}$ as

$$\omega_i = \hat{\omega}_i F_s^{(m)} + \text{bin base frequency} \tag{10}$$

$$a_i = \hat{a}_i F_s^{(m)} \tag{11}$$

Using Least Squares, we can finally determine the phases $\phi_i$ and initial amplitudes $A_i$. This algorithm is called sub-band HSVD (sHSVD).

Figure 2: We reduce the complexity of the calculation by building an N-band representation and calculating the HSVD on each subband separately. As the number of partials is smaller than the number of bands, we only calculate the poles for the relevant bands.

For the estimation of the parameters of the dual-polarization model of figure 1, we choose the window length of the STFT large enough such that only one pair of partials (one partial for each mode) can be found in one frequency bin. For each bin that contains partials, we use the entire time-frame sequence as the input for the sHSVD algorithm but calculate only two poles, one for each polarization mode. We obtain the estimates for $\omega_i$ and $a_i$. The lengths of the delay lines $L_1$ and $L_2$ can easily be found given the knowledge of $\omega_i$. Once these lengths are known, the reflection coefficient $R$ is found as

$$R_i = e^{-(\cdots)} \tag{12}$$

In the end, we know the reflection coefficients for both polarizations, at every partial. This complete set of reflection coefficients enables us to fit a linear filter that approximates the frequency dependence. In the next section we apply this algorithm to estimate the model parameters for a synthetic signal and for a guitar recording.

3 Results

A first test consisted of estimating the parameters of a known system. To synthesize the test signal, we used a dual-polarization waveguide model in which one polarization had a frequency-independent reflection coefficient and the other polarization had a lowpass reflection characteristic. White noise was added to simulate recording imperfections. The results can be seen in figure 3. For each partial, we get a pair of

frequencies ($f_1$, $f_2$). These are then assigned to one polarization mode (horizontal or vertical) by grouping all the highest and all the lowest frequencies. This way, we get two series of frequencies that almost satisfy $f_1, 2f_1, 3f_1, \ldots$ and $f_2, 2f_2, 3f_2, \ldots$. The deviation of the measured series from the theoretical series is an indication of the inharmonic behavior. Figure 3 shows that our algorithm is capable of accurately separating the characteristics of the two polarizations.

[Figure 3 plot: reflection coefficient versus frequency (Hz); legend: estimation 1st polarization, analytic R=0.985, estimation 2nd polarization, analytic R=0.995 + lowpass]
Figure 3: Estimated and analytic reflection coefficients for a dual-polarization waveguide model. The estimation is the result of the sHSVD analysis of a test signal where the first polarization had a reflection coefficient R=0.985 (the slight decrease at higher frequencies is caused by our implementation of fractional delay) and the second polarization had a reflection coefficient R=0.995 combined with a first-order lowpass filter (-3 dB at 7 kHz).

The second test was done on a recording of a nylon string guitar. The goal is to determine the parameters of the model shown in figure 1. Plotting the frequency difference of the partials, relative to the number of the partial, gives us information on the two polarization modes (figure 4). The fundamental frequencies differ by 0.082 Hz, which is confirmed by the audible beating in the higher partials. This leads to the delay line lengths $L_1$ and $L_2$, and the fractional delays. The inharmonicity was determined by calculating the difference between the ideal harmonic series and the measured frequencies, as can be seen in figure 5.
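The grouping of the frequency pairs and the quantities derived from it can be sketched in Python as follows. Several choices here are assumptions made for illustration and are not taken from the paper: the pairs are ordered by partial number starting at 1 with no partial missing, the fundamental of each series is estimated as the mean of $f_k/k$, and the total loop delay is taken as $F_s/f_0$ samples. The names `split_polarizations` and `analyse_pairs` are hypothetical, and the input data is synthetic.

```python
import numpy as np

def split_polarizations(pairs):
    """Sort each (f1, f2) pair and return the lower and upper series."""
    pairs = np.sort(np.asarray(pairs, dtype=float), axis=1)
    return pairs[:, 0], pairs[:, 1]

def analyse_pairs(pairs, fs):
    low, high = split_polarizations(pairs)
    k = np.arange(1, len(low) + 1)                # partial numbers
    report = {"normalised_difference": (high - low) / k}
    for name, series in (("low", low), ("high", high)):
        f0 = np.mean(series / k)                  # assumed f0 estimate
        loop = fs / f0                            # assumed loop delay (samples)
        report[name] = {
            "f0": f0,
            "integer_delay": int(np.floor(loop)),
            "fractional_delay": loop - np.floor(loop),
            "deviation": series - k * f0,         # inharmonicity indicator
        }
    return report

# Synthetic, perfectly harmonic pairs with fundamentals 0.082 Hz apart.
k = np.arange(1, 6)
pairs = np.column_stack([k * 82.482, k * 82.40])  # deliberately unsorted
low, high = split_polarizations(pairs)
report = analyse_pairs(pairs, fs=44100.0)
```

On measured data the `deviation` entries would be non-zero and grow with the partial number when the string is inharmonic, and the `normalised_difference` values would scatter around the difference between the two fundamentals.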
The reflection coefficients for all the partials are shown in figure 6. We then determined the reflection filters $\mathcal{H}_{L1}$ and $\mathcal{H}_{L2}$ of the model. The lowpass and allpass parts of the reflection filter were determined such that a good fit was obtained for both the required reflection coefficient (which follows from the amplitude of $\mathcal{H}_{L1}$ and $\mathcal{H}_{L2}$) and the inharmonicity (which follows from the phase).

Figure 4: The frequency difference between the two polarizations divided by the number of the partial in the harmonic series. From this graph, we can conclude that the two polarizations have fundamental frequencies with a mean difference of 0.082 Hz.

[Figure 5 plot: horizontal axis frequency (Hz); legend: first polarization, second polarization]
Figure 5: The frequency difference between the ideal harmonic series and the extracted frequencies.

[Figure 6 plot: reflection coefficient versus frequency (Hz); legend: first polarisation, second polarisation]
Figure 6: Estimated reflection coefficients for a dual-polarization waveguide model. The estimation is based on a recording of a low E on a Yamaha C70 nylon string guitar with AKG C4000B microphones ($F_s$ = 44.1 kHz, 16 bit).

The model was excited with an impulse (comb-filtered to approximate the plucking position) and its output was filtered using the measured impulse response of the guitar body at the bridge. Figure 7 compares the time evolution of the fundamental of the recorded signal with the output of the dual-polarization model. Figure 8 shows waterfall plots of the STFT of both signals. The differences at higher frequencies are probably due

to differences in the excitations and the absence of coupling in the model.

[Figure 7 plot: amplitude versus time (s); legend: recording, simulation]
Figure 7: Amplitude of the fundamental of the recorded guitar signal and simulation of the model shown in figure 1 with the parameters obtained with sHSVD. The model was excited with an impulse and its output was filtered using the impulse response of the guitar body.

[Figure 8 plots: axes time (s) and frequency (Hz)]
Figure 8: Short-time Fourier spectra of (top) the recorded signal and (bottom) the output of the calibrated dual-polarization model.

5 Acknowledgement

Axel Nackaerts is a Research Assistant with the I.W.T. at the Katholieke Universiteit Leuven. Bart De Moor is Full Professor at the K.U. Leuven. Rudy Lauwereins is Full Professor at the K.U. Leuven. This research work was carried out at the ESAT laboratory of the Katholieke Universiteit Leuven. This work is supported by the Flemish Government (Research Council KUL (GOA Mefisto-666, IDO), FWO (G.0256.97, G.0240.99, G.0115.01, Research communities ICCoS, ANMMM, PhD and postdoc grants), Bil.Int. Research Program, IWT (Eureka-1562 (Synopsis), Eureka-2063 (Impact), Eureka-2419 (FLiTE), STWW-Genprom, IWT project Soft4s, PhD grants)), Federal State (IUAP IV-02, IUAP IV-24, Durable development MD/01/024), EU (TMR-Alapades, TMR-Ernsi, TMR-Niconet), Industrial contract research (ISMC, Data4s, Electrabel, Verhaert, Laborelec) and Texas Instruments (TI Elite program). ESAT-K.U.Leuven is a member of DSP Valley. The scientific responsibility is assumed by the authors.

References

Bank, B. (2000, May). Physics-based sound synthesis of the piano. Master's thesis, Budapest University of Technology and Economics. Published as Report 54 of Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, http://www.acoustics.hut.fi/~bbank/thesis.html.

Halder, B. and T. Kailath (1997, Feb). Efficient estimation of closely spaced sinusoidal frequencies using subspace-based methods. IEEE Signal Processing Letters 4(2), 49-51.

Karjalainen, M., V. Välimäki, and T. Tolonen (1998, Fall). Plucked-string models: From the Karplus-Strong algorithm to digital waveguides and beyond. Computer Music Journal 22(3), 17-32.

Smith, J. O. (1992). Physical modeling using digital waveguides. Computer Music Journal 16(4), 74-91.

Tolonen, T., V. Välimäki, and M. Karjalainen (1998, June 8-11). A new sound synthesis structure for modeling the coupling of guitar strings. In Proc. IEEE Nordic Signal Processing Symp. (NORSIG'98), pp. 205-208. Vigsø, Denmark.

Välimäki, V., T. Tolonen, and M. Karjalainen (1999, March 15-19). Plucked-string synthesis algorithms with tension modulation nonlinearity. In Proceedings of the IEEE International Conference in Acoustics, Speech, and Signal Processing (ICASSP 99), Volume 2, pp. 977-980. Phoenix, Arizona, http://www.acoustics.hut.fi/~vpv/publications/icassp99-tm.htm.

Van Huffel, S., H. Chen, C. Decanniere, and P. Van Hecke (1994). Algorithm for time-domain NMR data fitting based on Total Least Squares. J. Magn. Reson. A110, 228-237.

Weinreich, G. (1977, December). Coupled piano strings. Journal of the Ac. Soc. of Am. 62(6), 1474-1484.