SUBJECTIVE EVALUATION OF THE INHARMONICITY OF SYNTHETIC PIANO TONES

Francesco Scalcon (1), Davide Rocchesso (2), and Gianpaolo Borin (3)

(1) Generalmusic S.p.A. - via delle Rose, 12 - 47048 S. Giovanni in Marignano - Italy - francescos@generalmusic.com
(2) Università di Verona - Dipartimento Scientifico e Tecnologico - Strada Le Grazie - 37134 Verona - Italy - rocchesso@sci.univr.it
(3) Università di Padova - Centro di Sonologia Computazionale - via S.Francesco, 11 - 35121 Padova - Italy - borin@dei.unipd.it

ABSTRACT

The influence of accurate reproduction of inharmonicity on the perceived quality of piano tones was investigated. Acoustic piano tones were resynthesized while varying the bandwidth within which the partials are correctly positioned. Outside that bandwidth the partials were kept at a constant distance from one another. Cutoff frequencies for different pitches could be established, and these results are applicable to the design of dispersive resonators in sound synthesis by physical modeling.

1. INTRODUCTION

The strings of acoustic pianos are dispersive and, as a consequence, piano tones are inharmonic. While the physical behavior of stiff strings is well understood [Morse 91, Podlesak and Lee 88], the perceived effects of inharmonicity on the quality of piano tones have not been studied thoroughly. Podlesak and Lee pointed out the importance of pitch glides due to inharmonicity in low bass tones such as the A0 [Podlesak and Lee 89]. However, our experience showed that inharmonicity plays an important role during the sound decay, even for notes much higher than A0, whose glides are too short to be discriminated. After many discussions with pianists, we realized that the main problem with harmonic synthetic piano tones is a lack of liveliness in the decay stage. Therefore, we decided to focus on the decay of piano tones.

Since our curiosity about the aural perception of inharmonicity arose after intensive experimentation on physical modeling of the piano [Borin et al.
97], we restricted the field of investigation even further, by taking the needs of modeling into account. In physical modeling of strings it is customary to lump losses and dispersion, and to simulate them by means of a pair of linear filters. Prior work showed that the allpass filter used for dispersion simulation has to be of very high order if the dispersion curve of low-pitched notes is to be approximated with good accuracy. High-order allpass filters are difficult to design and expensive to implement, thus calling for psychoacoustically-driven design criteria. The first fundamental question is whether there is a preferred bandwidth for dispersion approximation, i.e. a cutoff frequency above which it does not make sense to approximate the correct positioning of partials. This paper deals with this question and gives an answer motivated by a psychoacoustical experiment.

2. EXPERIMENT

2.1. Stimuli

The stimuli were produced by means of additive synthesis, so that detailed control over the position of each partial is possible. Additive synthesis was driven by the results of an analysis of the sounds produced by a Schulze-Pollmann 190-F acoustic piano. Namely, the notes C1, C2, C3, C4, and C5 were played fortissimo in an anechoic chamber and recorded with a microphone positioned where the head of the player is supposed to be. The analysis was performed by using the Short-Time Fourier Transform and isolating the partials from the "stochastic" part of the sounds. Instantaneous frequency and amplitude data for all the significant partials were extracted and stored for use during resynthesis.
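As an illustration of the resynthesis step, the following minimal sketch sums sinusoidal partials driven by per-frame frequency and amplitude data. It is not the authors' implementation: the function name `resynthesize`, the frame layout (one `(frequency, amplitude)` pair every `hop` samples), and the linear interpolation between frames are all assumptions made for this example.

```python
import math

def resynthesize(tracks, hop, sample_rate=48000):
    """Additive resynthesis from partial tracks.

    tracks: list of partials; each partial is a list of (freq_hz, amplitude)
            frames, one frame every `hop` samples (hypothetical layout).
    Returns a list of samples covering the span between the first and last frame.
    """
    n_frames = min(len(track) for track in tracks)
    n_samples = (n_frames - 1) * hop
    out = [0.0] * n_samples
    for track in tracks:
        phase = 0.0
        for i in range(n_samples):
            frame, offset = divmod(i, hop)
            f_a, a_a = track[frame]
            f_b, a_b = track[frame + 1]
            x = offset / hop
            # Linear interpolation of frequency and amplitude between frames.
            freq = f_a + (f_b - f_a) * x
            amp = a_a + (a_b - a_a) * x
            out[i] += amp * math.sin(phase)
            # Accumulate phase so that frequency changes stay click-free.
            phase += 2.0 * math.pi * freq / sample_rate
    return out

# Usage: one steady 1 kHz partial, three frames, 480 samples per frame.
samples = resynthesize([[(1000.0, 1.0)] * 3], hop=480)
```

Because every partial is generated separately, its frequency track can be edited (smoothed, replaced by a constant, or moved to a different position) before the sum is formed, which is the kind of control the experiment relies on.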

Five sets of tones were synthesized:

CnADD: Resynthesis without modification of the extracted data.
CnNFM: Resynthesis after elimination of frequency fluctuations of partials.
CnBT: Resynthesis without frequency fluctuations and with positions of partials set according to the theoretical curve [Podlesak and Lee 88] derived from measured physical properties of the strings.
CnBE: Resynthesis without frequency fluctuations and with positions of partials set according to the theoretical curve adapted to fit the measured partial frequencies.
Cnf: Resynthesis equal to CnBE up to a cutoff frequency f. The partials higher than f are positioned at points regularly spaced in frequency. A value of f > 7700 Hz means that all the partials follow the dispersed theoretical positioning.

The only exception to these rules is found for the C1, where the measured partial positions do not follow the theoretical curve very closely. A good fit for the C1 was found as given by the physics-driven theoretical positioning, and the resulting tone was named C1BT. Two other tones were synthesized using an inharmonicity coefficient increased by 10% and 20% respectively. In order to improve the frequency resolution for C1 while keeping reasonably-sized time windows (4096 points), we had to reduce the sampling rate from the nominal Fs = 48000 Hz to Fs = 16742 Hz.

As an example, fig. 1.a shows the measured percentage dispersion (i.e. deviation from harmonic position) for partials of the note C2 and two theoretical curves obtained by using the inharmonicity coefficient derived from physical data and from point fitting, respectively. Fig. 1.b shows the dispersion of partials of resynthesized tones C2f; the cutoff points are clearly visible as the partial number where the dotted curves stop increasing. The cutoff frequencies f were spaced according to critical bands, at intervals of 2 barks.
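The positioning rule for the Cnf tones can be sketched as follows. Several details are assumptions made only for this example and are not taken from the paper: the dispersed positions use the common stiff-string formula f_k = k f0 sqrt(1 + B k^2) in place of the theoretical curve of [Podlesak and Lee 88], the regular spacing above the cutoff is taken equal to the distance between the last two dispersed partials, the bark scale is approximated with Traunmueller's formula, and the values of `f0` and `b` are hypothetical.

```python
import math

def hz_to_bark(f_hz):
    """Traunmueller's approximation of the bark scale (assumed mapping)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def bark_to_hz(z):
    """Inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)

def dispersed_partial(k, f0, b):
    """Stiff-string position of partial k (assumed theoretical curve)."""
    return k * f0 * math.sqrt(1.0 + b * k * k)

def cnf_partials(f0, b, cutoff_hz, n_partials):
    """Dispersed partials up to cutoff_hz, regularly spaced partials above it."""
    freqs = []
    spacing = None
    for k in range(1, n_partials + 1):
        if spacing is None:
            f = dispersed_partial(k, f0, b)
            # Always keep the first two dispersed partials so that a
            # spacing can be measured.
            if f <= cutoff_hz or k < 3:
                freqs.append(f)
                continue
            # Assumption: reuse the last dispersed spacing above the cutoff.
            spacing = freqs[-1] - freqs[-2]
        freqs.append(freqs[-1] + spacing)
    return freqs

def percent_dispersion(freqs, f0):
    """Percent deviation of each partial from its harmonic position k * f0."""
    return [100.0 * (f / (k * f0) - 1.0) for k, f in enumerate(freqs, start=1)]

# Hypothetical C2-like string; cutoffs spaced 2 barks apart.
f0, b = 65.41, 1.0e-4
cutoffs = [bark_to_hz(z) for z in range(8, 21, 2)]
tones = {round(c): cnf_partials(f0, b, c, 60) for c in cutoffs}
```

Other choices for the spacing above the cutoff are possible, and each would give a differently shaped dispersion curve beyond the cutoff partial.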
A potential bias for the experiment might occur if no limit is put on the bandwidth of the synthesized tones. In fact, differently-dispersed versions of the same note with the same number of partials can have different bandwidths, and this can mislead the listener, who might rank two tones differently simply because they have different bandwidths. Therefore, a constant upper limit (represented by the dashed line in fig. 1.b) was put on the resynthesized partials, and no partial with a frequency higher than that limit was reproduced.

Figure 1: (a) Measured percent dispersion for partials of the note C2 and two theoretical curves obtained by using physical data and from point fitting, respectively. (b) Dispersion of partials of resynthesized tones C2f.

2.2. Method

The tones described in section 2.1 were synthesized and digitally recorded on a Digital Audio Tape. All the subjects were requested to listen to the tape through headphones.

A preliminary test was conducted in order to ascertain the accuracy of the synthesized tones in reproducing the perceived effects of inharmonicity. For each note, the recorded tone and the three stimuli CnADD, CnNFM, and CnBE were played in sequence. These tones represent increasing degrees of simplification in the reproduction of acoustic piano tones. The subjects were requested to focus on the decay stage and to describe any differences in timbre. The answers confirmed that the simplifications do not alter the perceived effects of inharmonicity.

Another pretest was needed to ascertain whether the choice of the inharmonicity coefficient is critical or not. To this end the stimuli CnBT and CnBE were played one right after the other, and the subjects had to describe perceived differences in the decay stage of the sound. The answers indicated that there are only small differences between different dispersion curves, thus showing that the choice of the inharmonicity coefficient is not critical.

The main experiment was conducted by playing the tones Cnf in sequence, using a random permutation of cutoff frequencies f. The subjects had to locate each tone on a (linear) scale of perceived naturalness, ranging from good/natural to unacceptable/unnatural. This operation consisted of drawing a cross along a 61 mm-long segment. The subjects were advised to focus on the decay stage of the sounds and to neglect possible differences in the attacks. These subtle differences were difficult to eliminate as they are a byproduct of resynthesis. The chosen subjects were all Italian-speaking musicians accustomed to analyzing timbre in all its multiple facets. For each note, each subject had to evaluate all the resyntheses under different cutoff frequencies. Eight subjects were used for all the notes except for C1, which was synthesized and tested a few weeks later. Only six subjects could be used for note C1.

2.3. Results

All the subjects reported that the task of the main experiment was not easy. Nevertheless, they said that it was possible to rank the sounds with a good degree of certainty. In figures 2, 3.a and 3.b we report the means over all subjects and the standard deviations for the various cutoff frequencies of every note. An evaluation close to 0 indicates a sound decay perceived as natural, while a value close to 61 indicates an unnatural sound decay. All the notes, except the C5, give a clear indication of an increasing mean perceived quality versus increasing cutoff frequency.

The reliability of the reported evidence is confirmed by the Analysis of Variance, reported in Table 1. It shows that the null hypothesis (i.e. the cutoff frequency does not affect the perceived quality of timbre) is fulfilled with very small probability for the notes C2 and C3.
For notes C1 and C4 the null hypothesis is fulfilled with about 20% probability, and for note C5 this hypothesis is quite likely, thus indicating that the reported behavior of C5 does not show a significant dependence on cutoff frequency. The note C1 was the most difficult to analyze and resynthesize, due to its low pitch, and the group of subjects was smaller than for the other notes. If these difficulties were overcome, we are confident that the reliability of results for note C1 would increase significantly.

Note   df   F-ratio   p-value   crit. F-ratio (0.05)
C1     4     1.7523   0.1702    2.7587
C2     4    10.1846   0.0000    2.6415
C3     4    10.4444   0.0000    2.6415
C4     5     1.4941   0.2123    2.4377
C5     5     0.9967   0.4315    2.4377

Table 1: Analysis of variance for the perceived quality of re-synthesized tones.

The experiment confirms a phenomenon that has been intuitively known for a long time: the effect of inharmonicity matters mainly in the lower half of the keyboard. For notes such as the C5 the correct positioning of partials is not essential for a good reproduction.

Figures 2, 3.a and 3.b depict graphs of measured mean ratings versus cutoff frequency. For the tests where the ANOVA gives a low (< 21%) probability that the null hypothesis holds, the curves are monotonic, thus indicating that better results are obtained for higher numbers of accurately-positioned partials. If we set the threshold of acceptability at the middle of the vertical scale, the frequency where the curves cross the middle horizontal line can be chosen as a bound of correct positioning of partials. In other words, when synthesizing piano-like tones, we have to position with good accuracy at least the partials which are below that point. Even though the bandwidth of perceived inharmonicity is smaller for lower notes, many more partials have to be positioned properly for low bass notes.
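The threshold-crossing rule described above can be sketched as a linear interpolation between neighbouring points of a mean-rating curve. This is only an illustration: the function name `crossing_frequency` is hypothetical, interpolating linearly in frequency is an assumption (the paper does not state how the crossing was read from the graphs), and the ratings in the usage lines are made-up placeholders, not measured data.

```python
def crossing_frequency(cutoffs_hz, mean_ratings, threshold=30.5):
    """Frequency at which the rating curve first crosses `threshold`.

    cutoffs_hz and mean_ratings are parallel lists sorted by cutoff frequency.
    The default threshold is the middle of the 0-61 rating scale.
    Returns None when the curve never crosses the threshold.
    """
    points = list(zip(cutoffs_hz, mean_ratings))
    for (f_lo, r_lo), (f_hi, r_hi) in zip(points, points[1:]):
        crosses = (r_lo - threshold) * (r_hi - threshold) <= 0
        if crosses and r_lo != r_hi:
            # Linear interpolation between the two neighbouring points.
            return f_lo + (threshold - r_lo) * (f_hi - f_lo) / (r_hi - r_lo)
    return None

# Placeholder ratings (NOT the paper's data): low values mean "natural".
bound = crossing_frequency([500.0, 1000.0, 2000.0, 4000.0],
                           [50.0, 40.0, 21.0, 10.0])
```

A curve that stays entirely on one side of the middle of the scale yields `None`, in which case a bound has to be chosen by other means.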
According to our experiment, the note C1, with about 1700 Hz as the bandwidth of perceived inharmonicity, requires about 50 partial frequencies to be approximated. On the other hand, for the note C3 about 30 partials will suffice.

The information provided by this experiment is valuable for designing the allpass filters needed by physical models of dispersive resonators. The effectiveness of the design is significantly improved if the design procedure refines its approximations in the frequency areas where high precision is needed.

Figure 2: Graphs of measured mean ratings (mean perceived naturalness) versus cutoff frequency. Notes C1, C2, and C3.

Figure 3: (a) and (b): Graphs of measured mean ratings versus cutoff frequency. Notes C4 and C5. (c) Bandwidth of perceived inharmonicity as a function of fundamental frequency.

3. SUMMARY

We have investigated the ability of musicians to rank artificial piano tones according to the accuracy of their inharmonic distribution of partials. We have found that, for any note, there is an upper bound above which the partials do not need to follow the prescribed distribution. The main result of this experiment can be summarized as a sparse set of points giving the bandwidth of perceived inharmonicity of piano tones as a function of fundamental pitch. Fig. 3.c displays these findings, which can be used to drive the allpass filter design procedures toward zones of better significance. The first three points of fig. 3.c were drawn on the basis of the frequencies where the corresponding curves of fig. 2 cross the middle of the vertical scale. The fourth point was selected qualitatively, since the corresponding curve of fig. 3.a was entirely below the middle of the vertical scale.

3.1. Acknowledgment

This research has been partially developed under the auspices of Generalmusic S.p.A.

4. REFERENCES

[Morse 91] P. M. Morse, Vibration and Sound. New York: American Institute of Physics for the Acoustical Society of America, 1991. 1st ed. 1936, 2nd ed. 1948.

[Podlesak and Lee 88] M. Podlesak and A. R. Lee, "Dispersion of Waves in Piano Strings," J. Acoustical Soc. of America, vol. 83, pp. 305-317, Jan. 1988.

[Podlesak and Lee 89] M. Podlesak and A. R. Lee, "Effect of Inharmonicity on the Aural Perception of Initial Transients in Low Bass Tones," Acustica, vol. 68, pp. 61-66, Jan. 1989.

[Borin et al. 97] G. Borin, D. Rocchesso, and F. Scalcon, "A Physical Piano Model for Music Performance," in Proc. International Computer Music Conference, (Thessaloniki, Greece), pp. 350-353, ICMA, Sept. 1997.