IMPORTANCE OF INHARMONICITY IN THE ACOUSTIC GUITAR
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License. Please contact firstname.lastname@example.org to use this work in a way not covered by the license.
Hanna Järveläinen and Matti Karjalainen
Helsinki University of Technology
Laboratory of Acoustics and Audio Signal Processing

ABSTRACT

Audibility of inharmonicity in realistic acoustic guitar tones was studied through formal listening experiments. Test tones were synthesized using an FZ-ARMA model of acoustic guitar sounds, which allows accurate control of the inharmonic partials. Inharmonicity was detected fairly easily for the lowest strings if the plucking transient was left out of the sound, but it was much harder to detect in the full sound with the transient. Detection thresholds were measured for two pitches in the low range of the guitar. Mean thresholds were close to, though above, typical amounts of inharmonicity in the guitar. Implications for digital sound synthesis are discussed.

1. INTRODUCTION

Inharmonicity in string instruments is an interesting question for sound synthesis. In the low register of the piano, inharmonicity adds warmth to the sound (Fletcher, Blackham and Stratton 1962), and accurate simulation of inharmonicity in frequency bands up to 1.5-4 kHz is considered important in the synthesis of piano tones (Rocchesso and Scalcon 1999). Whether inharmonicity affects sound quality in the acoustic guitar is the main question of the current study.

In real strings, stiffness contributes to the restoring force along with tension, which results in slightly inharmonic partials. The partial frequencies $f_n$ are given by (Fletcher et al. 1962):

$$f_n = n f_0 \sqrt{1 + B n^2} \qquad (1)$$

$$B = \frac{\pi^3 Q d^4}{64 l^2 T} \qquad (2)$$

where $n$ is the partial number, $f_0$ is the fundamental frequency of the ideal (perfectly flexible) string, and $B$ is the inharmonicity coefficient, composed of Young's modulus $Q$, diameter $d$, length $l$ and tension $T$ of the string. It is seen from (1) that higher partials are stretched relatively more than lower ones. Thus even a small value of $B$ can produce a prominent amount of mistuning in the higher partials of a low-pitched tone.
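As a rough numerical illustration of Eq. (1), the sketch below computes partial frequencies and their deviation from the harmonic positions in cents. The function names are illustrative only; the values f0 = 165 Hz and B = 0.00007 are those listed for string 6, fret 12 in Table 1.

```python
import math

def partial_frequency(n, f0, B):
    """Frequency of the n-th partial of a stiff string, Eq. (1)."""
    return n * f0 * math.sqrt(1.0 + B * n * n)

def stretch_in_cents(n, B):
    """Deviation of partial n from its harmonic position n*f0, in cents."""
    return 1200.0 * math.log2(math.sqrt(1.0 + B * n * n))

# Example values taken from Table 1 (string 6, fret 12).
f0, B = 165.0, 0.00007
for n in (1, 5, 10, 20):
    print(n, round(partial_frequency(n, f0, B), 2), round(stretch_in_cents(n, B), 2))
```

With these values the first partial is shifted by well under a tenth of a cent, while the 20th partial lies roughly 24 cents above its harmonic position, which is the relative stretching of the higher partials that Eq. (1) predicts.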
Inharmonicity coefficients were measured from recorded guitar tones for the open strings and the first 17 frets. The inharmonicity coefficients increased with fundamental frequency and fret number. Audibility thresholds were measured in (Järveläinen, Välimäki and Karjalainen 2001) for synthetic string-instrument-like tones as a function of fundamental frequency. This threshold is presented in Fig. 1 together with the measured inharmonicity coefficients for real guitar tones.

[Figure 1: threshold curve and measured B values for strings 1-6, plotted against fundamental frequency [Hz] on logarithmic axes.]
Figure 1. Audibility thresholds from (Järveläinen et al. 2001) (solid line) versus inharmonicity coefficients B for frets 0-17 of each string as a function of fundamental frequency.

In all tones inharmonicity clearly exceeds the audibility threshold. Relative to B at the respective audibility threshold, the measured coefficients ranged from 2.9 to 4.6 times for string 1, 9.7 to 17.1 for string 2, 34.5 to 58.3 for string 3, 7.8 to 15.4 for string 4, 13.1 to 22.1 for string 5, and 34.4 to 77.6 for string 6. Thus inharmonicity should be most clearly audible for strings 3 and 6 and probably also for the other strings.

Whereas the above-mentioned study used generic string sounds with identical decay times for each harmonic, the present study focuses on the detailed sounds of a high-quality classical acoustic guitar. Therefore, the previous results may not fully predict the current case.

Inharmonicity is perceived in many ways. Slightly inharmonic tones differ from the respective harmonic tones by a timbral effect. Increasing inharmonicity increases the pitch of the tone. In highly inharmonic tones some of the partials may segregate from the complex and be heard separately (Schneider 2000). In string instruments, inharmonicity causes the timbral effect and possibly a pitch effect.
No pitch change was observed in our test stimuli in initial testing, however.

2. MANIPULATION OF HARMONICITY

To modify the inharmonicity of a guitar sound, the frequencies of the partials have to be modified without changing any other perceptual features of the sound. This is a demanding task and should be based on careful parametric modeling of recorded sounds. One method that could be used is sinusoidal plus transient modeling. Here we use FZ-ARMA modeling instead, because it has a closer correspondence to the physical basis of the instrument sounds. Each partial is modeled as a set of modes that can be scaled in frequency for resynthesis, and the original residual is added to yield a new version of the sound with modified harmonic structure.

Figure 2. Spectrum of plucked string sound of a classical acoustic guitar, open string B4. Spectrum zoomed to a single partial is shown in the subplot.

2.1. High-Resolution Mode Analysis

If the response of a string to excitation were an AR (autoregressive) process, it could be analyzed by straightforward AR modeling techniques, such as linear prediction (LP) (Makhoul 1975), resulting in an all-pole filter. This is not manageable in practice because pole positions close to the unit circle make the analysis numerically too sensitive, and the allocation of the number of poles for each partial of the string cannot be easily controlled. It is important to notice that the two polarizations of string vibration, with slightly different frequencies, make each partial a sum of two decaying sinusoids, exhibiting beating or two-stage decay of the envelope (Fletcher et al. 1962) or two separate spectral peaks as illustrated in Fig. 2. This requires at least two pole pairs to represent one partial.

Since the string response to excitation is not a minimum-phase signal, AR modeling is the wrong tool even in theory. ARMA (autoregressive moving average) models are capable of matching such responses, but because they are solved iteratively, models of order higher than about 20-200 may not converge to a stable and useful result.

2.2. FZ-ARMA analysis

To avoid the problems in resolution and computational precision discussed above, we have developed a subband technique called FZ-ARMA (frequency-zooming ARMA) analysis (Karjalainen, Esquef, Antsalo and Välimäki 2002). Instead of a single model that is global over the entire frequency range, the signal is pole-zero modeled in subbands, i.e., by zooming to a small enough band at a time, thus allowing a filter order that is low enough and individually selectable for each subband. This helps in resolving resonant modes that are very sharp or close to each other in frequency.

The FZ-ARMA analysis consists of the following steps.

(i) Select a frequency range of interest, e.g. a frequency region a few Hz wide around the spectral peak of a partial.

(ii) Modulate the target signal (shift in frequency by multiplying with a complex exponential) to place the center of the frequency band, defined in (i), at the origin of the frequency axis by mapping

$$h_m(n) = e^{-j\Omega_m n} h(n) \qquad (3)$$

where $h(n)$ is the original sampled signal, $h_m(n)$ the down-modulated one, $n$ is the sample index, and $\Omega_m$ the (normalized) modulation frequency. This rotates the poles of the transfer function by $z_{i,\mathrm{rot}} = z_i e^{-j\Omega_m}$.

(iii) Apply lowpass filtering to the complex-valued modulated signal in order to attenuate its spectral content outside the zoomed band of interest.

(iv) Decimate (down-sample) the lowpass filtered signal according to its new bandwidth. This zooms the system poles $z_i$ by

$$z_{i,\mathrm{zoom}} = z_{i,\mathrm{rot}}^{K_{\mathrm{zoom}}} = z_i^{K_{\mathrm{zoom}}} e^{-j\Omega_m K_{\mathrm{zoom}}} \qquad (4)$$

where $K_{\mathrm{zoom}}$ is the zooming factor and $z_{i,\mathrm{zoom}}$ are the mapped poles in the zoomed frequency domain.

(v) Estimate an ARMA (pole-zero) model for the obtained decimated signal in the zoomed frequency domain. For this we have applied the iterative Steiglitz-McBride algorithm (function stmcb.m in Matlab).

(vi) Map the obtained poles back to the original frequency domain by operations inverse to those presented above. Zeros cannot be utilized as easily, so we do not use them in this study for the final modeling.
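The chain of steps (ii)-(vi) can be illustrated with a heavily simplified sketch. It is not the authors' implementation: the Steiglitz-McBride ARMA fit is replaced by a first-order least-squares linear predictor that recovers a single pole, the lowpass filter is a plain moving average, and the function name `fz_ar1_pole` is hypothetical. The sign convention (multiplying by exp(-j*omega_m*n), with omega_m the band centre in radians per sample) is also an assumption.

```python
import cmath
import math

def fz_ar1_pole(h, omega_m, k_zoom):
    """Estimate one pole of h near normalized frequency omega_m (rad/sample)."""
    # (ii) Down-modulate: shift the band centre to the origin of the frequency axis.
    hm = [x * cmath.exp(-1j * omega_m * n) for n, x in enumerate(h)]
    # (iii)+(iv) Crude lowpass (block average over k_zoom samples) and decimation.
    dec = [sum(hm[i:i + k_zoom]) / k_zoom
           for i in range(0, len(hm) - k_zoom + 1, k_zoom)]
    # (v) First-order least-squares linear prediction in the zoomed domain.
    #     This is a stand-in for the Steiglitz-McBride ARMA fit, not that algorithm.
    num = sum(dec[i + 1] * dec[i].conjugate() for i in range(len(dec) - 1))
    den = sum(abs(dec[i]) ** 2 for i in range(len(dec) - 1))
    z_zoom = num / den
    # (vi) Map back: undo the decimation (k_zoom-th root) and the rotation.
    radius = abs(z_zoom) ** (1.0 / k_zoom)
    angle = cmath.phase(z_zoom) / k_zoom + omega_m
    return cmath.rect(radius, angle)

# Usage on a synthetic decaying sinusoid (all values are made up for the demo).
fs = 8000.0
f_true, tau = 392.7, 0.5
r_true = math.exp(-1.0 / (fs * tau))
tone = [r_true ** n * math.cos(2 * math.pi * f_true * n / fs) for n in range(8000)]
pole = fz_ar1_pole(tone, 2 * math.pi * 392.0 / fs, 100)
print("estimated frequency [Hz]:", cmath.phase(pole) * fs / (2 * math.pi))
print("estimated pole radius:", abs(pole))
```

A single pole cannot represent the two polarizations of a real guitar partial, so this sketch only demonstrates the modulation, decimation and pole mapping, not the full pole-zero modeling described in the text.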
There may also be poles that correspond to the truncated frequency band edges, and these need to be excluded. Therefore only the relevant poles are directly useful parameters.

When applying pole-zero modeling, the number of poles has to be selected appropriately according to the characteristics of the problem. The number of poles should correspond to the order of the resonator to be modeled. For example, a partial ('harmonic') of string vibration is composed of two polarizations, so the partial may exhibit more than one peak in the frequency domain and beating or two-stage decay in the time-domain envelope. Figure 2 depicts the spectrum of a plucked guitar sound with a subplot zoomed into one partial having two spectral peaks.

A proper number of zeros is needed in FZ-ARMA modeling so that the model can fit the phases of the decaying sinusoids as well as the initial transient. Often this number is not very critical, and it can be somewhat higher than the number of poles. The zooming factor $K_{\mathrm{zoom}}$ can be selected so that the analysis bandwidth contains most of the energy of the resonances to be modeled, keeping the order (number of poles) manageable.
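One way to picture the selection of relevant poles is to convert each pole to a modal frequency and decay time and to keep only stable poles that fall well inside the analysis band. The sketch below is a hypothetical selection rule, not the procedure used in the study; the names `pole_to_mode` and `select_poles` and the `half_width` margin are assumptions made for illustration.

```python
import cmath
import math

def pole_to_mode(z, fs):
    """Convert a z-plane pole to (frequency in Hz, decay time constant in s)."""
    freq = cmath.phase(z) * fs / (2.0 * math.pi)
    radius = abs(z)
    # A pole on or outside the unit circle does not decay; report infinity.
    tau = -1.0 / (fs * math.log(radius)) if radius < 1.0 else math.inf
    return freq, tau

def select_poles(poles, fs, f_center, half_width):
    """Keep stable poles whose frequency lies within half_width of f_center."""
    kept = []
    for z in poles:
        freq, tau = pole_to_mode(z, fs)
        if abs(z) < 1.0 and abs(freq - f_center) < half_width:
            kept.append((freq, tau))
    return sorted(kept)

def make_pole(freq, tau, fs):
    """Build a pole from a modal frequency and decay time (demo helper)."""
    return cmath.rect(math.exp(-1.0 / (fs * tau)), 2.0 * math.pi * freq / fs)

# Two closely spaced modes (as for two polarizations) plus one band-edge pole.
fs = 8000.0
poles = [make_pole(392.0, 0.5, fs), make_pole(392.8, 0.3, fs), make_pole(431.0, 0.01, fs)]
for freq, tau in select_poles(poles, fs, f_center=392.0, half_width=30.0):
    print(round(freq, 2), round(tau, 3))
```

In this made-up example the two modes near 392 Hz are retained, while the pole near the upper band edge is discarded.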
2.3. Modification of inharmonicity

Based on the FZ-ARMA analysis, we can now modify the frequencies of the partials in the following way.

(i) Analyze each partial of the given recording by FZ-ARMA analysis in the zoomed frequency domain. Tune the fundamental frequency and the inharmonicity factor so that the main spectral peak of the partial remains in the middle of the zoomed band.

(ii) Map the partial back to the original frequency domain, but move the modal frequency to a new desired frequency. Repeat this for each partial and collect the resynthesized partial components.

(iii) Also resynthesize the partials at their original frequencies by the method in (i)-(ii) and subtract the result from the original signal to get a residual signal containing the pluck and the body response. Crop the most prominent part from the beginning as the residual for resynthesis.

(iv) Resynthesize the new modified version from the frequency-shifted partials and the residual for a new fundamental frequency and inharmonicity (or a fully harmonic version).

3. LISTENING TESTS

Perception of inharmonicity was studied in two formal listening tests. The first experiment explored the general audibility of inharmonicity in acoustic guitar tones versus similar but harmonic tones. In the second experiment, detection thresholds of inharmonicity were measured for two pitches chosen on the basis of the results of the first experiment.

Seven subjects participated in both experiments, including author HJ. None of them reported a hearing defect, and all had previous experience of psychoacoustic experiments. All played some string instrument but none were professionally trained. For training, the subjects were presented with all the test material twice before experiment 1. The tests were performed in a silent room using headphones. Test stimuli were synthesized using FZ-ARMA models of a high-quality classical guitar with nylon strings.
Inharmonicity coefficients B were varied such that the resulting inharmonicity either matched, or was below or above, the original. A harmonic reference tone with B = 0 was generated for each inharmonic stimulus.

3.1. General audibility of inharmonicity in guitar tones

The ability to detect an amount of inharmonicity typical of the acoustic guitar was explored at six pitches: frets 1 and 12 of strings 1, 3, and 6. The inharmonicities of the test tones were matched to the original recorded tones (see Table 1). According to initial listening, detection was extremely hard if the residual (the pluck and body response) was included. The plucking transient was therefore left out in the first experiment.

String, fret          f0 [Hz]    B
String 1, fret 1      349        0.000047
String 1, fret 12     659        0.00017
String 3, fret 1      207        0.00015
String 3, fret 12     392        0.00052
String 6, fret 1      87         0.0000205
String 6, fret 12     165        0.00007

Table 1. Test tones and their inharmonicity coefficients in Experiment 1.

The test followed the 2AFC (two-alternative forced choice) paradigm. In each trial, subjects were presented with two pairs of tones, one consisting of two harmonic tones and the other of the harmonic tone and the respective inharmonic tone. The task was to identify which pair contained the inharmonic tone. The inharmonic tone was randomly placed in either sound pair. For each pitch, the question was repeated 16 times. The percentage of correct responses was recorded from the last 12 trials.

[Figure 3: percentage of correct responses (50 %, 75 % and 100 % marked) for each (string, fret) pair: (1, 1), (1, 12), (3, 1), (3, 12), (6, 1), (6, 12).]
Figure 3. Boxplot of the results of experiment 1.

The results are presented in Fig. 3. Detection is clearly above chance level (50 % correct) for strings 3 and 6: the means range between 82 % correct for the 1st fret of string 6 and 94 % correct for the 12th fret of string 3. For string 1, the results are generally under 75 % correct, which suggests that the subjects were mainly guessing.
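To give a feel for what scores near 75 % mean with 12 scored trials, the following sketch computes the probability of reaching a given score by pure guessing in a 2AFC task. It is an illustration for a single listener and a single pitch under the assumption of independent trials, not an analysis performed in the study; the function name is illustrative.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more correct answers in n independent 2AFC trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 12 scored trials per pitch, as in experiment 1; chance level is p = 0.5.
for correct in (9, 10, 11, 12):
    print(correct, f"{100 * correct / 12:.0f} %", round(p_at_least(correct, 12), 4))
```

Under pure guessing, a listener reaches 9 of 12 correct (75 %) or better in about 7 % of runs, but 11 of 12 or better in only about 0.3 % of runs, so scores around 75 % are weak evidence of detection whereas scores above 90 % are not plausibly due to chance.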
The experiment was re-run for the 12th fret of strings 3 and 6, for which inharmonicity was detected most clearly, this time including the residuals. The performance dropped from nearly 100 % correct to nearly chance level (see Fig. 4). Thus it seems that inharmonicity is mainly perceived during the very beginning of the tone. An explanation may be that in string instrument tones the higher partials, which are shifted most from their harmonic places, decay faster than the lower ones, thus making the effect fade out quickly.

[Figure 4: two panels, "With residual" and "Without residual", each showing percentage correct (50 %, 75 % and 100 % marked) for (string, fret) = (3, 12) and (6, 12).]
Figure 4. Results of the 12th fret of strings 3 and 6 with and without the residual (left and right panel, respectively).

3.2. Perception threshold for individual tones

Since detection of inharmonicity remained uncertain even at the lowest pitch range, detection thresholds were measured for the 12th frets of strings 3 and 6. The 2AFC paradigm was now used with the adaptive 1 up, 2 down staircase procedure, giving the 70.7 % correct point on the psychometric function (Levitt 1971). After two consecutive correct responses, inharmonicity was reduced in the next trial, while after an incorrect response, possibly preceded by a single correct one, it was increased; each such change of direction counted as a reversal. The measurement stopped after 12 reversals, and the threshold was computed from the mean of the last eight reversals. B was varied in linear steps of about 1/7 of the original measured B.

Fig. 5 presents the results in relation to the B coefficients measured for all frets of strings 3 and 6, and the thresholds measured in (Järveläinen et al. 2001), as a function of fundamental frequency. While the current thresholds are far above those measured previously, they are relatively close to the realistic inharmonicity coefficients. The mean threshold for fret 12 of string 3 was B = 0.00097 and for fret 12 of string 6 B = 0.00015. These are 1.87 and 1.65 times the respective B coefficients measured from guitar tones. However, since two of the subjects performed much better than the remaining five, it might be useful to re-run the experiment with expert listeners to see if their thresholds are significantly lower.

[Figure 5: B on a logarithmic axis versus frequency [Hz] (80 to 500 Hz); legend: Thr. (*); Thr. string 3, fret 12; Thr. string 6, fret 12; B, string 3, all frets; B, string 6, all frets.]
Figure 5. Currently measured thresholds (with error bars of one standard deviation up and down), respective B values for guitar strings, and the threshold measured in (Järveläinen et al. 2001) (*).

4. SUMMARY AND CONCLUSIONS

The effect of inharmonicity in the acoustic guitar was studied in two formal listening experiments using synthetic guitar tones whose inharmonicity could be accurately controlled. The first experiment studied the general audibility of inharmonicity at six pitches along the pitch range of the guitar from strings 1, 3, and 6. Inharmonicity was generally detected at strings 3 and 6. However, perception seemed to be strongest at the very beginning of the sound and was interfered with by the presence of the plucking transient. This is understandable because the most inharmonic higher partials decay fast in string instrument tones and become easily masked by the plucking transient.

In the second experiment, detection thresholds for inharmonicity were measured for the 12th fret of strings 3 and 6. The mean thresholds were close to, although above, typical amounts of inharmonicity in the guitar. However, they were significantly higher than the previously measured thresholds for generic, string-instrument-like sounds. Whereas the previous test tones decayed at the same rate for all partials, the partials of the current tones decayed faster at higher frequencies, which probably caused the effect to be masked by the plucking transient.

Two of the subjects detected much lower inharmonicities than the others. This suggests that expert listeners might detect even a weak timbral effect while normal listeners would not notice any difference. It is therefore recommended that for accurate synthesis of guitar tones, inharmonicity should be included for the lowest 4 strings. For less critical applications it could well be ignored.

5. ACKNOWLEDGEMENTS

The work was supported by the Academy of Finland, project numbers 105651 and 106953. The authors are thankful to Mr. Jukka Rauhala for analysis tools and to Mr. Henri Penttinen for recordings of guitar sounds.

6. REFERENCES

Fletcher, H., Blackham, E. D. and Stratton, R.: 1962, Quality of piano tones, J. Acoust. Soc. Am. 34(6), 749-761.

Järveläinen, H., Välimäki, V. and Karjalainen, M.: 2001, Audibility of the timbral effects of inharmonicity in stringed instrument tones, Acoustics Research Letters Online 2(3), 79-84. http://ojps.aip.org/arlo/.

Karjalainen, M., Esquef, P., Antsalo, P. and Välimäki, V.: 2002, Frequency-zooming ARMA modeling of resonant and reverberant systems, J. Audio Eng. Soc. 50(12), 1012-1029.

Levitt, H.: 1971, Transformed up-down methods in psychoacoustics, J. Acoust. Soc. Am. 49(2), 467-477.

Makhoul, J.: 1975, Linear prediction: A tutorial review, Proc. IEEE 63, 561-580.

Rocchesso, D. and Scalcon, F.: 1999, Bandwidth of perceived inharmonicity for musical modeling of dispersive strings, IEEE Trans. Speech and Audio Proc. 7, 597-601.

Schneider, A.: 2000, Inharmonic sounds: Implications as to "Pitch", "Timbre" and "Consonance", Journal of New Music Research 29(4), 275-301.