A Modal Distribution Approach to Piano Analysis and Synthesis

Guevara, Rowena; Wakefield, Gregory

Rowena Cristina L. Guevara and Gregory H. Wakefield
Department of Electrical Engineering and Computer Science
University of Michigan, Ann Arbor, MI 48109
gev@eecs.umich.edu ghw@eecs.umich.edu

This research was supported by an ESEP Scholarship from the Philippine government to the first author and by funds from the Office of the President of the University of Michigan for the MusEn Project.

Abstract

This paper introduces a method of analysis that points to additive synthesis techniques that accommodate different performance techniques, including dynamics, articulation (legato, portato, staccato), and initial damper position. Previous attempts at such synthesis have been hampered by the resolution limits in time and frequency of the short-time spectral analysis used to extract the time-evolving partials. While such analysis sufficiently resolves the long-term decay of each partial of a piano sample, it substantially smooths the onset of each partial and thereby robs the synthesized sound of its percussive nature. We present results for piano analysis and synthesis using the modal distribution, which is an alternative representation based on time-frequency (t-f) distributions.

1 Introduction

The modal distribution has been designed specifically for signals that can be represented as a sum of isolated time-varying partials [Pielemeier and Wakefield, 1996]. Unlike other distributions (e.g., the spectrogram, the Gabor transform, or linear transforms of the Wigner distribution), the modal distribution minimizes the cross-term ambiguities associated with bilinear t-f distributions while maintaining limited superposition. These two properties allow us to apply standard Hilbert techniques to each isolated partial to estimate its instantaneous amplitude and frequency. The analysis suggests how different piano performance techniques can be incorporated into an additive synthesizer.

We present analyses of piano sounds that vary in the above-mentioned performance techniques. Based on these analyses, piano sounds were synthesized. Psychophysical testing demonstrates that the proposed method yields perceptually accurate synthesized piano sounds.

2 Analysis

Notes from three Steinway pianos were recorded on digital audio tape. The sounds were transferred to a PC and analyzed as follows.

2.1 Methodology

The software implements the modal distribution, which is a time-smoothed discrete pseudo-Wigner distribution. Time smoothing is introduced by the cross-term filter, which suppresses the cross terms. Partials appear as ridges along the time axis. The frequency support of each ridge defines the neighborhood of points over which instantaneous power and frequency are estimated. Power is estimated as the sum of the distribution over the ridge neighborhood, while frequency is estimated as the centroid of the ridge neighborhood.

2.2 Results of Analysis

The physics of the piano lends certain characteristics to the partial structure of the piano sound [Fletcher and Rossing, 1991]. The soundboard favors certain frequency ranges, leading to a weak fundamental for bass notes. The striking position of the hammer produces spectral nulls or attenuated modes. The frequency difference between partials increases with partial number as a consequence of string stiffness.

Beyond these expected features, the modal distribution of the piano sound reveals unexpected partials. These occur at frequencies that do not belong to the partial series of the note, even when inharmonicity is taken into account. Characterization of these 'rogue' partials for two-stringed notes leads to the conclusion that the slight mistuning of the string unison gives rise to two partial series, one for each string, with comparable amplitudes and similar attack and decay characteristics.
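The ridge estimates of Section 2.1 can be sketched in a few lines. The sketch below is only an illustration under stated assumptions, not the analysis software used in this work: it takes a single time frame of a t-f distribution, sums it over a ridge neighborhood to estimate power, and takes the centroid of the same neighborhood to estimate frequency. The function names, the Gaussian-shaped toy ridges, the 1 Hz bin spacing, and the inharmonicity value are all hypothetical. The toy ridge positions use the textbook stiff-string model f_n = n f0 sqrt(1 + B n^2), which is assumed here and is not a formula given in this paper.

```python
import math


def stiff_string_partials(f0, inharmonicity, count):
    """Partial frequencies n*f0*sqrt(1 + B*n^2) of an ideal stiff string.

    Assumption: textbook stiff-string model, used only to place toy ridges.
    """
    return [n * f0 * math.sqrt(1.0 + inharmonicity * n * n)
            for n in range(1, count + 1)]


def ridge_estimates(frame, freqs, f_lo, f_hi):
    """Estimate power and frequency of one ridge in one time frame.

    frame      : distribution values at one time instant, one per bin
    freqs      : bin center frequencies in Hz, same length as frame
    f_lo, f_hi : edges of the ridge neighborhood (its frequency support)

    Power is the sum of the distribution over the neighborhood; frequency
    is the centroid of the neighborhood. A Wigner-type distribution can be
    negative, so a non-positive sum yields no frequency estimate.
    """
    support = [(v, f) for v, f in zip(frame, freqs) if f_lo <= f <= f_hi]
    power = sum(v for v, _ in support)
    if power <= 0.0:
        return 0.0, None
    centroid = sum(v * f for v, f in support) / power
    return power, centroid


def toy_frame(freqs, partials, powers, width_hz=4.0):
    """Hypothetical test frame: one Gaussian-shaped ridge per partial."""
    return [sum(p * math.exp(-0.5 * ((f - fp) / width_hz) ** 2)
                for fp, p in zip(partials, powers))
            for f in freqs]


# Toy data (all values hypothetical): 1 Hz bins, six partials of a 110 Hz note.
freqs = [float(k) for k in range(1000)]
partials = stiff_string_partials(110.0, 4e-4, 6)
powers = [1.0, 0.8, 0.6, 0.5, 0.3, 0.2]
frame = toy_frame(freqs, partials, powers)

# One neighborhood per ridge, here a fixed +/- 20 Hz band around each partial.
estimates = [ridge_estimates(frame, freqs, fp - 20.0, fp + 20.0)
             for fp in partials]

for n, (power, freq) in enumerate(estimates, start=1):
    print(f"partial {n}: power {power:.3f}, frequency {freq:.2f} Hz")
```

Repeating the estimate frame by frame would give a power and frequency trajectory for each partial. The fixed band used here stands in for the frequency support of each ridge, which in practice would be read from the distribution itself.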
More surprisingly, rogue partials are present in single-stringed notes, but are characterized by lower amplitude and faster attack and decay. This is consistent with the motion of the massive, damped bass string as seen by the bridge, through which coupling between adjacent strings occurs.

Inclusion of rogue partials in piano synthesis means added computation and memory. The decision to include them was made after conducting an objective, two-interval forced-choice discrimination experiment in which the subject's task was to discriminate between synthesized sounds with and without rogue partials. The two trained subjects discriminated correctly on 100% of the trials; even the two untrained subjects discriminated correctly on 75% of the trials.

Guevara & Wakefield 350 ICMC Proceedings 1996