A Modal Distribution Approach to Piano Analysis and Synthesis¹
Rowena Cristina L. Guevara and Gregory H. Wakefield
Department of Electrical Engineering and Computer Science
University of Michigan, Ann Arbor, MI 48109
gev@eecs.umich.edu ghw@eecs.umich.edu
Abstract
This paper introduces a method of analysis that points to additive synthesis techniques that accommodate
different performance techniques including dynamics, articulation (legato, portato, staccato), and initial
damper position. Previous attempts at such synthesis have been hampered by the resolution limits in time
and frequency of the short-time spectral analysis used to extract the time-evolving partials. While such
analysis sufficiently resolves the long-term decay of each partial of a piano sample, it substantially
smooths their onset and thereby robs the synthesized sound of its percussive nature. We present results
for piano analysis and synthesis using the modal distribution, which is an alternative representation
based on time-frequency (t-f) distributions.
1 Introduction
The modal distribution has been designed specifically
for signals that can be represented as a sum of isolated
time-varying partials [Pielemeier and Wakefield,
1996]. Unlike other distributions (e.g., the spectrogram, the Gabor transform, or linear transforms of the
Wigner distribution), the modal distribution minimizes the cross-term ambiguities associated with
bilinear t-f distributions while maintaining limited
superposition. These two properties allow us to apply
standard Hilbert techniques for each isolated partial to
estimate the instantaneous amplitude and frequency.
The analysis suggests how different piano performance techniques can be incorporated into an additive
synthesizer. We present analyses of piano sounds that
vary in the above-mentioned performance techniques.
Based on these analyses, piano sounds were synthesized. Psychophysical testing demonstrates that the
proposed method yields perceptually accurate synthesized piano sounds.
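The Hilbert-based estimation mentioned above can be illustrated with a minimal sketch. It assumes a single partial has already been isolated; the function names, the NumPy-only analytic-signal construction, and the 440 Hz decaying test tone are illustrative assumptions, not the analysis software used in this work.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT: zero the negative frequencies, double the positive ones."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC as is
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist bin as is
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

def instantaneous_amp_freq(partial, sample_rate):
    """Instantaneous amplitude and frequency (Hz) of one isolated partial."""
    z = analytic_signal(partial)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * sample_rate / (2.0 * np.pi)
    return amplitude, frequency

# Hypothetical test partial: a 440 Hz sinusoid with an exponential decay.
sample_rate = 8000
t = np.arange(4000) / sample_rate
partial = np.exp(-3.0 * t) * np.cos(2.0 * np.pi * 440.0 * t)
amplitude, frequency = instantaneous_amp_freq(partial, sample_rate)
mid_frequency = float(np.median(frequency))
```

The estimate is only meaningful when one partial dominates the signal being analyzed, which is why isolation of partials matters before this step.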
2 Analysis
Notes from three Steinway pianos were recorded on
digital audio tape. The recordings were transferred to
a PC and analyzed as follows.
2.1 Methodology
The software implements the modal distribution,
which is a time-smoothed discrete pseudo-Wigner
distribution. Time-smoothing is introduced by the
cross-term filter, which suppresses cross terms. Partials appear as ridges along the time axis.
The frequency support for each ridge defines the
neighborhood of points over which instantaneous
power and frequency are estimated. Power is estimated as the sum of the distribution over the ridge
neighborhood, while frequency is estimated as the
centroid of the ridge neighborhood.
¹ This research was supported by an ESEP Scholarship from the Philippine government to the first author and by funds from the Office of the President of the University of Michigan for the MusEn Project.
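The ridge estimates described in Section 2.1 (power as a sum over the ridge neighborhood, frequency as its centroid) can be sketched as follows. The array layout, the function name, the band limits `f_lo` and `f_hi`, and the toy Gaussian ridge are assumptions made for illustration; the sketch takes any precomputed t-f distribution as input and does not compute the modal distribution itself.

```python
import numpy as np

def ridge_estimates(tfd, freqs, f_lo, f_hi):
    """Per-frame power and centroid frequency of one ridge.

    tfd        : 2-D array, shape (n_freqs, n_frames), a t-f distribution
    freqs      : 1-D array of bin frequencies in Hz
    f_lo, f_hi : assumed bounds of the ridge's frequency support
    """
    band = (freqs >= f_lo) & (freqs <= f_hi)
    ridge = tfd[band, :]
    band_freqs = freqs[band][:, None]
    power = ridge.sum(axis=0)                        # sum over the neighborhood
    safe_power = np.where(power > 0, power, 1.0)     # avoid division by zero
    centroid = (band_freqs * ridge).sum(axis=0) / safe_power
    centroid = np.where(power > 0, centroid, np.nan)
    return power, centroid

# Toy distribution: a Gaussian ridge centered at 262 Hz that decays over five frames.
freqs = np.arange(0.0, 1000.0, 2.0)
decay = np.array([1.0, 0.8, 0.6, 0.4, 0.2])
ridge_shape = np.exp(-0.5 * ((freqs - 262.0) / 6.0) ** 2)
tfd = ridge_shape[:, None] * decay[None, :]
power, centroid = ridge_estimates(tfd, freqs, 230.0, 294.0)
```

Frames with no positive power in the band return NaN for the centroid rather than a misleading frequency.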
2.2 Results of Analysis
The physics of the piano lends certain characteristics to
the partial structure of the piano sound [Fletcher and
Rossing, 1991]. The soundboard favors certain frequency ranges, leading to a weak fundamental for bass
notes. The striking position of the hammer produces
spectral nulls or attenuated modes. The frequency difference between partials increases with partial number as a consequence of string stiffness.
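The widening of partial spacing with partial number can be illustrated with the standard stiff-string approximation f_n = n f0 sqrt(1 + B n^2). This formula and the values of f0 and the inharmonicity coefficient B below are assumptions chosen for illustration; they are not measurements from the recorded pianos.

```python
import numpy as np

def stiff_string_partials(f0, B, n_partials):
    """Partial frequencies of a stiff string: f_n = n * f0 * sqrt(1 + B * n**2)."""
    n = np.arange(1, n_partials + 1)
    return n * f0 * np.sqrt(1.0 + B * n ** 2)

# Assumed values: nominal fundamental 110 Hz, inharmonicity coefficient 1e-4.
partials = stiff_string_partials(f0=110.0, B=1e-4, n_partials=10)
spacing = np.diff(partials)   # difference between adjacent partials, in Hz
```

With B = 0 the spacing is constant at f0; any positive B makes each successive gap larger than the one before.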
Aside from the above, the modal distribution of the
piano sound reveals unexpected partials. These occur
at frequencies that do not belong to the partial series
of the note, even when inharmonicity is taken into
account. Characterization of these 'rogue' partials for
two-stringed notes leads to the conclusion that the
slight mistuning of the string unison gives rise to two
partial series, one for each string, each with comparable amplitudes and similar attack and decay characteristics. More surprisingly, rogue partials are present in
single-stringed notes, but are characterized by lower
amplitude and faster attack and decay. This is consistent with the motion of the massive damped bass
string as seen by the bridge through which coupling
between adjacent strings occurs.
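The two-string case can be sketched as an additive synthesis of two slightly mistuned partial series with comparable amplitudes and a shared decay. The harmonic (non-stiff) partials, the 1/n amplitudes, the 0.5 Hz mistuning, and the decay rate are all assumed for illustration and do not come from the analyses reported here.

```python
import numpy as np

def string_partials(f0, n_partials):
    """(frequency, amplitude) pairs for one string; harmonic with 1/n amplitudes (assumed)."""
    return [(n * f0, 1.0 / n) for n in range(1, n_partials + 1)]

def additive_synth(partials, decay_rate, t):
    """Sum of exponentially decaying sinusoids, one per (frequency, amplitude) pair."""
    out = np.zeros_like(t)
    for freq, amp in partials:
        out += amp * np.exp(-decay_rate * t) * np.sin(2.0 * np.pi * freq * t)
    return out

sample_rate = 8000
t = np.arange(2 * sample_rate) / sample_rate
f0 = 220.0
mistuning_hz = 0.5                                     # assumed unison mistuning
series_a = string_partials(f0, 5)                      # first string
series_b = string_partials(f0 + mistuning_hz, 5)       # second, mistuned string
note = additive_synth(series_a + series_b, decay_rate=1.5, t=t)

# Offset between corresponding partials of the two strings, in Hz.
offsets = [fb - fa for (fa, _), (fb, _) in zip(series_a, series_b)]
```

Under these assumptions the offset between the two series grows in proportion to partial number, so the second series is easier to resolve at higher partials.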
Inclusion of rogue partials in piano synthesis
means added computation and memory. The decision
to include them was made after conducting an objective, two-interval forced choice discrimination experiment in which the subject's task was to discriminate
between synthesized sounds with or without rogue
partials. The two trained subjects achieved
100% discrimination; even the two untrained subjects
discriminated correctly on 75% of the trials.
Guevara & Wakefield
350
ICMC Proceedings 1996