Time-Frequency Reassignment for Music Analysis

Hainsworth, Stephen W.; Wolfe, Patrick J.


Volume 2001, 2001

Permalink: http://hdl.handle.net/2027/spo.bbp2372.2001.041

Permissions: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License. Please contact mpub-help@umich.edu to use this work in a way not covered by the license.

Stephen W. Hainsworth* and Patrick J. Wolfe†
Signal Processing Group, University of Cambridge Department of Engineering, Trumpington Street, Cambridge CB2 1PZ, UK
{swh21,pjw47}@eng.cam.ac.uk
http://www-sigproc.eng.cam.ac.uk

Abstract

Time-frequency reassignment may be viewed as a refinement of the short-time Fourier transform, in which phase information is used to reduce the smearing of energy associated with the standard spectrogram. However, even given the perceptibly clearer visual representation yielded by the reassignment method in the case of musical signals, the task remains of extracting useful information from it for further processing. To this end it is proposed that time reassignment information be used to help identify musical transients, and that frequency reassignment information be similarly employed as a means of estimating the pitch of musical signal components. To illustrate these ideas, an example is shown in which reassigned time and frequency points are used to segment a monophonic piano melody and locate the partials of its individual notes. Lastly, the potential role of reassignment in the overall framework of music transcription is described, and several areas are detailed for future study.

1 Introduction

Time-frequency reassignment was first introduced as a means of improving the readability of time-frequency representations (Kodera, Gendrin, and de Villedary 1978); it has also been used to aid the analysis of spectrograms (Plante, Meyer, and Ainsworth 1998), as well as for formant tracking in speech analysis (Plante and Ainsworth 1995). From an engineering viewpoint it may be considered a postprocessing step following the short-time Fourier transform (STFT).
Whereas the STFT represents all the energy in a particular windowed signal as a point on the time-frequency lattice corresponding to the centre of the window (determined by the block length and overlap), time-frequency reassignment shifts these coefficients away from the lattice to the centre of gravity of the windowed energy. Intuitively, time-frequency reassignment in the case of the spectrogram uses information from the phase spectrum to sharpen the amplitude estimates in time and frequency. Unfortunately, this phase information, having been used to reassign the amplitude coefficients, is no longer available for use in signal reconstruction.

* Material by the first author is based upon work supported by the George and Lillian Schiff Foundation.
† Material by the second author is based upon work supported under a U.S. National Science Foundation Graduate Fellowship.

2 Time-Frequency Reassignment

It is first necessary to provide a more detailed description of time-frequency reassignment for the spectrogram. In this case the field of reassignment vectors is related to the phase of the STFT (Kodera, Gendrin, and de Villedary 1978). In practice, the reassigned time and frequency points, $\hat{t}$ and $\hat{\omega}$, respectively, are given by the following ratios of STFTs using different windows (Auger and Flandrin 1995):

$$\hat{t} = t - \Re\left\{\frac{\mathrm{STFT}_{th}}{\mathrm{STFT}_{h}}\right\}, \qquad \hat{\omega} = \omega + \Im\left\{\frac{\mathrm{STFT}_{dh}}{\mathrm{STFT}_{h}}\right\}$$

where $\Re$ and $\Im$ denote real and imaginary parts, respectively. The reassigned spectrum may thus be computed using three STFTs: one using window $h$ alone, one using the same window weighted with a time ramp, $t \cdot h$, and one using a window whose Fourier transform is weighted with a frequency ramp (i.e., the derivative of the window $h$ with respect to time), $dh/dt$.¹ The time-frequency reassigned spectrogram retains many desirable properties, such as an energy density interpretation, positivity, and so on; furthermore, it is perfectly localised for chirps and impulses.
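The three-STFT computation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `reassigned_points` and the parameters `n_win`, `hop`, and `sigma_frac` are hypothetical, a Gaussian window is assumed so that $dh/dt$ is available analytically, and the ratios are evaluated as $A\bar{B}/|B|^2$ to avoid an explicit complex division. The signs of the correction terms depend on the STFT convention; for the frame-based FFT assumed here (window centred on each frame, lag measured from that centre) they come out as plus for time and minus for frequency.

```python
import numpy as np

def reassigned_points(x, fs, n_win=512, hop=128, sigma_frac=8.0):
    """Return (t_hat, f_hat, power), each of shape (frames, bins).

    t_hat is in seconds, f_hat in Hz. All names and defaults are
    illustrative assumptions, not taken from the paper.
    """
    # Lag axis of the window in seconds, centred on zero.
    tau = (np.arange(n_win) - (n_win - 1) / 2) / fs
    sigma = n_win / (sigma_frac * fs)      # Gaussian width (assumed choice)
    h = np.exp(-0.5 * (tau / sigma) ** 2)  # window h
    th = tau * h                           # time-ramped window t.h
    dh = -(tau / sigma ** 2) * h           # analytic derivative dh/dt

    # Slice the signal into overlapping frames and take three STFTs.
    starts = np.arange(0, len(x) - n_win + 1, hop)
    frames = np.stack([x[s:s + n_win] for s in starts])
    X_h = np.fft.rfft(frames * h, axis=1)
    X_th = np.fft.rfft(frames * th, axis=1)
    X_dh = np.fft.rfft(frames * dh, axis=1)

    # Spectrogram, floored so empty cells do not divide by zero.
    power = np.abs(X_h) ** 2
    safe = np.maximum(power, np.finfo(float).tiny)

    # Lattice coordinates: frame centres (s) and FFT bin centres (Hz).
    t_centre = (starts + (n_win - 1) / 2) / fs
    f_bin = np.fft.rfftfreq(n_win, d=1 / fs)

    # Reassigned coordinates; the FFT's common phase factor cancels
    # in the ratios, so only the window weighting matters.
    t_hat = t_centre[:, None] + np.real(X_th * np.conj(X_h)) / safe
    f_hat = f_bin[None, :] - np.imag(X_dh * np.conj(X_h)) / (2 * np.pi * safe)
    return t_hat, f_hat, power
```

In use, each spectrogram cell `power[m, k]` would be moved from its lattice position to `(t_hat[m, k], f_hat[m, k])`. Cells with negligible power carry no reliable phase information, so in practice they would usually be masked out before plotting or further processing.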
¹ This computation may also be reformulated to eliminate complex division, and may be even further reduced for the choice of a Gaussian window (which minimises uncertainty in the time-frequency plane), since in this case $dh/dt \propto t \cdot h$ (Chassande-Mottin 1998).

However, because time-frequency