Spectral analysis of timing profiles of piano performances

Paul Trilsbeek, Peter Desain & Henkjan Honing
Music Mind Machine group, NICI, University of Nijmegen
P.O. Box 9104, 6500 HE Nijmegen, The Netherlands

Abstract

This paper describes a method that uses spectral analysis to capture systematic tempo and timing variations in timing profiles of piano performances. The method is tested on a large set of MIDI piano performances of two songs by the Beatles, played by performers from different backgrounds. Using this method, it proved possible to demonstrate differences between the pieces and between performers.

1 Introduction

Musicians use a large amount of expressive freedom in timing and tempo. As a result, robust systems that retrieve temporal information from musical performances, such as beat trackers or quantizers, are still hard to design (Large 1995, Toiviainen 1999, Cemgil et al. 2000, Longuet-Higgins 1987). Adding knowledge about the behavior of musicians (e.g., related to background or training) to such systems can improve their performance. In this paper we investigate whether there are systematic differences between groups of pianists in the way their tempo fluctuates while playing the same piece. We also examine whether the global tempo at which a piece is performed influences the tempo fluctuations, as the size of expected changes in tempo might be determined by style and global tempo. The method we developed for this compares power spectra of the timing profiles of these performances. A high value at a certain position in the spectrum means that there is a lot of tempo change on that time scale in the performance.

2 Hypotheses

One of our hypotheses is that the groups of performers with different backgrounds will show peaks at different metric levels.
More specifically, jazz performers will have more local timing patterns, and classical players will have a more global regularity reflecting the phrase structure. Another hypothesis is that absolute timing and tempo fluctuations will be smaller at higher tempi, as they may be perceived in proportion to the interval durations themselves, something that has been shown in previous research (Repp 1995, Desain & Honing 1994). This would result in lower peaks in the spectrum at higher tempi.

3 Data collection

In order to collect a systematic data set of MIDI performances, twelve pianists were asked to perform two popular songs by the Beatles in three tempo conditions. There were three groups of pianists, namely professional classical performers, professional jazz players, and amateurs.

Figure 1: The score of the arrangement of Yesterday (Lennon/McCartney) as presented to the pianists.

The pieces they had to perform were piano arrangements of Yesterday (fig. 1) and Michelle (fig. 2). The tempo conditions were 'normal tempo', 'fast, but still musical', and 'slow, but still musical', all according to the judgment of the performer. Three repetitions were recorded for each tempo, which gave us a total of 12*2*3*3 = 216 performances. The subjects were asked to practice the pieces at home before the recording session.

Figure 2: The score of the arrangement of Michelle (Lennon/McCartney).

The recordings were done using a Yamaha Disklavier Pro C3 grand piano, connected via an Opcode Studio 5 MIDI interface to a PowerMac G3 running Opcode Vision DSP sequencing software.

4 Analysis

4.1 Pre-processing

Before a spectrum could be calculated from a MIDI performance, some processing of the data was necessary. First, the MIDI performances were matched to the original score using a structure matching algorithm (Heijink et al. 2000), which allowed us to remove errors and to separate the performances into different voices. The top voice (melody) was extracted from these matched files, and in order to obtain a measurement on every eighth-note position, interpolations were made whenever no note was present. Two interpolation methods were tested: linear interpolation in time, and spline interpolation in time to get smoother transitions. From the resulting lists of onsets with interpolations, the inter-onset intervals (IOIs) were calculated (fig. 3). Next, the means of these signals over all repetitions were calculated per group of performers and per tempo condition. The resulting signal was centered around zero and used as the input for the FFT.

Figure 3: Timing profiles: the IOI signal of the melody from one performance of Yesterday, with linear interpolations (top) and spline interpolations (bottom). The horizontal axis shows the note number.
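The pre-processing steps of section 4.1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of the matched melody as parallel lists of score positions and onset times, and the choice to interpolate onset time as a function of score position are all assumptions made for illustration. Only the linear variant is shown; the spline variant is omitted.

```python
from bisect import bisect_right


def interpolate_onsets(score_positions, onset_times, grid):
    """Linearly interpolate performed onset times at every grid position.

    score_positions: score positions (in eighth notes) of the melody notes,
                     strictly increasing (hypothetical input format).
    onset_times:     performed onset time of each of those notes, in seconds.
    grid:            eighth-note positions at which a value is wanted; each
                     must lie within the range covered by score_positions.
    """
    result = []
    for pos in grid:
        # Index of the last melody note at or before this grid position.
        i = bisect_right(score_positions, pos) - 1
        if score_positions[i] == pos:
            result.append(onset_times[i])  # a note was played here
            continue
        # No note at this position: interpolate between the neighbours.
        p0, p1 = score_positions[i], score_positions[i + 1]
        t0, t1 = onset_times[i], onset_times[i + 1]
        result.append(t0 + (t1 - t0) * (pos - p0) / (p1 - p0))
    return result


def inter_onset_intervals(onsets):
    """Differences between successive onsets (the IOI signal)."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


def mean_profile(profiles):
    """Element-wise mean over several IOI signals of equal length."""
    return [sum(values) / len(values) for values in zip(*profiles)]


def centre(signal):
    """Subtract the mean so that the signal is centered around zero."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


# Toy melody with no note on eighth-note position 2 (made-up numbers).
onsets = interpolate_onsets([0, 1, 3, 4], [0.0, 0.30, 0.94, 1.25], range(5))
iois = inter_onset_intervals(onsets)
profile = centre(mean_profile([iois, iois]))
```

In this toy example the missing onset at position 2 is placed halfway between its neighbours in time, so every eighth-note position contributes one sample to the IOI signal.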
4.2 Spectra

The spectra were calculated using MatLab 5.2.1. A Blackman window was used and the signal was zero-padded. Since the signal has one sample per eighth note, the highest frequency in the spectrum corresponds to a period of two samples, i.e. the quarter-note level, which in both pieces corresponds to a quarter of the bar.

One has to be careful when calculating a spectrum from a limited number of samples. Outliers in the performance (such as errors or missing notes), errors in the matching process (e.g. in the case of ornaments or runs of notes) or a large slowing down at the end of the performance can result in a different spectrum. Averaging the spectra of several repeated performances reduced this sensitivity. One also has to take into account that harmonics of the periodic components may appear in the spectrum.

With this method of spectral analysis it is also hard to capture certain structural aspects such as final phrase lengthening, as the phase and direction of timing and tempo fluctuations are not reflected in the spectrum. Non-periodic expressive devices, e.g. an isolated micro-pause, will not be captured by this method either. In this respect it resembles the autocorrelation method as applied by Desain & Vos (1990) to timing data of a classical and a contemporary piece.
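As a rough illustration of the spectrum calculation in section 4.2, the Python sketch below windows a centered IOI signal with a Blackman window, zero-pads it and computes the power spectrum. It is a sketch under stated assumptions, not the authors' MatLab code: the function names, the padded length and the assumption of eight eighth notes per bar are illustrative, and a direct DFT replaces the FFT for readability (both give the same values).

```python
import cmath
import math


def blackman(n):
    """Blackman window of length n (standard 0.42 / 0.5 / 0.08 coefficients)."""
    if n == 1:
        return [1.0]
    return [0.42 - 0.5 * math.cos(2 * math.pi * k / (n - 1))
            + 0.08 * math.cos(4 * math.pi * k / (n - 1)) for k in range(n)]


def power_spectrum(signal, padded_length):
    """Power spectrum of a centered, windowed, zero-padded signal.

    Returns the bins from 0 up to and including the Nyquist bin.
    """
    mean = sum(signal) / len(signal)
    window = blackman(len(signal))
    x = [(s - mean) * w for s, w in zip(signal, window)]
    x += [0.0] * (padded_length - len(x))  # zero-padding
    n = len(x)
    spectrum = []
    for k in range(n // 2 + 1):
        # Direct O(n^2) DFT, chosen for clarity rather than speed.
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def bin_period_in_bars(k, padded_length, eighths_per_bar=8):
    """Period in bars for bin k > 0, with one sample per eighth note.

    eighths_per_bar=8 is an assumption that matches a bar of four
    quarter notes.
    """
    return padded_length / (k * eighths_per_bar)


# Synthetic IOI signal with a fluctuation that repeats every bar.
n_samples, padded = 64, 128
signal = [0.3 + 0.02 * math.sin(2 * math.pi * t / 8) for t in range(n_samples)]
spectrum = power_spectrum(signal, padded)
peak_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
peak_period = bin_period_in_bars(peak_bin, padded)
```

With one sample per eighth note, the last bin (`padded // 2`) maps to a period of a quarter of a bar, in line with the limit described in the text. Zero-padding only interpolates the spectrum on a finer frequency grid; it does not add resolution.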

4.3 Differences between spectra

The focus when looking at these spectra is on the total amount of fluctuation and on whether peaks occur at certain metric levels (multiples of the quarter note). If a performer has a systematic timing fluctuation, whether more global in the phrase structure or more local at the bar or note level, a peak occurs in the spectrum. When comparing spectra, one can then check whether these peaks occur at the same place and whether they differ in height.

Analyzing the structure of the pieces already allows some predictions of where the periodicities reflecting the phrase structure and possible repetitions will occur. For example, in Yesterday (fig. 1) there is a 7-bar phrase that is repeated twice. In Michelle (fig. 2) there is a 6-bar phrase that is repeated. In the spectrum one would then expect peaks at these levels and possibly at their harmonics.

Figure 4: The spectra of the mean of all normal tempo performances of Yesterday and Michelle per group of subjects. The horizontal axis shows the period in bars (7 to 1/4).

Figure 4 shows the spectra of the mean of the normal tempo conditions for the three groups of performers, for both pieces. A log scale is used for the time axis. There is a clear difference between the two pieces: Yesterday has much more energy at the smaller levels (below the bar) than Michelle, which would indicate that the performances of Michelle were in general flatter, with less expression. At the larger levels, Yesterday shows peaks at the 7 bar and 3.5 bar levels, which can be explained by the phrase structure of the piece.

In Michelle, however, the expected peaks at the 6 bar and 3 bar levels are not clearly present. Only the classical players show a peak at the 3 bar level. When looking at the different groups of performers, it can be seen that in both pieces the jazz musicians on average have lower peaks at the larger levels than the classical players, which indicates that they play at a more constant tempo (i.e. use less tempo rubato) than the classical players. The structural levels (i.e. beat, bar, phrase) at which the peaks appear do not differ much between the groups of performers, though.

Figure 5: The spectra of the mean of all classical musicians' performances of Yesterday and Michelle per tempo condition (slow, normal, fast).

Figure 5 shows that for both pieces the global tempo at which the pieces are performed indeed influences the amount of timing and tempo fluctuation. The faster the pieces are performed, the lower the peaks in the spectra, especially at the larger levels. This means that timing and tempo variations are smaller at higher tempi. Note that we used absolute time intervals, so one cannot conclude from these figures whether these variations are also smaller when considered proportionally. Here again it is clear that Yesterday has much more energy at the lower (i.e. local) levels of the musical structure, in all tempo conditions.

5 Conclusions

Using spectra to analyze timing profiles is a rather elegant approach to detecting periodic tempo or timing fluctuations. However, because a piano performance usually offers only a small number of sample points, the method is quite sensitive to outliers, so averaging over several performances is needed. Nevertheless, we were able to show that for this dataset both the global tempo at which the piece is played and the background (i.e. experience and training) of the musician influence the amount of tempo and timing fluctuation. This kind of information can be used for parameter estimation in robust beat tracking models.

6 Acknowledgments

Thanks to Richard Ashley (Northwestern University) for recording a part of the piano performances and to all the pianists who participated. This research is supported by the Technology Foundation STW, applied science division of NWO and the technology programme of the Dutch Ministry of Economic Affairs.

References

Cemgil, T., Desain, P., and Kappen, B. 2000. "Rhythm Quantization for Transcription". Computer Music Journal, 24(2):60-76.
Desain, P., & Honing, H. 1994. "Does expressive timing in music performance scale proportionally with tempo?" Psychological Research, 56:285-292.
Desain, P., & Vos, S. 1990. "Autocorrelation and the study of musical expression". Proceedings of the 1990 International Computer Music Conference. International Computer Music Association, pp. 357-360.
Heijink, H., Desain, P., Honing, H., and Windsor, W. L. 2000. "Make Me a Match: An Evaluation of Different Approaches to Score-Performance Matching". Computer Music Journal, 24(1):43-56.
Large, E.W. 1995. "Beat tracking with a nonlinear oscillator". Working notes of the IJCAI Workshop on AI and Music. International Joint Conferences on Artificial Intelligence, pp. 24-31.
Longuet-Higgins, H.C. 1987. Mental Processes. Cambridge, Massachusetts: MIT Press.
Repp, B.H. 1995. "Quantitative effects of global tempo on expressive timing in music performances: Some perceptual evidence". Music Perception, 13:39-57.
Toiviainen, P. 1999. "An interactive MIDI accompanist". Computer Music Journal, 22(4):63-75.