THE INDUCTION OF MUSICAL STRUCTURE USING CORRELATION

JASON D. VANTOMME
2216 MAPLE AVENUE
EVANSTON, IL 60201

ABSTRACT: The empirical induction of musical structures by software systems can be realized through the use of the statistical data comparison method known as correlation. By using correlation to measure the similarity of pitch or rhythmic fragments ("windows") from a musical work with other equally sized fragments from the same work, basic structural generalizations can be made. In support of this assertion, a software tool utilizing a "windowed" correlation technique has been developed.

INTRODUCTION

The application of correlation is not new to the study of musical problems. Various uses have included the determination of meter (Brown, J.), the detection of pitch in audio signals (Brown, J. & Puckette, M.), and the analysis of recorded musical performances (Desain, P. & de Vos, S.). The method described here uses correlation as the basis for an inductive inference tool used to discover musical structure. The development of this system was motivated by a general lack of a priori knowledge in early approaches to score following systems (Dannenberg, R. & Mukaino, H.; Vercoe, B. & Puckette, M.), in more recent approaches (Baird, B. et al.; Puckette, M. & Lippe, C.), and in the author's own work on the subject (Vantomme).

INDUCTION AND CORRELATION

Empirical induction techniques are able to create new hypotheses without the aid of domain-specific knowledge (Michalski, R. & Kodratoff, Y.). Though adopting such approaches in an absolute fashion has obvious disadvantages, these techniques can result in immediately useful (and transportable) systems for inducing basic generalizations from raw data. At the heart of any empirical induction system is a method that is able to suggest such generalizations; for the current problem, this method is correlation.
Correlation (in deviation-score form) can be expressed:

$$ r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}} $$

where x and y are the deviations from the mean of two equal-length data sets (Ferguson, G. & Takane, Y.). The result of this computation is the correlation coefficient (r), a measure of similarity between two data sets. Unfortunately, correlating the pitch or onset values from an entire musical work with itself, in an effort to find internal similarities, is largely uninformative; such a comparison will reveal only repetitions of the complete work within itself.

Consider the fragment presented in Fig. 1. One can see how the phrase contains not only two primary divisions, but also further divisions within these primary structures that share either pitch or rhythmic commonalities. Note that ordinary correlation of the fragment with itself produces only one convincing instance of recurring structure (the r value of 0.88 in Fig. 2); this instance points out the second primary division of the phrase at the start of the third measure (sample position 7).

Fig 1: Original Sample (Bach: E-flat Flute Sonata, mvmt II, opening measures)

Original sample pitch values (in MIDI numbers):
74 75 74 74 79 75 72 72 74 72 72 82 72 70 69 67

correlation(pos(0->14), pos(1->15)) = +0.23
correlation(pos(0->13), pos(2->15)) = -0.08
correlation(pos(0->12), pos(3->15)) = -0.04
correlation(pos(0->9), pos(6->15)) = +0.32
correlation(pos(0->8), pos(7->15)) = +0.88

Fig 2: Partial results of Fig. 1 pitch list correlated with itself (i.e., un-windowed).

ICMC PROCEEDINGS 1995
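The deviation-score formula and the un-windowed comparisons of Fig. 2 can be illustrated with a short Python sketch. This is not the author's software tool; the function names `correlation` and `lagged_correlation` are hypothetical, and returning 0.0 for a fragment with no variation is an assumption made here only to avoid division by zero.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient in deviation-score form:
    r = sum(xy) / sqrt(sum(x^2) * sum(y^2)), with x, y deviations from the mean."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("data sets must have equal length of at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        # Assumption: a constant fragment is treated as uncorrelated.
        return 0.0
    return sxy / math.sqrt(sxx * syy)

# Pitch values of the sample phrase (MIDI numbers), positions 0..15.
PITCHES = [74, 75, 74, 74, 79, 75, 72, 72, 74, 72, 72, 82, 72, 70, 69, 67]

def lagged_correlation(values, lag):
    """Correlate positions 0..(n-1-lag) with positions lag..(n-1)."""
    n = len(values)
    return correlation(values[: n - lag], values[lag:])

# The five comparisons listed in Fig. 2 correspond to these lags.
for lag in (1, 2, 3, 6, 7):
    print(f"pos(0->{15 - lag}), pos({lag}->15): r = {lagged_correlation(PITCHES, lag):+.2f}")
```

Under these assumptions, the lag of 7 compares pos(0->8) with pos(7->15) and yields the +0.88 reported in Fig. 2.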

WINDOWED CORRELATION

Though correlating a complete musical fragment with itself might reveal several structural boundaries within a musical work, it may also reveal none at all; the outcome depends on how many similar sub-fragments, within the whole sample phrase or work, occur in regular patterns. This method not only lacks the ability to determine the structure of a work whose sub-fragments are irregular, it is also unable to discover similarities within and between the sub-fragments of a phrase. A simple addition of "windows" to the method provides these abilities.

Simply stated, windowed correlation is the restriction of the data set size to a likely range within which often-repeated musical fragments (i.e., themes, motives, etc.) will be found. Instead of using a complete phrase or work as a data set, it uses fragments extracted from that sample. For instance, one might assume that a motive in a given musical work will be no shorter than 3 notes and that a basic motive will be no longer than 10 notes. In accepting this range of motive lengths (and thus, potential window sizes), the induction tool is provided a set of guidelines within which to work; though these window sizes are set by the user before analysis, they serve only as a starting point for the discovery process. For example, if the user has set the maximum window size at 10 and the system finds that a fragment of that size correlates well with one or several other fragments of the same size, it will increase the maximum window size until the correlation between fragments falls below the level considered successful. (Note that setting the "successful" r value threshold is an important issue in both windowed and non-windowed methods.)
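The windowed search and the window-growing step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's tool: the names `find_matches` and `grow_window` are hypothetical, the threshold of 0.95 is an arbitrary choice (the paper gives no value), and growing a matched pair one note at a time is only one possible reading of how the window size is increased.

```python
import math

def correlation(xs, ys):
    """Correlation coefficient in deviation-score form."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [v - mean_x for v in xs]
    dy = [v - mean_y for v in ys]
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant window is treated as uncorrelated
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(sxx * syy)

def find_matches(values, start, size, threshold):
    """Start positions of every other window of `size` notes whose
    correlation with the window at `start` is at least `threshold`."""
    reference = values[start:start + size]
    matches = []
    for pos in range(len(values) - size + 1):
        if pos == start:
            continue  # skip the trivial comparison of the window with itself
        if correlation(reference, values[pos:pos + size]) >= threshold:
            matches.append(pos)
    return matches

def grow_window(values, start, other, size, threshold):
    """Extend a matched pair of windows one note at a time and return the
    largest size reached before the correlation drops below `threshold`."""
    best = size
    limit = len(values) - max(start, other)  # longest window that still fits
    for candidate in range(size + 1, limit + 1):
        r = correlation(values[start:start + candidate],
                        values[other:other + candidate])
        if r < threshold:
            break
        best = candidate
    return best

PITCHES = [74, 75, 74, 74, 79, 75, 72, 72, 74, 72, 72, 82, 72, 70, 69, 67]
THRESHOLD = 0.95  # hypothetical "successful" r value

print(find_matches(PITCHES, 0, 3, THRESHOLD))    # windows related to x(0->2)
print(find_matches(PITCHES, 3, 3, THRESHOLD))    # windows related to x(3->5)
print(grow_window(PITCHES, 0, 7, 3, THRESHOLD))  # grow the pair at positions 0 and 7
```

With these settings, the window at position 0 matches positions 3, 7 and 10, and the window at position 3 matches positions 0, 7 and 10. The pair at positions 0 and 7 grows to 8 notes, because extending it to 9 notes gives the un-windowed value of 0.88, which is below the assumed threshold. A different threshold would change all of these outcomes, which is why the choice of threshold matters.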
Original sample pitch values (in MIDI numbers):
74 75 74 74 79 75 72 72 74 72 72 82 72 70 69 67

correlation(pos(0->2), pos(3->5)) = +0.98
correlation(pos(0->2), pos(7->9)) = +1.00
correlation(pos(0->2), pos(10->12)) = +1.00
correlation(pos(3->5), pos(0->2)) = +0.98
correlation(pos(3->5), pos(7->9)) = +0.98
correlation(pos(3->5), pos(10->12)) = +0.98

Fig 3: Successful results for Fig. 1 pitch values using windowed correlation (window size = 3).

The success of this method can be seen from the results of computing correlation values for the window of length 3 (see Fig. 3); note that sub-fragments related to the sample x(0->2) are discovered at positions 3, 7 and 10, and sub-fragments related to x(3->5) are found at positions 0, 7 and 10. A certain degree of failure can also be found in these results: the motives at positions 0 and 3 might not have been classed as "similar" by a human analyst, and yet they have been classified as such by this method.

CONCLUSION

The results of this work suggest that the initial motivation of providing useful a priori data to a score following system in the form of structural information can be met (at least in part) with the method of windowed correlation. Useful additions to the method would include a more detailed set of constraints on the discovery process, in an effort to increase computational efficiency, as well as a means to use discoveries between works, not only for efficiency reasons but also to produce a large set of data common to musical works, composers and/or styles.

REFERENCES

Baird, B., D. Blevins, and N. Zahler. 1993. "Artificial Intelligence and Music: Implementing an Interactive Computer Performer." Computer Music Journal 17(2): 73-79.
Brown, J. C. and M. Puckette. 1989. "Calculation of a 'Narrowed' Autocorrelation Function." Journal of the Acoustical Society of America (JASA) 85(4): 1595-1601.
Brown, J. C. 1993. "Determination of the Meter of Musical Scores by Autocorrelation." JASA 94(4): 1953-1957.
Dannenberg, R. and H. Mukaino. 1988. "New Techniques for Enhanced Quality of Computer Accompaniment." Proceedings of the 1988 International Computer Music Conference. San Francisco: ICMA. pp. 243-249.
Desain, P. and S. de Vos. 1990. "Autocorrelation and the Study of Musical Expression." Proceedings of the 1990 International Computer Music Conference. San Francisco: ICMA. pp. 357-360.
Ferguson, G. A. and Y. Takane. 1989. Statistical Analysis in Psychology and Education. New York: McGraw Hill.
Michalski, R. S. and Y. Kodratoff. 1990. "Research in Machine Learning: Recent Progress, Classification of Methods, and Future Directions." Machine Learning, Volume III. Los Altos: Morgan Kaufmann. pp. 3-30.
Puckette, M. and C. Lippe. 1992. "Score Following in Practice." Proceedings of the 1992 International Computer Music Conference. San Francisco: ICMA. pp. 182-185.
Vantomme, J. 1995. "Score Following by Temporal Pattern." Computer Music Journal 19(3): (In Press).
Vercoe, B. and M. Puckette. 1985. "Synthetic Rehearsal: Training the Synthetic Performer." Proceedings of the 1985 International Computer Music Conference. San Francisco: ICMA. pp. 275-278.