Pattern Processing in Music

Robert Rowe and Tang-Chun Li
NYU Department of Music and Performing Arts Professions
35 W. Fourth St., Room 777, New York, NY 10012, USA

Abstract

A real-time pattern processing application has been developed that can learn and identify melodic sequences in a stream of MIDI data. The application is based on earlier work on score following and pattern matching developed by Dannenberg, Pennycook, and others. This paper traces the extensions implemented to address the particular problems of real-time pattern induction and identification, and outlines the resulting algorithm.

1 Problem

Patterns are among the most important elements of musical discourse. Recognizable melodic, rhythmic, or harmonic patterns form the materials out of which larger structures are built. If a computer program is to follow or contribute to music so constituted, the program must be able to process such patterns meaningfully.

Pattern processing in music encompasses two goals: (1) learning to recognize important sequential structures from repeated exposure, and (2) matching new input against these learned structures. We will refer to processing directed toward the first goal as pattern induction; processing directed toward the second will be called pattern matching.

The goal of the work reported here is to develop a program capable of performing pattern induction and matching in real time on an arbitrary stream of MIDI data. The program must notice sequences that have been repeated several times and find subsequent occurrences of those sequences as they arise in the remainder of the performance. Moreover, we wish to report a match on the incipit of a stored pattern, so that the processor could predict the continuation of the gesture if the performance indeed follows the remembered remainder.

2 Related Work

Bloch and Dannenberg's paper "Real-Time Computer Accompaniment of Keyboard Performances" describes techniques for matching monophonic and polyphonic performances against stored representations in a score-following task (Bloch and Dannenberg 1985). Their technique maps well to generalized pattern processing, but there are significant differences. First, pattern processing requires matching against many different patterns simultaneously. Second, there is no a priori way to know when a pattern may begin, since the performer is not presumed to be following a known score. Third, Bloch and Dannenberg's algorithm looks for the absolute pitches of a score, disallowing transposition.

To address these differences, our application matches against many pattern templates simultaneously. As input arrives, it is segmented into phrases in the hope that pattern instances will begin at phrase boundaries. Melodic sequences are stored as intervallic contours rather than absolute pitch classes and can therefore be matched in transposed presentations.
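
To make the intervallic representation concrete, the fragment below (a minimal Python sketch; the names are illustrative and not drawn from the implementation described here) reduces two transpositions of the same melody to an identical contour:

    # Minimal illustrative sketch (not the system's code): a melody stored as
    # successive semitone intervals matches any transposition of itself.
    def interval_contour(pitches):
        """Convert absolute MIDI note numbers to successive semitone intervals."""
        return [b - a for a, b in zip(pitches, pitches[1:])]

    theme      = [60, 62, 64, 65, 67]   # C4 D4 E4 F4 G4
    transposed = [67, 69, 71, 72, 74]   # the same gesture a fifth higher
    assert interval_contour(theme) == interval_contour(transposed)   # [2, 2, 1, 2]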

The second algorithm reviewed was the dynamic timewarp technique developed by Dale Stammen and Bruce Pennycook at McGill University [Stammen & Pennycook 1993]. The timewarp is a generalized version of the dynamic programming approach advanced by Bloch and Dannenberg. In the McGill implementation, patterns are segmented in real time from an incoming MIDI stream. Identified segments are then compared to a collection of archetypes exhibiting fundamental melodic contours, including up, down, up-down, and down-up. The goal of Timewarp, the Stammen & Pennycook application, is to classify incoming segments as one of the stored archetypes. New instances of each archetype are stored in a list that can be accessed by a performance module. In this way, desired contours can be realized using material extracted from external performers.

The problem addressed by Timewarp differs from ours in some important ways that necessitated extensions and adaptations to the algorithm. For example, Timewarp waits until a segment is complete before matching it against the archetypes. Further, the application is designed to classify segments according to their proximity to the stored contours rather than to find repetitions of any particular motive.

To find ongoing matches of pattern templates, we altered the buffering of incoming events such that the matching process is carried out two events behind real time. This lag allows us to find better segment boundaries than can be noticed at the attack of an event. Keeping the analysis within two events of the current one, however, ensures that the program remains reasonably responsive to new appearances of a pattern.

Shifting the emphasis of the analysis from classification of contours to induction and matching was relatively straightforward. We maintain the classification process, but record the vector of local distance measures separating any candidate segment from the nearest archetype. This vector serves as a unique ID for an incoming pattern, and these IDs can then be used to find repetitions of particular motives when they arise.

Another adaptation of the McGill algorithm was to separate intervallic and temporal distances in the match. Part of the point of the dynamic timewarp is to show the warp of a pattern with respect to the time recorded in the template. Our application, however, is directed toward finding repetitions of melodic (or rhythmic) sequences whether or not the other parameter is varied. If the distance metrics for the two are added, there is no way to identify whether intervallic or temporal differences are the source of the observed variation. Using separate timewarps for pitch and rhythm allows similar harmonic material to be recognized even when presented in vastly varying rhythmic guises, and vice versa.
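
The dual-distance idea can be sketched as follows (a hypothetical Python fragment under our own simplifying assumptions; the Segment fields and absolute-difference cost functions are illustrative, not the McGill implementation). One timewarp aligns interval sequences, a second aligns duration sequences, and the two distances are reported side by side rather than summed:

    # Hypothetical sketch of the dual timewarp (not the McGill code): one
    # dynamic timewarp over intervals, a second over durations, so the pitch
    # and rhythm distances remain separately inspectable.
    from dataclasses import dataclass

    @dataclass
    class Segment:
        intervals: list   # successive semitone intervals
        durations: list   # inter-onset durations in milliseconds

    def dtw(a, b, cost):
        """Classic dynamic timewarp: minimal total alignment cost of a against b."""
        INF = float("inf")
        d = [[INF] * (len(b) + 1) for _ in range(len(a) + 1)]
        d[0][0] = 0.0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                d[i][j] = cost(a[i - 1], b[j - 1]) + min(
                    d[i - 1][j],       # skip an event in the candidate
                    d[i][j - 1],       # skip an event in the template
                    d[i - 1][j - 1])   # align the two events
        return d[len(a)][len(b)]

    def match(candidate, template):
        """Report pitch and rhythm distances separately, never summed."""
        diff = lambda x, y: abs(x - y)
        return (dtw(candidate.intervals, template.intervals, diff),
                dtw(candidate.durations, template.durations, diff))

A motive repeated in a new rhythmic guise would then show a low pitch distance alongside a high rhythm distance, immediately localizing the source of the variation.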
3 Segmentation

A critical issue for real-time pattern processing is the successful segmentation of the input. An exhaustive search of incoming notes against all locations of all template patterns would quickly overwhelm the available processing (note, however, that this solution is used in David Cope's non-real-time EMI system [Cope 1990]). Our segmentation scheme is derived from the one reported in [Stammen & Pennycook 1993], which is in turn related to the rules presented in [Lerdahl & Jackendoff 1983]. Real-time rules look for note durations or rests that are noticeably long relative to the surrounding events. Other rules need one or two events beyond the end of a segment to assert a boundary.

In all cases the segmenter looks for salient discontinuities of interval, duration, and rate of attack. When accumulated discontinuities cross a threshold, a segment boundary is declared, as in the sketch below. Note that rules evaluating one or two events beyond the boundary (with, therefore, more information than the real-time rules) may override the boundaries asserted by the long-note/long-rest rules alone. If such overrides occur, the spans of the long-note/long-rest clocks are adjusted to match the observed context.
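
The accumulation scheme might look like the following (an illustrative fragment; the feature weights and threshold are assumptions, and the deployed rule set is richer, revising boundaries one or two events after the fact as noted above):

    # Illustrative sketch of threshold-based segmentation; the weights and
    # threshold are assumptions, not the system's tuned values.
    from dataclasses import dataclass

    @dataclass
    class Event:
        pitch: int      # MIDI note number
        onset: int      # attack time in milliseconds
        duration: int   # duration in milliseconds

    THRESHOLD = 1.0     # assumed boundary threshold

    def discontinuity(prev, curr):
        """Accumulate how sharply an event departs from its predecessor."""
        score = abs(curr.pitch - prev.pitch) / 12.0                           # interval leap
        score += abs(curr.duration - prev.duration) / max(prev.duration, 1)   # duration change
        rest = curr.onset - (prev.onset + prev.duration)                      # silence before the attack
        score += max(rest, 0) / max(prev.duration, 1)                         # long rest
        return score

    def is_boundary(prev, curr):
        return discontinuity(prev, curr) >= THRESHOLD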

A current research goal is to supplement this basic scheme with the segmentation rules used in Cypher [Rowe 1993]. Cypher's segmentation similarly relies on noticing discontinuities, but broadens the scope of features observed to include density, register, and harmonic considerations. It is our hope that analyzing a broader context of features will allow more reliable segmentation in real time, rather than after a delay of two or three events, thereby enhancing the responsiveness of the system.

4 Pattern Processor

The Pattern Processor algorithm proceeds as follows: (1) incoming MIDI data is passed through the segmentation rules and segment boundaries are flagged; (2) ongoing segments are compared with stored pattern templates, and the differences between the candidate and the templates are noted; (3) matches in progress with a low distance measure between the candidate and any known template are reported; (4) completed segments are tagged with an identifier made up of the difference vector between the candidate and the nearest template; and (5) if this difference vector is close to an existing vector (a previously heard pattern), a new template pattern is formed.

The Pattern Processor is able to discover previously unknown melodic patterns presented repeatedly in a stream of MIDI data and to flag ongoing new occurrences of these patterns as the input continues. Such capabilities can be used in composition and performance to give special treatment to salient patterns that have been remembered from previous passes with the system or that are introduced onstage during the course of improvisation.

5 Conclusion and Future Research

Real-time pattern processing can add useful musical functionality to interactive music systems. Beyond the analysis and classification of style and gesture currently performed, pattern processing can identify important melodic and rhythmic sequences during performance and flag new instances of these in progress. A combination of real-time segmentation and the induction and matching processes described above has been implemented to provide the desired behavior.

At present the system tracks only melodic sequences; several instances of the algorithm should be executed to follow rhythmic and harmonic patterns as well. With information about all three dimensions, a coordination process could note whether patterns are being reiterated with their original rhythmic and harmonic behaviors and, if not, in which dimension the source of variation lies.

Acknowledgments

The research reported in this paper was made possible by a grant from the New York University Research Challenge Fund. Thanks to Dale Stammen and Bruce Pennycook of McGill University for sharing ideas and code.

References

Bloch, J. J., and Dannenberg, R. B. 1985. "Real-Time Computer Accompaniment of Keyboard Performances." In Proceedings of the 1985 International Computer Music Conference. San Francisco: Computer Music Association.

Cope, D. 1990. "Pattern Matching as an Engine for the Computer Simulation of Musical Style." In Proceedings of the 1990 International Computer Music Conference, ed. Stephen Arnold and Graham Hair. Glasgow: ICMC Glasgow.

Lerdahl, F., and Jackendoff, R. 1983. A Generative Theory of Tonal Music. Cambridge, MA: The MIT Press.

Rowe, R. 1993. Interactive Music Systems: Machine Listening and Composing. Cambridge, MA: The MIT Press.

Stammen, D., and Pennycook, B. 1993. "Real-time Recognition of Melodic Fragments Using the Dynamic Timewarp Algorithm." In Proceedings of the 1993 International Computer Music Conference. Tokyo: ICMC Tokyo.