Page  00000113 INTERACTIVE MUSIC SUMMARIZATION BASED ON INTERVAL DISTANCE PROFILE Yoo Min-Joon Yonsei University Department of Computer Science Lee In-Kwon Yonsei University Department of Computer Science ABSTRACT I ~ ~:I I. This paper presents a simple algorithm that efficiently discovers and extracts repeating patterns in music. The algorithm can detect not only exact repeating patterns, but also partial and approximate patterns. The method is based on the time distance of two interval pairs that are similar and uses the accumulation of the distance for evidence of repeating patterns. Users can recognize repeating patterns as a volume of the accumulation of distance and choose them to summarize music. Our algorithm also can be used to analyze the structure of any arbitrary music because of its flexibility. 1. INTRODUCTION Music summarization is a useful task in many applications. For example, we can identify a piece of music easily through hearing only some themes or specific hooks without hearing the whole piece. One of the aims of music summarization is to effectively find out the identities of music compositions, which are usually presented as repeated patterns. Music summarization also can be used as a means of music data compression because it can detect the repetition of some parts of music data. More importantly, it can help music analysts and composers to analyze the structure of music. In this paper, we present a method for detecting repeating patterns in music and summarizing music. Recently, there are various attempts to segment and summarize music, especially, audio data [1][2][3]. However, there are not much literature of symbolic data [4][5] and we deal with summarization of symbolic data in this paper. We consider a music clip not as a set of notes but as a set of intervals between adjacent notes. Two music clips can be classified as the same repeating patterns when they have similar sets of intervals (see Figure 1). As shown in the figure, the time distances (e.g. the time difference between two note-on messages in MIDI) between corresponding intervals in the repeating patterns tend to be similar. In this paper, we focus on this invariance under timing translation of intervals in similar patterns. We can detect repeating patterns by inspecting the accumulated distances of interval pairs. A significant amount of the accumulation shows that there is high probability that similar F,, ~ V Figure 1. A connected pair indicates two corresponding intervals in a similar pattern. The distances between interval pairs turn to evidence of similarity. patterns exist. We call the accumulation of the distance data interval distance profile. Our method has several advantages. First, it is very flexible. For example, if two patterns are not exactly the same but very similar except for a few notes, the method can treat the two patterns as similar patterns because the time distances made by other intervals, except a few different intervals, are similar enough to provide evidence of similarity. Hence, one can easily find out not only the exact repeating patterns, but also partial and approximate repeating patterns. Second, the method can be used as an interactive music summarization tool [4] because several similar patterns found in the music composition can be seen graphically and users can listen to important parts of the music by selecting the patterns. It can be also used as a music analysis tool because the repeating patterns found using the method may represent meaningful composition elements, such as themes, hooks and specific interval progressions. Our method is similar to some extent to the method of finding symmetries for 3D geometry [6] presented in the computer graphic community. 2. ALGORITHM First, we calculate the intervals I {}, i 1,, N - 1, where N is the number of all notes, between one note and its next note. Then, we make interval pairs: A = {(Ip,Iq), where |Ip - Iq < 0}. The threshold 0 can be used in order to set the degree of a robustness or a exactness. In our experiments, we usually 113

Page  00000114 a b c d. 1- 2 2- 2 2- -1-- - - 2- - - 2 - 2- - - 1- -- 5 -/ )4. -272 -1__27 2 _T_4 ----7 - -7- -2 -2--- Figure 2. A part of a standard jazz song 'Fly Me to the Moon' by Bart Howard. Intervals between two adjacent notes are shown as numbers. set Minor 2md < 0 < Minor 3md because many suitable results are generated when 0 is set in this range. We then calculate the time distances between two intervals in each pair Ai (e A) and plot the distance data in the transformation space. Then, we can detect repeating patterns by a volume of the accumulation of distance data in the transformation space. Figure 2 shows an example. The score shown in Figure 2 is a part of a standard jazz song 'Fly Me to the Moon' by Bart Howard. Each interval between a note and its next note is calculated and shown between them. The set of similar interval pairs in the example is as follows, where two intervals in each pair is similar enough, that is, the absolute interval difference is less than Major 2 d: A = {f(a, b), (a, c), - - -, (a, d),..., (a, e), - - - (b, c),...} We then calculate the time distance between two intervals in each pair in A. Let IP and Iq be two intervals. The time distance D(I, Iq) between Ip and Iq is defined as: distances count distances count 0-2 19 14-16 14 2-4 16 16- 18 22 4-6 8 18-20 7 6-8 19 20-22 4 8-10 26 22-24 7 10-12 11 24-26 7 12-14 3 26-28 2 Table 1. Interval distance profile for the example in Figure 2: the accumulated data of distances between interval pairs in A. We assume that the distance between crotchets is 1. count 20 16 16 distance D(Ip, I,) = |T(Ip) - T(I,)|, I (1) where T(I,) is a temporal position of interval I, which is the time of the first note of the interval. We calculate the time distances of all interval pairs in A and show the data in the interval distance profile in Table 1. We now plot the distances versus their counts in a transformation space, shown in Figure 3, and we use data convoluted by gaussian kernel, exp(-(x - xi)2/(2-2 in order to express the graph more smoothly. The parameter o is used to adjust smoothness of result graphs and we usually set o empirically (in this Figure o 1). In Figure 3, a high peak means there is high probability that one pattern is away from another similar pattern with the distances indicated by the peaks. Thus, we can find repeating patterns by using the distance data. When an interval pair is mapped to the transformation space, its original temporal data is missing. Hence, pairs from unrelated parts of music can be mapped to the same point in transformation space. For example, both interval pairs (a, d) and (d, e) in Figure 2 have the same distance 8, but their temporal positions are not related. Thus, we need a process to extract temporally coherent components. We solve this problem by connecting interval pairs which are related temporally by using the following method. Let I,1 be a set of interval pairs belonging to the peak selected by the user and c is a user-defined threshold. Figure 3. A plot of the distances versus their counts in Table 1 in transformation space. 1. Sort intervals in 18,1 in ascending order of temporal positions. 2. Let 18e6j= {I1, 12,.-.1, i1 be sorted intervals. 3. For each interval 1i in 1,1 (a) Connect 1i and 1l+1 if D(1i12i+~) < c. The results show that each interval in 18,1 is connected to another if their temporal positions are close enough. Figure 4 shows some of the results. Connected intervals are indicated by gray area in the figure. The graph in Figure 3 has another high peak near distance 1. However, this peak has little meaning for finding repeating patterns, since the distance between interval pairs in the peak is very short and the height of this peak only indicates that the note sequences in music consist of many consecutively similar intervals as the melody in our example. 3. EXPERIMENTAL RESULTS 3.1. Finding Themes Figure 5 shows the transformation space, presenting the interval distance profile of an entire song 'Fly Me to the Moon.' Note that the peaks (a) and (b) in this figure correspond to the peaks of distance 8 and 16, respectively, in Figure 3. 114

Page  00000115 (a) (b) Figure 4. Repeating patterns found in Figure 2: (a) Repeating patterns found by investing interval pairs of distance 8, (b) Repeating patterns found by investing interval pairs of distance 16. (c) (a) Figure 5. A graph of transformation space constructed by the interval distances profile of 'Fly Me to the Moon' by Bart Howard. The high peaks (a), (b) and (c) indicate that there are three major repeating patterns in this song The highest peak (c) located in Figure 5 indicates that one theme may be away from another theme as the distance between interval pairs in the peak. In this example, we could find the main theme of this song (see Figure 6) by investigating the intervals included in this peak. In most experiments, the peaks, which mean themes, appeared as a high peak that included interval pairs of a quite long distance. High peaks at low distances ((a) and (b) in Figure 5) sometimes represent hooks because the note sequence made by these interval pairs usually have short length and the frequency of the repetition is relatively high. The method was tested through an example of more complex music with many more notes (i.e., intervals). Figure 7 shows the interval distance profiles of two waltzes of Chopin. There are peaks which prove the existence of repeating patterns in the middle of both graphs. We could find the theme of each music composition by investigating the interval pairs in the peaks. These examples also include several polyphonic notes. We implemented a higher pitch note precedes a lower one if the times of these notes are same in order to consider polyphonic music. 3.2. Adding Variation The flexibility of the method was verified with the addition of various ornaments and changes of several note durations in the theme of 'Fly Me to the Moon' to give some variation to the music. Figure 8 shows the comparison of a graph of the interval distance profile of original music with that of the modified music. Though several notes are changed and added in the music, the appearance of the interval distance profile of the modified music is very similar to that of the original one. Additional experiments also showed that the appearance of the interval distance profile, especially the location of peaks that represent the themes, was nearly constant unless the notes in the theme were changed excessively. 4. CONCLUSION AND FUTURE WORK In this paper, we used intervals between notes as the means of a similarity calculation. However, notes themselves could be used instead of intervals when calculating the similarity of patterns. In our experiments, the note-based method sometimes generated a more exact result showing the location of a repeating pattern than the interval-based method. However, transposed themes or repeating patterns cannot be found using this method. Since themes can appeare in as transposed forms in many music compositions (usually in classical music), we chose to use intervals when calculating the similarity of patterns in order to provide a more general framework in this paper. However, if one is convinced that there is no transposed theme in a music composition in which one wants to find repeating patterns, then a notes-based distance profile can be a possible alternative., I Figure 6. The theme of 'Fly Me to the Moon' found by investigating the interval pairs of peak (c) in Figure 5. 115

Page  00000116 (a) (b) Figure 7. Graphs of interval distance profiles of two waltzes of F. Chopin: (a) Waltz D flat major op. 64-1, (b) Waltz F minor op. 70-2. The high peak in the middle of each graph shows that there are repeating patterns of themes. The pruning of meaningless interval pairs also helps users get the location information of repeating patterns in music more exactly. For example, the interval pairs of very short distances can be discarded when constructing a graph of interval distance profile in order to present the information more clearly. We think that more clear peaks can be generated by pruning in Figure 7. Once an interval distance profile of a music composition is generated, one can easily find several repeating patterns in the music by just selecting the high peaks in the graph of the interval distance profile. Generation of interval distance profile takes up much of the processing time, which is a preprocessing step for a given song. When a user selects a peak, the response time is very fast. Thus, our method can be used as an efficient, interactive music summarization tool. We think our work lacks a means to test our results statistically. we are exploring suitable statistical significance testing to develop our work more meaningfully. For future work, we are also exploring ways to apply our method to audio signals in order to summarize audio files. Since our method is quite simple and fast, we hope that a more efficient and faster audio summarization system can soon be implemented. Acknowledgements This research is the result of the promotion project for culture contents technology research center supported by Korea Culture & Content Agency (KOCCA). 5. REFERENCES [1] Chai.W. "Semantic Segmentation and Summarization of Music", IEEE Signal Processing Magazine, 2006 [2] Levy.M., Sandler.M. and Casey.M. "Extraction of high-level musical structure from audio data and its application to thumbnail generation", Proceedings of ICASSP, 2006 [3] Paulus.J. and Klapuri.A. "Music structure analysis by finding repeated parts", Proceedings of lst ACM workshop on audio and music computing multimedia, 2006 [4] Hirata, K. and Matsuda, S. "Interactive Music Summarization based on GTTM", Proceedings oflSMIR, 2002 [5] Huron.D. "Perceptual and Cognitive Applications in Music Information Retrieval", Proceedings oflSMIR, 2000 [6] Mitra, N.J., Guibas, L.J. and Pauly, M. "Partial and Approximate Symmetry Detection for 3D Geometry", Proceedings of the SIGGRAPH 2006 (b)... ':'. " ": A '. ' -.: 1. visa Figure 8. (a) Graph of interval distance profiles of the original 'Fly Me to the Moon', (b) Graph of modified version of the same song. The differences of two profiles are quite small. The location of high peaks is seldom changed despite several added ornaments and changes of note duration. 116