Page  00000001 AUTOMATIC GENERATION OF METRICAL STRUCTURE BASED ON GTTM Masatoshi Hamanaka PRESTO, Japan Science and Technology Agency m.hamanaka@aist.go.jp Keiji Hirata NTT Communication Science Laboratories hirata@brl.ntt.co.jp Satoshi Tojo Japan Advanced Institute of Science and Technology tojo@jaist.ac.jp ABSTRACT This paper describes an automatic system of music analysis based on the generative theory of tonal music (GTTM). The system works by acquiring the metrical structure of the music. The GTTM is considered one of the most promising theories of music in regard to computer implementation; however, no order is given for applying its rules and as a result, they frequently conflict with each other. To solve this problem, we introduced adjustable parameters, enabling us to assign priority to the rules. Our GTTM-based system makes it possible to construct hierarchical metrical structures in a top-down process using bottom-up detection of local metrical strength. Experimental results showed that after these parameters were tuned, our method outperformed the baseline performance. 1. INTRODUCTION We are developing a system of music analysis based on the generative theory of tonal music (GTTM) [1]. The GTTM is composed of four modules, each of which assigns a separate structural description to a listener's understanding of a piece of music. These four modules output a grouping structure, metrical structure, timespan tree, and prolongational tree, respectively. Our goal is to automatically derive a time-span tree that assigns a hierarchy of 'structural importance' to the notes of a piece of music (Figure 1). Automatic generation of a time-span tree from the music surface enables us to analyse the deeper structure [2]. It will also provide music summarization [3] and collaborative music creation [4] systems that will enable users to manipulate music by using a time-span tree, disregarding the surface structure of the music. To derive a time-span tree, we use the grouping and metrical structures of the music. In previous work [5], we developed a system for automatically generating a grouping structure. In this paper, we describe a method for automatically articulating the transition of notes into a metrical structure. Previous methods based on beat tracking [6-9] are only able to acquire the hierarchical metrical structure in a measure because they do not consider larger metrical structure such as two measures, four measures, and so on. On the other hand a mathematical model for inner metric analysis [10] can acquire metric weight of each note. However the method can not acquire the hierarchical metrical structure. We have previously attempted to implement several rules of the GTTM, but when these rules conflicted we were unable to adequately resolve the priority of multiple rules [11, 12]. Although a computer model of GTTM [13] is capable of producing a time-span tree, we, human, need to choose applicable rules at each stage of processing. Our system of metrical analysis based on the GTTM makes it possible to construct a hierarchical metrical structure automatically by iterating calculation of lowlevel beat strength and choosing the next level structure. -- Time-span tree,A,. r M4.....T-.. i... i /~..... ta,4.,,/,,,,,j,,,,HA,,,I-,,,,,l4K -4: ~:: ~: ~: ~::. -- Metrical structure S) - Grouping structure Figure 1. Time-span tree, metrical structure, and grouping structure. 2. METRICAL STRUCTURE ANALYZER BASED ON GTTM The metrical structure describes the rhythmical hierarchy of a piece of music by identifying the position of strong beats at the levels of a quarter note, half note, a measure, two measures, four measures, and so on. Strong beats are illustrated as several levels of dots below the music staff (Figure 1). Figure 2 shows the processing flow of the system. As the primary input formats we chose MusicXML [14] and designed GroupingXML [5], A hierarchical metrical structure MusicXML. GroupingXML Calculation of low- Current structure -***************** level beat strength ilow-level (strength of beat) Applying MPR1,2,3,4,5 [time] 4 Choosing next level structure m1 =* * * * * * * * Choice of next M 2 * * * * *0 0 S level structures = 3 0 * * * * Choosing with applying MPR10 Yes No MetricalXML Figure 2. Processing flow of metrical structure analyzer.

Page  00000002 was constructed in the top-down way, while the lowerlevel beat strength are detected in the bottom-up way. We then designed MetricalXML as the export format for our system; note elements in MetricalXML are connected to note elements in MusicXML and GroupingXML by Xlink [15] and Xpointer [16] (Figure 3). In our experiments, we restricted the music structure to monophony so we could more accurately assess the performance of each rule. ML Xposter XML inter MusicXML Grouping-XML, n MetricalXML -<pa't id= P1" >pn~ fomedsenumess"~ an "rfeece uls Wl-fredni-ess.2 rules ) arenecessary cd +<tidio P1ns for the assignentofasructure structurestisfes the well-formeidn"P1 2 u, 4'e >ref rules t <tth sidtoP1f2> o<su nothe. 0 rules: MPR (p l <e t id"P1) M2 6 +/esr > nt d=P--~ FMigure 3(Mieve M GPR (sroen pMPRXN6Lan MerclX4 2.1.s) APplicti ofe e), MPns ortmedss" andrc d " r rules0 Wl b normednessirul slur, () ariuain (e rpete d ~pithe, nd() aron. aere neessr diconitosfo the assignmentn of aP structure5, and rstri c toite onbthetruea. Wpend o ea rulemoe a o structure satisfiested w el l-foRmedDnes rule, thepreference rules n dcate thc5, p rioritf ne) suctue over anothri. order for iappiongo MPR rls miuiisi h analysi maye r esut Tof solve ths problm, w~e. usel 5c, d, 5e, aind 10) that enableuse toh conrol the strengh oeahrule. iur indicates the sproi f n tuurelatonshi beotween The metrical preference rules (MPRs) consist of 10 rules: MPR1 (parallelism), MPR2 (strong beat early), MPR3 (event), MPR4 (stress), MPRS (length), MPR6 (bass), MPR7 (cadence), MPR8 (suspension), MPR9 (time-span interaction), and MPR1O (binary regularity). MPRS has six cases: (a) pitch-event, (b) dynamics, (c) slur, (d) articulation, (e) repeated pitches, and (f) harmony. Here we discuss the application of MPR1, 2, 3, 4, 5, and 10. The strength of the beat dependent on each rule can be expressed as DiMPR~i (O~IDMPRi~ 1 ) wherej (1, 2, 3, 4, Sa, Sb, Sc, Sd, and Se). Because there is no strict order for applying MPR rules, ambiguities in the analysis may result. To solve this problem, we use adjustable parameters SMZPR~i wherej (1, 2, 3, 4, Sa, Sb, Sc, Sd, Se, and 10) that enable us to control the strength of each rule. Figure 4 indicates the relationship between the parameters and MPRs. Our metrical structure analyzer has sixteen adjustable parameters, which include SMPRIj Wr, and 7fPR i (Table 1). GroupingXML str TM P1 (10 A istat w~ ~JMPR' SM~j i~ i~D~low-levell(" m~ Parameters Description 8MPR j Strength of each rule. j=(1, 2, 3, 4, 5a, 5b, 5c, 5d, 5e, and 10) W, Weight of priority of the same rhythm compared with the same register in parallel segments. TVIPRj Value of the threshold that decides whether or not each rule is applicable.] =(1, 4, 5a, 5b, and 5c) Table 1. Sixteen adjustable parameters. 2.1.1. Calculation of basic parameters As calculated from MusicXML, the five basic parameters of a note from beat i are: velo, (velocity), valu, (length of note), vol1 (duration of dynamic'), slur1 (length of slur), and numi (pitch). /1velo, /valu, /1vol, ftslur, and b#num are the average of the basic parameters. As calculated from GroupingXML, the two basic parameters istart and iend are the beginning and ending of a group that is the smallest group containing i and more than one beat in the current structure. 2.1.2. Application of MPR] MPR1, which is the rule for parallelism in a metrical structure, has two adjustable parameters: Wr (weight of priority of the same rhythm compared with the same register in parallel segments), and TMPR1 (threshold that decides whether beat i and beat k are parallel, i.e., (Di kMPR1-1) Or not (Di kMPR1O)). DIMPR1 i k W+zi k x_ W > TMPR1 1 where endend i velog > 0 ken i velok' > 0 xik+ istart 0 veloi = 0 __start 0 velok' = 0 *end k F ik ie >0 and veloki(+i >0 Yik' C st art [0 else ~endr ik ed i veloi > 0 and numi1_l = numy, and num /ki'-i-1 = nu mkHi -i i= start 0o else 2.1.3. Application of MPR2 and 3 MPR2 has a weak preference for a metrical structure in which the strongest beat in a group appears relatively early in the group. We formalized DMPR2 so that the closer it is to the beginning of the group, the higher the values. MPR3 prefers a metrical structure in which the inceptions of pitch-events are strong beats (FigureS). DMPR2 _( end _ end _jstart) (2) MPR3 veloi > 0 (3) - Grouping structure qo 6 )( in i )( - Metrical structure DiMPR2(Current structure) DiMPR3 Figure 5. Application of MPR2 and 3. 1The GTTM does not define the meaning of 'duration of dynamic'. Therefore, we define it as the length from one beat to the next beat or rest, depending on the example of applying MPR5b in the GTTM [1]. Figure 4. Relationship between parameters and MPRs.

Page  00000003 2.1.4. Application ofMPR4, 5a, 5b, and 5c MPR4, 5a, 5b, and 5c prefer a metrical structure in which a relatively strong beat occurs at the inception of either a stressed note, a long note, a long duration of a dynamic, or a long slur. We introduced the adjustable parameters 7R4PRj (0 < 74PRj <1), where j (4, 5a, 5b, and 5c) to control the value of the threshold that decides whether each rule is applicable, i.e., (DiMPRi=I) or not (DiMPRi-O) (Figure 6). structure (Figure 7). When the current structure contains more than one beat, the next level structure Mz is calculated as follows: Dlow-levelmetrical (i-m)mod2 =0 M= argmax Yi D levelmetrical x SMPRlO (ir- m)mod3 = (11) m(=1,2,3,4,5) i 0else Current structure **************************************** Dilow-level -- [i] Choice of next level situcture:! i */ i * ~ ~ ~ i,,i?i. ~ ~ ~ ~ ~ l ~ D MPR4 DMPR5a DMPRb DMPR5c 0 veloi > 2 x /Pveo x TMPR4 else valui > 2 x /Pval, x TMPR5a else voli > 2 xpvo x TMPR5b else sluri > 2 x pslur x TMPR5c else (4) (5) (6) * *. * *......** ** * M i~ i.i.i.rA '= I Li_ _ _ " m=3.. i U][i [i] (7) L ' ": r-3F a'iw i 1 1; 1 1 " i ' 1--- 1 i i i i i i i i i ANi, ,, $i, ',., SýZTh Current structure velo 2 elo I ii i 1111............. I valu 2 M ju vol 2 M 0o slur 2 /-slur........... -[i] iii i i i I iii i i i i_ ii --------- i]-.......... - - - - - -............... ---- ---- ---- ---- --- ---- --- ---- k=4_[i] =4.... i0 i 0 i0 _ _L___s- - [i] Figure 7. Selecting the next level structure. 3. GTTM EDITOR We developed a GTTM editor to construct grouping and metrical structures, and time-span trees. The GTTM editor has two modes as follows. 3.1. Automatic-analysis mode The automatic-analysis mode displays the results of our grouping and metrical structure analyzer (Figure 8). The structure changes depending on the parameters configured. Figure 6. Application of MPR4, 5a, 5b and 5c. 2.1.5. Application ofMPR5d and 5e MPR5d is the rule for repetition of an articulation pattern. We apply MPR5d to those positions where there are inceptions of MPR5a repetition. MPR5e is the rule for pitch-repetition. MPR5d = DiMPR5a =1 and DLPR5a =1 (8) 0 else MPRSe I D: 0 numi = numi+l else 2.2. Calculation of low-level beat strength Low-level beat strength is calculated by weighted summation of DiMPRj, where j (1, 2, 3, 4, 5a, 5b, 5c, 5d, and 5e). Dlow-level metrical = B + Bk x SMPR where Bi = _ DiMPR j x SMPR j j=(2,3,4,5a,5b,5c,5d,5e) MPR1 Dik = MPR1 ik 0 Figure 8. GTTM editor (automatic-analysis mode). 3.2. Manual-edit mode The manual-edit mode assists in manual editing of the grouping structure, metrical structure, and time-span tree (Figure 9). It can be used to edit the results of our grouping and metrical structure analyzer in automatic-analysis mode....... NNow editing this time-span tree ~ Time-span tree : notes SGrouping structure i i Metrical structure (10) 2.3. Acquisition of hierarchical metrical structure A hierarchical metrical structure is constructed by iterating calculation of the low-level strength of the beat for the current structure and choosing the next level Figure 9. GTTM editor (manual-edit mode).

Page  00000004 4. EXPERIMENTAL RESULTS 6.REFERENCES We evaluated the performance of our metrical structure analyzer using an F-measure, which is given by the weighted harmonic mean of Precision P (the proportion of selected metrical dots that are correct) and Recall R (the proportion of correct metrical dots that were identified). We did not take care that low-level error is propagated up to higher-level; we counted wrong answers in each metrical level. P + R Fmeasure = 2 x x (12) 2. Pei he.60 This evaluation required us to prepare accurate data for the metrical structure. We collected one hundred pieces of 8-bar length, monophonic, classical music (western tonal music that GTTM assumes to analyze), and asked people with expertise in musicology to analyze them manually with faithful regard to the MPRs using the GTTM editor. These manually produced results were cross-checked by three other experts. To evaluate the baseline performance of our system, we used the following default parameters: S4PR= j0.5, Wr=0.5, and T/IPRj_0.5. The average F-measure for the baseline performance was 84%. We took about 10 minutes in average for finding a plausible tuning of the parameter set by hand. As a result of configuring the parameters, the F-measure of our metrical structure analyzer reached 90%, outperforming the baseline by 6% (Table 2). Melody Baseline Our method performance with configured parameters 1. Spinnerlied 0.88 1.00 2. Petit Chien 0.86 0.92 3. Solvajg's Song 0.86 0.86 4. L'Arlesienne 0.92 0.98 5. Tarantella 0.63 0.69 6. The Moldau 0.73 1.00 7. Tristesse 0.88 1.00 8. Waves of the Danube 0.72 0.97 9. Barcarolle 0.72 0.79 10. Gymnopedie 0.66 0.82 Total (100 melodies) 0.84 0.90 Table 2. F-measure for our method 5. CONCLUSION We developed a system for analyzing the metrical structure of music based on the GTTM. The system makes it possible to construct a hierarchical metrical structure. Our experimental result showed that the Fmeasure reached 900%, that outperforms the baseline measure by 60%, with optimally configured sets of the parameters. At present, time-span trees are generated manually using only MPRs together with groupingstructure information provided by GroupingXML. We are now planning to improve the precision of the trees, implementing further details of the time-span tree generation rule. [1] Lerdahl, F., and R. Jackendoff. A Generative Theory of Tonal Music. MIT Press, Cambridge, Massachusetts, 1983. [2] Hirata, K., and T. Aoyagi. "Computational Music Representation on the Generative Theory of Tonal Music and the Deductive Object-Oriented Database", Computer Music Journal 2 7(3), 73-89. 2003. [3] Hirata, K., and S. Matsuda. "Interactive Music Summarization based on Generative Theory of Tonal Music", Journal of New Music Research, 32:2, 165-177. 2003. [4] Hirata, K., S. Matsuda, K. Kaji, and K. Nagao. "Annotated Music for Retrieval, Reproduction, and Sharing", Proceedings of the International Computer Music Conference, Miami, USA, 2004. [5] Hamanaka, M., K. Hirata, and S. Tojo. "Automatic Generation of Grouping Structure based on the GTTM", Proceedings of the International Computer Music Conference, Miami, USA, 2004. [6] Rosenthal, D. "Emulation of human rhythm perception", Computer Music Journal 16(1), 64-76. 1992. [7] Goto, M. "An Audio-based Real-time Beat Tracking System for Music With or Without Drum-sounds", Journal of New Music Research, 30.2, 159-171. 2001. [8] Dixon, S. 'Automatic Extraction of Tempo and Beat from Expressive Performance", Journal of New Music Research, 30.1, 39-58. 2001. [9] Meudic, B. "A Causal Algorithm for Beattracking", Proceedings of the 2nd Conference Understanding and Creating Music, Caserta, Italy, 2002. [10] Volk, A. "The Empirical Evaluation of a Mathematical Model for Inner Metric Analysis", Proceedings of the 5th Triennial ESCOM Conference, Hanover, Germany, 2003. [11] Ida, K., K. Hirata, and T. Satoshi. "The Attempt of the Automatic Analysis of the Grouping Structure and Metrical Structure based on GTTM", SIG Technical Report 2001(42):49-54, Japan, 2001, (in Japanese). [12] Touyou, T., K. Hirata, and T. Satoshi. "Improvement of Grouping Rule Application in Implementing GTTM", SIG Technical Report 2002(47):121-126, Japan, 2002, (in Japanese). [13] Nord, T. Toward Theoretical Verification. Developing A Computer Model of Lerdahl and Jackendoffs Generative Theory of Tonal Music, Dissertation, The University of Wisconsin, Madison, 1992. [14] Recordare LLC. "MusicXML 1.0 Tutorial" http://www.recordare.comlxml/musicxmltutorial.pdf., 2004. [15] W3C. "XMIL Linking Language (A7Link) Version 1.0. " http://www.w3.org/TR/xlink/, 2001. [16] W3C. "XML Pointer Language (XPointer)." http://www.w3.org/TR/xptr/, 2002.