The Local Boundary Detection Model (LBDM) and its Application in the Study of Expressive Timing

Emilios Cambouropoulos
Austrian Research Institute for Artificial Intelligence, Vienna, Austria
email: emilios@ai.univie.ac.at

Abstract

In this paper two main topics are addressed. Firstly, the Local Boundary Detection Model (LBDM) is described; this computational model enables the detection of local boundaries in a melodic surface and can be used for musical segmentation. The proposed model is tested against the punctuation rule system developed by Friberg et al. (1998) at KTH, Stockholm. Secondly, the expressive timing deviations found in a number of expert piano performances are examined in relation to the local boundaries discovered by LBDM. As a result of a set of preliminary experiments, it is suggested that the assumption of final-note lengthening of a melodic phrase is not always valid and that, in some cases, the end of a melodic group is marked by lengthening the second-to-last note or, seen from a different viewpoint, by delaying the last note.

1 Introduction

Expressive performance of a musical work relies to a large extent on the underlying musical structure. From traditional music performance theories to contemporary computational models of musical expression, a strong link between musical structure and expression is assumed. For instance, in a series of experiments, Widmer (1996) has shown that learning rules of melodic expression by computational means is significantly improved when structural aspects of the music, such as motivic and phrase structure, are taken into account. In this paper, a computational model for finding local boundaries in a melodic surface is first proposed. This model is then used to study expressive timing deviations at the discovered boundaries.
2 Finding Local Boundaries

2.1 The Local Boundary Detection Model

The Local Boundary Detection Model (LBDM) calculates boundary strength values for each interval of a melodic surface according to the strength of local discontinuities; peaks in the resulting sequence of boundary strengths are taken to be potential local boundaries.

Other models that determine melodic boundaries (Tenney and Polansky 1980; Lerdahl and Jackendoff 1983) are based on formalisations of the Gestalt principles of similarity and proximity that amount to finding larger pitch/time/dynamic intervals in between smaller ones. Such models, however, detect no boundary in, for instance, a rhythmic sequence in which a run of equal durations is followed directly by a run of different equal durations (e.g. several long notes followed by several short notes), even though a listener easily hears a possible point of segmentation. It has been suggested that a more general approach should account for any change in interval magnitudes. An early version of the LBDM (Cambouropoulos 1996, 1997) has shown the potential of this approach (empirical support for the model is presented in Battel and Fimbianti 1998); it has been demonstrated that both Tenney & Polansky's and Lerdahl & Jackendoff's models are special cases of the proposed model.

The refined version of LBDM presented herein is more advanced than the earlier version in two respects: it can be applied to any performed melodic sequence (not only to quantised score-like representations), and it takes into account the degree of change between time/pitch intervals (this way boundaries can be determined at various hierarchic levels).

The refined version of the LBDM is based on two rules: the Change rule and the Proximity rule. The Change rule is more elementary than any of the Gestalt principles, as it can be applied to a minimum of two entities (i.e. two entities can be judged to be different by a certain degree), whereas the Proximity rule requires at least three entities (i.e.
two entities are closer or more similar than two other entities).

Change Rule (CR): Boundary strengths proportional to the degree of change between two consecutive intervals are introduced on either of the two intervals (if both intervals are identical no boundary is suggested).

Proximity Rule (PR): If two consecutive intervals are different, the boundary introduced on the larger interval is proportionally stronger.

The Change Rule can be implemented by a degree-of-change function (see the suggestion in the description of the LBDM algorithm, bold text in box). The Proximity Rule can

be implemented simply by multiplying the degree-of-change value by the absolute value of each pitch/time/dynamic interval. This way, not only do relatively greater neighbouring intervals get proportionally higher values, but greater intervals also get higher values in absolute terms; i.e. if in two cases the degree of change is equal, such as sixteenth/eighth and quarter/half note durations, the boundary value on the (longer) half note will be overall greater than that on the corresponding eighth note. An example of the application of LBDM is illustrated in Figure 1.

In the description of the algorithm (bold text) only the pitch, IOI and rest parametric profiles of a melody are mentioned, as these have been used in the experiments reported in this paper. It is possible, however, to construct profiles for dynamic intervals (e.g. velocity differences), for harmonic intervals (distances between successive chords) and for any other parameter relevant to the description of melodies. Such distances can also be asymmetric; for instance, the dynamic interval p-f should be greater than the interval f-p.

The LBDM algorithm (refined version)

A melodic sequence is converted into a number of independent numeric interval profiles $P_k$ for parameters such as: pitch (pitch intervals), ioi (interonset intervals) and rest (rests, calculated as the interval between current offset and next onset). Upper thresholds for the maximum size of intervals should be set, e.g. the whole-note duration or 3 seconds for ioi and rest, and the octave for pitch; intervals that exceed the threshold are truncated to the maximum value.

A parametric profile $P_k$ is represented as a sequence of n intervals of size $x_i$:

\[ P_k = [x_1, x_2, \ldots, x_n] \]

where $k \in \{\text{pitch}, \text{ioi}, \text{rest}\}$, $x_i \geq 0$ and $i \in \{1, 2, \ldots, n\}$.

The degree of change r between two successive interval values $x_i$ and $x_{i+1}$ is given by:

\[
r_{i,i+1} =
\begin{cases}
\dfrac{|x_i - x_{i+1}|}{x_i + x_{i+1}} & \text{if } x_i + x_{i+1} \neq 0 \\[1ex]
0 & \text{if } x_i = x_{i+1} = 0
\end{cases}
\]

(N.B.
A small value may be added to the size of all the intervals, e.g. 1 semitone to pitch intervals, so as to avoid irregularities introduced by intervals of size 0.)

The strength of the boundary $s_i$ for interval $x_i$ is affected by the degree of change to both the preceding and the following interval, and is given by the function:

\[ s_i = x_i \times (r_{i-1,i} + r_{i,i+1}) \]

For each parameter k, the sequence $S_k = [s_1, s_2, \ldots, s_n]$ is calculated and normalised in the range [0, 1]. The overall local boundary strength profile for a given melody is a weighted average of the individual strength sequences $S_k$ (weights used in current experiments: $w_{pitch} = 0.25$, $w_{ioi} = 0.50$, $w_{rest} = 0.25$). Local peaks in this overall strength sequence indicate local boundaries.

2.2 Evaluation of LBDM

Friberg et al. (1998) developed a set of punctuation rules that mark low-level structural boundaries in a melody. This rule system was adjusted and tested against the punctuation data provided by an expert performer for a set of 52 melodies (the musician manually marked his preferred punctuation positions on the melodic scores). As it seemed plausible that the punctuation marks of a melody are closely related to local boundary positions, the LBDM algorithm was tested against the punctuation rule system on the same melodic data. The results obtained are given in Table 1.

Overall, the LBDM performance was comparable to the performance of the punctuation rule system. The results were worse for the default settings of the LBDM algorithm, but the performance of the algorithm improved when groups of one note were allowed, the cutoff threshold for the important peaks was adjusted and the weights for the different parametric profiles were altered (best values: Found = 74% and Extra = 49%). A possible advantage of LBDM over other systems is its simplicity and generality.
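The LBDM algorithm described in Section 2.1 can be sketched in Python roughly as follows. This is only an illustrative sketch, not the implementation used in the experiments: the function names (`degree_of_change`, `strength_profile`, `lbdm`, `local_peaks`) are hypothetical, and three details that the description leaves open are filled in by assumption, namely that a missing neighbour at either end of a profile contributes no change, that normalisation to [0, 1] means division by the maximum strength, and that a local peak is a strict interior maximum.

```python
from typing import Dict, List, Sequence


def degree_of_change(a: float, b: float) -> float:
    """r = |a - b| / (a + b) for non-negative intervals, or 0 when both are 0."""
    total = a + b
    return abs(a - b) / total if total != 0 else 0.0


def strength_profile(intervals: Sequence[float], max_size: float) -> List[float]:
    """Boundary strengths s_i = x_i * (r_{i-1,i} + r_{i,i+1}), scaled to [0, 1]."""
    # Use absolute interval sizes and truncate them to the upper threshold.
    x = [min(abs(v), max_size) for v in intervals]
    n = len(x)
    strengths = []
    for i in range(n):
        # Assumption: a missing neighbour at either end contributes no change.
        left = degree_of_change(x[i - 1], x[i]) if i > 0 else 0.0
        right = degree_of_change(x[i], x[i + 1]) if i < n - 1 else 0.0
        strengths.append(x[i] * (left + right))
    peak = max(strengths, default=0.0)
    # Assumption: normalisation is read as division by the maximum strength.
    return [s / peak for s in strengths] if peak > 0 else strengths


def lbdm(profiles: Dict[str, Sequence[float]],
         thresholds: Dict[str, float],
         weights: Dict[str, float]) -> List[float]:
    """Weighted average of the normalised strength sequences of all parameters."""
    weighted = [
        [weights[k] * s for s in strength_profile(profiles[k], thresholds[k])]
        for k in weights
    ]
    total_weight = sum(weights.values())
    return [sum(column) / total_weight for column in zip(*weighted)]


def local_peaks(strengths: Sequence[float]) -> List[int]:
    """Indices of strict interior local maxima (an assumed peak definition)."""
    return [i for i in range(1, len(strengths) - 1)
            if strengths[i] > strengths[i - 1] and strengths[i] > strengths[i + 1]]


# Toy melody: constant pitch steps, no rests, one long IOI among short ones.
# IOIs and rests are in quarter notes (whole note = 4); pitch is in semitones.
profiles = {"pitch": [2] * 6, "ioi": [1, 1, 1, 2, 1, 1], "rest": [0] * 6}
thresholds = {"pitch": 12, "ioi": 4, "rest": 4}
weights = {"pitch": 0.25, "ioi": 0.50, "rest": 0.25}
overall = lbdm(profiles, thresholds, weights)
print(local_peaks(overall))  # [3]
```

In this toy example the only peak falls on the fourth interval (index 3), i.e. on the long interonset interval, which is where the Change and Proximity rules together place the strongest boundary.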
           Punctuation Rule System              LBDM (default setting)
NPerf      NRule   Nsame   Found %   Extra %    NLbdm   Nsame   Found %   Extra %
498        501     345     69        31         567     312     63        45

Table 1 Comparison of the Punctuation Rule System and LBDM. NPerf, NRule and NLbdm are the numbers of punctuation boundaries indicated by the expert performer, the rule system and LBDM respectively; Nsame is the number of marked positions that are the same for the performer and each of the automated systems; Found is the percentage of the performer's marks found by the rule system and LBDM; Extra is the percentage of marks inserted by the rule system and LBDM that were not marked by the performer.

It is clear that the LBDM is not a complete model of grouping in itself (neither is the Punctuation Rule System); extensions of the current model (e.g. a harmonic component) and also complementary models for establishing musical groups via melodic similarity (see Cambouropoulos 1998) are necessary. The grouping suggested by the composer, by means of slurs, breath marks and so on, should also be taken into account (using slurs in LBDM has been avoided as this, in some sense, defeats the point of grouping analysis, since slurs indicate one possible grouping in advance). Integrated models for musical segmentation are still in their infancy.

3 Timing Deviations at Boundaries

It is commonly hypothesised that the ending of a musical group, such as a melodic phrase, is marked by a slowing down of tempo, i.e. a relative lengthening of the ending notes and especially of the last note (e.g. Todd 1985). In the Punctuation Rule System mentioned above, the last note of a melodic group is lengthened and a micropause is inserted. The boundaries between 'smaller melodic units, typically

consisting of 1-7 notes' are marked in the performance 'by means of a micropause combined with a small lengthening of the interonset duration' (Friberg et al. 1998, p. 272). An alternative hypothesis is that a salient note, such as the last note of a melodic group, may be accented by being delayed rather than lengthened; this amounts to a lengthening of the second-to-last note (see the brief overview in Parncutt 1994). In the following sections two experiments are described that examine the relation between local boundaries and the corresponding interonset intervals (IOIs).

3.1 Experiment on 7 Piano Sonatas by Mozart

In the first experiment the LBDM was applied to the soprano parts of 7 complete Mozart piano sonatas (K279-283, K332, K333) performed by a well-known Viennese pianist on a computer-monitored Boesendorfer SE290 concert piano. This melodic dataset consists of approximately 21000 notes. For each soprano part both the LBDM curve and the timing deviation curve (ratio of nominal over performed IOIs) were computed.

Initially, a boundary strength threshold was selected that split the LBDM values into strong (roughly 25% of all the notes) and weak values. The timing deviation values were divided into three categories: longer, shorter and same (within ±3% of ratio 1). Then, the percentages of the three types of IOI deviation in relation to the two boundary strengths were calculated (see Table 2, first six columns). The last three columns of Table 2 give the percentages of lengthened or shortened notes on boundaries (i.e. local peaks in the LBDM curve). In both cases the percentage of lengthened notes is markedly higher than that of shortened notes (48% vs. 28% on stronger LBDM positions and 47% vs. 29% on boundary peaks).

LBDM        weak                       strong                     boundary peaks
IOI dev.    shorter   same   longer    shorter   same   longer    shorter   same   longer
Notes %     39%       24%    37%       28%       24%    48%       29%       24%    47%

Table 2

The above small experiment shows that the IOIs of notes on boundaries tend to be lengthened.
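The categorisation behind Table 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for the experiment: the names `categorise` and `deviation_table` are hypothetical, the deviation ratios (nominal over performed IOI) are assumed to be already available per note, and a note is counted as "strong" when its LBDM value is at or above the chosen threshold.

```python
from collections import Counter
from typing import Dict, Sequence


def categorise(ratio: float, tolerance: float = 0.03) -> str:
    """Classify a deviation ratio (nominal IOI / performed IOI).

    A ratio below 1 means the performed IOI is longer than notated.
    """
    if ratio < 1 - tolerance:
        return "longer"
    if ratio > 1 + tolerance:
        return "shorter"
    return "same"


def deviation_table(ratios: Sequence[float],
                    strengths: Sequence[float],
                    threshold: float) -> Dict[str, Dict[str, float]]:
    """Percentage of shorter/same/longer notes at weak and strong positions."""
    counts = {"weak": Counter(), "strong": Counter()}
    for ratio, strength in zip(ratios, strengths):
        # Assumption: values at or above the threshold count as strong.
        group = "strong" if strength >= threshold else "weak"
        counts[group][categorise(ratio)] += 1
    table = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        table[group] = {
            category: (100.0 * counter[category] / total if total else 0.0)
            for category in ("shorter", "same", "longer")
        }
    return table


# Invented toy values, for illustration only.
example = deviation_table(
    ratios=[0.9, 1.0, 1.1, 0.8],
    strengths=[0.6, 0.1, 0.2, 0.7],
    threshold=0.5,
)
print(example)
```

The same function can be applied to the notes on boundary peaks only, by passing just the ratios and strengths at the peak positions.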
However, these results support a much weaker claim than the aforementioned assumption that notes on boundaries are lengthened. In fact, only about half of the notes on boundaries are lengthened; the other half remain the same or are shortened. What might be the reasons for such a discrepancy? Three possible reasons are given below:

a. The boundaries detected by LBDM are partly wrong.
b. Deviation curves are not totally reliable in terms of which notes are lengthened/shortened.
c. The assumption that all the notes at the end of melodic groups are lengthened is wrong.

The IOI deviation curves are quite reliable, as they are computed in relation to a local average tempo, so that slowing down or speeding up is taken into account when calculating the IOI deviations (ratio of notated over performed IOIs). A relative reading of the curves is also possible (see next section), whereby each note is judged as to whether it has been lengthened/shortened proportionally more or less than its neighbouring notes; this way any problems in determining absolute note lengthening/shortening are eliminated.

In order to study the final-note lengthening assumption in more detail, a further controlled experiment was designed whose primary aim was to bypass the problem of uncertainty in boundary detection; this was achieved by using a short simple musical piece for which the LBDM gives boundaries that clearly agree with the structure of the melody from the viewpoint of musical experts (Figure 1).

Figure 1 The first few bars of Chopin's Etude Op. 10, No. 3. The lower curve in the graph depicts the LBDM boundary strength profile (peaks in this curve indicate potential local boundaries). The cluster of curves that oscillate around the value 1 on the vertical axis depicts a subset of 10 interonset interval deviation curves (out of the 22). The lower the points of these curves, the slower the tempo.
Note that tempo curve points that occur just before the peaks of the LBDM curve have a much lower value, i.e. they are played relatively slower than the surrounding notes.

3.2 Experiment on 22 Chopin Performances

In this experiment the LBDM was applied to the first 20 bars of Chopin's Etude Op. 10, No. 3. This piece has a rather clear low-level grouping structure, which is determined by rather long notes in between short ones (approximately 17 boundaries). The LBDM correctly detects all the important boundaries with only one exception (the boundary between bars 15 and 16). The piece was performed by 22 different expert Viennese pianists on a computer-monitored Boesendorfer SE290 concert piano. This dataset consists of approximately 2200 notes.

For each of the 22 performances an IOI deviation curve was computed for the melodic part. Boundary strengths in Table 3 are calculated as explained in the previous subsection. It can be seen that

notes with stronger LBDM boundary values and notes on boundary peaks tend to be lengthened in terms of the IOI.

LBDM        weak                       strong                     boundary peaks
IOI dev.    shorter   same   longer    shorter   same   longer    shorter   same   longer
Notes %     46%       13%    41%       31%       18%    50%       21%       31%    48%

Table 3

However, the overall proportion of notes on boundaries that are lengthened is again less than 50%. This is not in good agreement with a strong assumption of end-note lengthening.

The second-to-last note lengthening assumption was tested for the 11 strong boundaries (i.e. 242 instances for the 22 performances). The following interesting results emerged (see Table 4). Almost all of the notes before the last note of the melodic group are lengthened. The relative lengthening of the second-to-last and last notes was also examined (this eliminates any problems with the deviation curves in terms of computing note lengthening/shortening relative to a local tempo average), and it was found that in 89% of the cases the second-to-last note was relatively more lengthened than the last note (see Figure 1).

LBDM strengths      boundary peaks
IOI deviations      shorter   same   longer
Percent of notes    3%        5%     92%

Table 4

When examined more closely, these strong boundaries correspond to boundaries inserted after a long note that is preceded by short notes. One possible hypothesis regarding the lengthening of the penultimate (short) note is the following: when a note IOI is long in relation to its surrounding notes, further lengthening has to be quite significant in absolute terms for it to be perceptible, whereas a much smaller lengthening of a preceding short note (i.e. a delay of the long note) is more effective.

These results give some evidence that the commonly hypothesised principle of last-note IOI lengthening need not always be valid, and that the end of a melodic group may be emphasised by other means, such as delaying the last note (see also Widmer 2001).
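The relative comparison of the second-to-last and last notes can be illustrated with a short sketch. It is hedged throughout: the function name `penultimate_more_lengthened` is hypothetical, the deviation values are assumed to be ratios of nominal over performed IOIs (so a lower ratio means more lengthening), and each boundary index is assumed to point at the last note of a melodic group.

```python
from typing import Sequence


def penultimate_more_lengthened(ratios: Sequence[float],
                                boundary_indices: Sequence[int]) -> float:
    """Percentage of boundaries where the second-to-last note is lengthened
    relatively more than the last note of the group.

    ratios[i] is nominal IOI / performed IOI for note i; lower means longer.
    """
    # Only boundaries that have a preceding note can be compared.
    cases = [b for b in boundary_indices if 1 <= b < len(ratios)]
    if not cases:
        return 0.0
    hits = sum(1 for b in cases if ratios[b - 1] < ratios[b])
    return 100.0 * hits / len(cases)


# Invented toy values, for illustration only.
toy_ratios = [1.0, 0.8, 1.0, 0.9, 0.95]
print(penultimate_more_lengthened(toy_ratios, [2, 4]))  # 100.0
print(penultimate_more_lengthened(toy_ratios, [2, 3]))  # 50.0
```

Because only neighbouring ratios are compared, the result does not depend on how the local tempo average used for the deviation curve was estimated, as long as both notes share the same local reference.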
Further research is necessary to determine in which cases each of these expressive principles is employed.

4 Conclusions

In this paper, firstly, the Local Boundary Detection Model (LBDM) was described. This computational model is capable of detecting local boundaries in a melodic surface based on two Gestalt-related principles: identity/change and proximity. The proposed model was tested against the punctuation rule system developed by Friberg et al. (1998) on the same dataset of 52 melodies. It was shown that overall the LBDM performance was comparable to the performance of the punctuation rule system (possible advantages of the LBDM are its generality and simplicity).

Secondly, the expressive timing deviations found in a number of expert performances of piano works by Mozart and Chopin were studied in relation to the local boundaries discovered by LBDM. More specifically, the common hypothesis that the final note of a melodic group or phrase is lengthened was examined. It was shown that this hypothesis is not always valid and that often the end of a melodic gesture is marked by lengthening the second-to-last note rather than the last note (i.e. delaying the last note).

Acknowledgements

This research is part of the project Y99-INF, sponsored by the Austrian Federal Ministry of Education, Science, and Culture in the form of a START Research Prize and support to the Austrian Research Institute for Artificial Intelligence. The 52 melodies were generously provided by Anders Friberg of the KTH Music Acoustics Group in Stockholm, the Mozart piano sonatas by the Bösendorfer company and the Chopin data by Werner Goebl at OFAI in Vienna.

References

Battel, G.U. and Fimbianti, R. (1998) Aesthetic Quality of Statistic Average Music Performance in Different Expressive Intentions. In Proceedings of the XII Colloquium of Musical Informatics, Gorizia, Italy.

Cambouropoulos, E. (1998) Musical Parallelism and Melodic Segmentation.
In Proceedings of the XII Colloquium of Musical Informatics, Gorizia, Italy.

Cambouropoulos, E. (1997) Musical Rhythm: Inferring Accentuation and Metrical Structure from Grouping Structure. In Music, Gestalt and Computing - Studies in Systematic and Cognitive Musicology, M. Leman (ed.), Springer-Verlag, Berlin.

Cambouropoulos, E. (1996) A Formal Theory for the Discovery of Local Boundaries in a Melodic Surface. In Proceedings of the III Journées d'Informatique Musicale, Caen, France.

Friberg, A., Bresin, R., Frydén, L. and Sundberg, J. (1998) Musical Punctuation on the Microlevel: Automatic Identification and Performance of Small Melodic Units. Journal of New Music Research 27(3):271-292.

Lerdahl, F. and Jackendoff, R. (1983) A Generative Theory of Tonal Music. The MIT Press, Cambridge (MA).

Parncutt, R. (1994) A Perceptual Model of Pulse Salience and Metrical Accent in Musical Rhythms. Music Perception 11(4):409-464.

Tenney, J. and Polansky, L. (1980) Temporal Gestalt Perception in Music. Journal of Music Theory 24:205-241.

Todd, N.P. McA. (1985) A Model of Expressive Timing in Tonal Music. Music Perception 3:33-58.

Widmer, G. (1996) Learning Expressive Performance: The Structure-Level Approach. Journal of New Music Research 25(2):179-205.

Widmer, G. (2001) Inductive Learning of General and Robust Local Expression Principles. In Proceedings of the International Computer Music Conference (ICMC'2001), Havana, Cuba.