CAN COMPUTER MUSIC BENEFIT FROM COGNITIVE MODELS OF RHYTHM PERCEPTION?

Peter Desain
NICI, Nijmegen University, P.O. Box 9104, NL-6500 HE Nijmegen, the Netherlands

One would suspect that computer music could benefit from the present surge in computational modelling of music cognition. However, since most cognitive models are still applicable only in a small domain, this potential has not yet been fully realised. More general models, based on fundamental concepts, may make their application to computer music easier. Such a general model, in the form of a theory of human rhythm perception, is presented. It is based on the expectancy of events projected into the future by a complex temporal sequence. In computer music the model gives one a practical implementation of difficult processes that are nevertheless essential in interactive composition systems, e.g. quantization, tempo tracking, and beat and metre extraction.

Introduction

Interesting music is often more about violating laws than about obeying them. As a cognitive scientist, one has to be careful not to bring forward one's theory as a set of laws that composers should adhere to. Nevertheless, human music cognition is such a rich and varied field that it can be a source of inspiration for composers. And to a limited extent cognitive models can predict what will be perceived when one listens to a particular compositional construct: information that can (not should) be valuable to composers. The worth of such cognitive theories becomes even more apparent when designing interactive composition systems that contain a listening module (Rowe, 1992) intended to extract higher-level constructs (e.g. beat, tempo, metre) from the musical material. The immaturity of the field then immediately becomes evident, because the many theories that exist give only a partial account of the complexities involved. A quick checklist can help one to assess their characteristics and limitations.
Some essential points are:

Time reversals. In music, a retrograde rhythm is often hardly recognised as such, and rhythms presented in time-reversed order work differently (e.g. induce a different metre). Theories that behave symmetrically with respect to time have to be wrong on that basis alone. A related issue is the process-like (left-to-right) nature that has to be immanent in the processing: theories often tend to model the perceiver as a musicologist studying the score, rather than as a first-time listener.

Global tempo. The global tempo at which material is presented can cause a different interpretation. Theories that are unaware of time scale have to be wrong, or are applicable only in specific cases.

Expressive timing. The large intended deviations from metronomical timing found in a performance cannot easily be removed by pre-processing. Furthermore, the deviations contain valuable information about musical structure. However, many theories are based on an assumed regular time grid.

Ambiguity. A rhythmic fragment can receive several interpretations at the same time. This ambiguity is often intended by the composer and kept intact by the performer. It is difficult to model such ambiguity in a symbolic paradigm, which might be the reason why this issue is often ignored.

Context. The processing of a new musical fragment depends on what has been perceived before. For example, categorical perception, a well-established field in speech research, has turned out to be quite hard to demonstrate in the rhythm domain. The general line in this research seems to be that it is present only when enough context is given (Clarke, 1987; Desain & Honing, 1989, 1992).

Given the space limitations it is impossible to evaluate different theories fully here with respect to these questions, but I will mention a couple of good examples from the area of metre and beat perception.

Longuet-Higgins (1976) presents a musical parser that, next to tonal analysis, generates one unambiguous metrical interpretation. It rounds off expressive timing using an absolute time duration, which makes the model sensitive to global tempo. It chooses one metrical interpretation (no ambiguity) and passes it on to the processing of the next fragment (context dependency). This makes the model asymmetric with respect to time.

Palmer and Krumhansl (1990) base their ideas on the mental representation of musical metre on probability density profiles. These can represent ambiguity and asymmetry easily but ignore global tempo, expressive timing and context (it is a zeroth-order description).

Povel & Essens (1985) elaborate a theory of the induction of an internal clock (a beat) by temporal patterns and provide empirical evidence for this model. Their rules for subjective accent make the model asymmetric in time. It can express ambiguity by assigning equal ratings to different clocks, but is unaware of global tempo and expressive timing.

Parncutt (1990) presents a similar model plus evidence using simpler stimuli. The same remarks as made for the Povel & Essens model apply, except that the dependency of rhythm perception on global tempo is included in the model.

Although these models all aim to describe a more or less comparable task, they vary widely in the methods used. In general, however ingenious these theories may be, their direct applicability to computer music is limited because of the issues mentioned above. This may be caused by the lack of a general principle or construct underlying these theories.
A (De)composable theory of rhythm perception

As a common base for a general theory of rhythm perception, the notion of expectancy seems a promising candidate. It already has an implicit existence in most of the above-mentioned theories. In Longuet-Higgins' model, a continuation of the tempo and the metre found in the previous fragment is expected for a new fragment. In Palmer and Krumhansl's work the probability densities predict, for each position in the bar, the extent to which the onset of a note can be expected. And in Povel & Essens' and Parncutt's theories, induced clocks can be hypothesised to predict expected onsets at certain time points. In interactive computer music systems the availability of a projected future, the expectancy of events to come, makes it possible to process and plan responses ahead. This can help to bring these systems beyond their present state of direct stimulus-response links.

The notion of expectancy is the explicit base of a perceptual theory in Desain (1992). That paper elaborates a numerical measure of the 'expectancy of an event' given a temporal pattern as a prior context. Expectancy is defined here to have the quality of composability, which makes it possible to base a theory of the perception of complex stimuli on a simple model for the perception of their constituent components. Because the expectancy concept seems to deal well with the questions mentioned above, I propose to use it as a common base for theories about temporal perception. In this paper I will explain this construct briefly. Although it is attractive, I have to warn that this is recent work, and empirical verification is still underway.

Theory

The theory starts by postulating a changing sense of expectancy: the anticipation of a new event after listening to a simple time interval spanned by two events. The presented time interval projects an evolving expectancy curve into the future, with peaks at multiples and divisions of its duration.
We can depict the basic expectancy of an event at time B after the preceding time interval A as a curve (see Figure 1). One can see clear peaks in expectancy when B equals A, two times A, three times A and so forth, and when B equals half of A, one third of A, etc.

[Figure 1. Basic expectancy of interval B, after presentation of time interval A. Horizontal axis: time interval B, marked at A/2, A, 2A, 3A, 4A and 5A.]
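The qualitative shape of such a curve can be sketched numerically. The following Python fragment is an illustration only, under stated assumptions: the Gaussian peak shape on a logarithmic ratio axis, the 1/n peak heights, the peak width and the name basic_expectancy are all hypothetical choices, since the theory leaves the precise shape of the peaks open. Equal heights for a multiple and the corresponding division, and the absence of any dependence on the absolute duration of A, are further simplifications.

```python
import math


def basic_expectancy(a, b, max_order=5, width=0.05):
    """Illustrative expectancy of an event at time b after an interval of duration a.

    Assumptions (not taken from the theory): Gaussian peaks on a log-ratio
    axis, centred on the ratios n and 1/n, with height 1/n and a fixed width.
    """
    if a <= 0 or b <= 0:
        raise ValueError("durations must be positive")
    log_ratio = math.log(b / a)
    total = 0.0
    for n in range(1, max_order + 1):
        height = 1.0 / n  # higher ratios are assumed to be less expected
        # Ratio 1 has a single peak; every other order has a multiple and a division.
        centres = (0.0,) if n == 1 else (math.log(n), -math.log(n))
        for centre in centres:
            total += height * math.exp(-((log_ratio - centre) ** 2) / (2 * width ** 2))
    return total


if __name__ == "__main__":
    a = 0.6  # duration of interval A in seconds (arbitrary example value)
    for ratio in (0.5, 0.75, 1.0, 1.5, 2.0, 3.0):
        print(ratio, round(basic_expectancy(a, ratio * a), 3))
```

Because the sketch works on the ratio B/A only, the printed values do not change when a different duration is chosen for A.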

The shape of the expectancy curve is determined by our capacity to perceive duration ratios, with higher ratios being more difficult and less expected. The precise shape of these peaks is still an open question. The absolute duration of the interval A is used as a second determinant of the expectancy, and the relative heights of the peaks depend on it: a large interval A produces less prominent peaks at its multiples, while a small interval A generates more pronounced peaks at those points. So Figure 1 represents only one slice through a 3D surface in which the duration of A is the second independent variable.

A second hypothesis of the theory states that the basic expectancies of all basic time intervals that are implicit in a complex rhythm are simply summed to yield the expectancy curve for that rhythm. Figure 2 is an attempt to depict this decomposition. The presented temporal pattern is shown at the bottom left, above which all the intervals implicit in this pattern are indicated. To the right of each interval, the pattern of expectancy projected by that interval is shown. It is based on the duration of the interval and the time elapsed after its end, peaking at integer divisions and multiples. All the projected expectancies are summed and yield the curve at the lower right: the global expectancy curve projected by the complex temporal pattern. It has peaks at time points that can be considered 'good continuations' of the pattern presented. This concept of expectancy seems closely related to the ideas of Jones (1981), and could function as a formalisation of them. The height of the peaks in such curves can be interpreted as the metrical boundary strength, from which metre can be inferred. This kind of curve can also show how a temporal pattern fails to realise a high expectancy (as in a syncopation), or comes up with an unexpected event, modelling the perceived tension of various patterns (amount of rhythmicity).
It also enables one to see how a new event reinforces the already existing pattern of future expectancy or introduces new elements into it.

Regarding the issues on the 'checklist', the following can be remarked:

Time reversals. The concept of expectancy, as presented here, has no time direction in itself. It is determined completely by two time intervals A and B, but whether each of the time points marking the intervals was presented as a stimulus in the past, or is still to happen in the future, is irrelevant for the theory. This does not imply that the theory is symmetric with respect to time: swapping the order of A and B yields different values, because the perception of a time interval followed by a multiple of its duration is different from the perception of that interval followed by a division of it by the same factor.

Global tempo. The theory is sensitive to the time scale used (the absolute tempo) because the basic expectancy curves depend on the absolute duration of their generating interval. What is often called 'the shift of attention through the levels of metrical hierarchy, prompted by different tempi' can be found here in the shift in relative importance of expectancy peaks.

Expressive timing. All timing is entered directly into the model; there is no need for removal of expressive timing. Moreover, expressive timing tends to make important peaks more prominent and smooths out irrelevant details in the profiles. Thus it contributes to the interpretation of a rhythm instead of complicating it.

Ambiguity. It is tempting to interpret the relative height of the peaks in the expectancy curve directly as a measure of metrical boundary strength. However, instead of deriving a symbolic notion of metre from these curves, it might be more productive to rethink the concept of metre as a continuous concept, an idealised expectancy curve.
This allows the construct to be applied directly to difficult areas such as the amount of metricality, and rhythms that can induce one or another metre.

[Figure 2. (De)composition of expectancy. Panels: implied intervals, basic expectancies, temporal pattern, complex expectancy. Horizontal axis: time, divided into past, now and future.]
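The decomposition depicted in Figure 2 can be sketched as a sum over all pairs of onsets. The Python fragment below is a minimal sketch under stated assumptions, not the implementation of the theory: the function names, the Gaussian peaks on a log-ratio axis, the 1/n heights and the damping factor exp(-delay / horizon) are hypothetical. The damping stands in for the qualitative claim that a large interval produces less prominent peaks at its multiples than a small one; its exponential form and the value of horizon are arbitrary.

```python
import math
from itertools import combinations


def basic_expectancy(a, b, max_order=4, width=0.05, horizon=4.0):
    """Placeholder expectancy of an event b seconds after an interval of a seconds.

    Assumed, not taken from the theory: Gaussian peaks at ratios n and 1/n on a
    log-ratio axis, with height 1/n damped by exp(-delay / horizon), where
    delay is the absolute time in seconds at which the peak falls.
    """
    if a <= 0 or b <= 0:
        return 0.0  # degenerate interval, or a time point not after its end
    log_ratio = math.log(b / a)
    total = 0.0
    for n in range(1, max_order + 1):
        ratios = (1.0,) if n == 1 else (float(n), 1.0 / n)
        for r in ratios:
            height = math.exp(-r * a / horizon) / n
            total += height * math.exp(-((log_ratio - math.log(r)) ** 2) / (2 * width ** 2))
    return total


def composite_expectancy(onsets, t, **kwargs):
    """Sum the basic expectancies projected by every interval implicit in the pattern."""
    total = 0.0
    for start, end in combinations(sorted(onsets), 2):
        # Each interval contributes according to its own duration and the
        # time elapsed between its end and the time point t.
        total += basic_expectancy(end - start, t - end, **kwargs)
    return total


if __name__ == "__main__":
    pattern = [0.0, 0.5, 1.0, 2.0]      # onsets in seconds
    retrograde = [0.0, 1.0, 1.5, 2.0]   # the same inter-onset intervals, reversed
    for t in (2.25, 2.5, 3.0, 4.0):
        print(t,
              round(composite_expectancy(pattern, t), 3),
              round(composite_expectancy(retrograde, t), 3))
```

In this sketch a pattern and its retrograde project different curves, because each interval contributes relative to the time elapsed since its own end; this is in line with the remark on time reversals. Entering performed onset times instead of score times requires no change, since no time grid is assumed.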

Context. The expectancy curves become more pronounced when more context is available. This allows, for example, the prediction of sharper categorical perception boundaries in the presence of more context.

Conclusion

The presented theory of expectancy seems a promising candidate as a basis for rhythm perception. Its decomposability into simple components that model the perception of time interval pairs is attractive, not least because for these simple stimuli it is possible to obtain empirical results, which can then be 'plugged' into the theory to yield predictions for complex temporal patterns. It is clear that a full theory of rhythm perception cannot be based on time alone but has to take other musical parameters into account as well. A possible approach could be the use of a notion of salience to weight the expectancy contributions of events. It is still an open question whether the complete expectancy curve is available at any time for higher-level processes, or whether an unfolding expectancy, a changing sense of present anticipation, is a more appropriate model, with access to future expectations being limited.

References

Clarke, E. (1987) Categorical Rhythm Perception, an Ecological Perspective. In A. Gabrielsson (Ed.), Action and Perception in Rhythm and Music. Royal Swedish Academy of Music, No. 55.
Desain, P. (1992) A (De)Composable Theory of Rhythm Perception. Music Perception, 9(4).
Desain, P. & Honing, H. (1989) Quantization of Musical Time: A Connectionist Approach. Computer Music Journal, 13(3).
Desain, P. & Honing, H. (1992) Music, Mind and Machine: Studies in Computer Music, Music Cognition and Artificial Intelligence. Thesis Publishers.
Jones, M.R. (1981) Only Time Can Tell: On the Topology of Mental Space and Time. Critical Inquiry.
Longuet-Higgins, H.C. (1976) The Perception of Melodies. Nature, 263.
Palmer, C. & Krumhansl, C.L. (1990) Mental Representations for Musical Meter. Journal of Experimental Psychology: Human Perception and Performance, 16(4).
Parncutt, R. (1990) Spontaneous Tempo, Apparent Speed, Pulse Salience and Event Salience in Musical Rhythms. STL-QPSR, 1/1990. Stockholm: Royal Institute of Technology.
Povel, D.J. & Essens, P. (1985) Perception of Temporal Patterns. Music Perception, 2(4).
Rowe, R. (1992) Machine Listening and Composing with Cypher. Computer Music Journal, 16(1).