AN INTRODUCTION TO MUSICAL SEMANTICS: THE GESTURE IN MODELIZATION

Ole Kühl
Independent researcher

ABSTRACT

Musical meaning is systematically grounded in human experience [5]. The semantic relationship between musical structure and the biological constraints of the human brain and auditory system is described in the following. It is largely mediated through the musical gesture, and therefore tied to the built-in tendency of human temporal perception to organize input in chunks. These ideas, which are of interest inside the man-machine paradigm, are supported by recent research in neurobiology and psychology.

1. INTRODUCTION

The building of mathematical models and computer simulations of human behavior must be grounded in our knowledge of human action, perception and cognition. In this theoretical paper I aim to provide an overview of research pertaining to a description of musical cognition. I shall also offer some speculations, based on this body of knowledge, that may be useful in further modelizations of musical behavior inside the man-machine paradigm.

In developing and refining models of musical behavior that can be tested against real-life musical production, something like motivation and intentionality becomes important. Music is meaningful, and musical acts convey emotion and intentional behavior. Meaning, then, becomes a factor to be considered in modelization. I propose that the semantic dimension of music be seen as linked to the musical gesture. Evidence from psychology [9,11] and neuroscience [15] supports the view that a musical phrase represents the simulation of a motor act [2,5,6]. The musical gesture can therefore be seen as an intentional and meaningful act. This brings the perceptual mechanism of chunking into the foreground.

2. THE QUESTION OF MEANING IN MUSIC

The often discussed question of meaning in music remains unresolved. In order to build a satisfactory model, we must examine the concept of meaning itself, the 'meaning of meaning'. The problem seems to lie with the traditional, narrow definition of meaning derived from language studies, where the meaning of a word is determined as the precise definition found in a dictionary or a similar text. Words are defined by words.

A broader definition of 'meaning' has been made possible by cognitive science, which sees meaning as based on acquired embodied experience. For some event to be meaningful, it must in some way comply with prior experience, and when we follow this line of 'interpretants' backwards in personal time, we find that experience is ultimately based on simple, generic bodily experience. According to George Lakoff and Mark Johnson, language itself derives its meaning from such abstract structures [6], and, according to Merlin Donald, language has developed from a primordial level of mimetic communication, the sharing of gestures [2].

A proper definition of musical meaning seems to be suspended between these two extremes: on the one hand, music cannot 'mean' something the way a word or a sentence 'means'; but, on the other hand, we have all had the experience that music can convey moods, emotional content, etc. In order to develop our understanding of the way human beings use music as an expressive and communicative strategy, it therefore seems feasible to study what recent relevant research has taught us about the biological conditions for, and the cognitive structures involved with, music.
3. BIOLOGICAL CONSTRAINTS

The human auditory system processes huge amounts of sonic information, often referred to as the auditory stream [1]. When listening to a piece of music, what enters the ears as sound waves, or kinetic energy, ultimately emerges as music in the human mind. Auditory perception pre-attentively organizes and structures this information in certain ways, partly dependent on innate, biological properties of the system, and partly dependent on learned, culturally derived mechanisms (which will not be considered here). Built-in properties of the auditory system are sometimes seen as manifesting musical universals, features of music that can be found in all cultures at all times [4,13].

One of the most interesting properties of the auditory system may be that of pattern extraction [1,10]. The human brain needs to organize its input in order to avoid overflow of information. Infants are extremely well equipped to recognize and extract regularities and patterns from their surroundings, and language learning, for instance, is completely dependent on this ability. In music, such features as 1) the perception of simple pitch ratios, including octave similarity, fourths and fifths, and 2) musical pulse (regularity extraction) are probably tied to this ability [4]. Other interesting features that occur universally are: 3) the categorization of notes, and the division of the scale into between 5 and 7 unequal steps; 4) the perception of melodic contour (upwards and downwards movement); 5) the formation of groups; and 6) the meter as an evoked response [4] (see table 1).

1. Perception of simple pitch ratios (octaves, fourths and fifths)
2. Regularity extraction
3. Categorization of notes in scales of 5 to 7 steps
4. Perception of melodic contour
5. Group formation
6. Meter as evoked response

Table 1. Biological constraints on musical perception.

This type of characteristic shapes all simple music everywhere and at all times (of course, art music will often transgress these limitations). To a Western ear it will be interesting to note the absence from the list of three musical features that we would expect to be generic: the major/minor - happy/sad distinction; chords and harmonic development; and the so-called 'Mozart effect', according to which large-scale architectural properties of musical form are important devices. Extant evidence does not support the view that these musical elements are given by nature; however, this whole question is too complex to be given proper treatment here.

4. TEMPORAL PROPERTIES OF THE BRAIN

As seen from the standpoint of the cognitive construction of music, the temporal properties of the brain show some noteworthy features. The temporal aspect of human cognition is not only exceedingly complex, but it is also difficult to investigate, even at the current level of technology. How does the brain process temporal events in real time? No doubt, re-entry loops are involved in an on-line integration of memory structures with perceived content [16]. Certain time-windows have been proposed as innate properties of the brain, and they can be considered as a second set of biological constraints, comparable to those given above (see table 2).

> 30 msec   Minimum distinction window
300 msec    Pre-attentive window
3 sec       Subjective present
> 30 sec    Extended present

Table 2. Temporal constraints on musical cognition.

There is a minimum distinction window at 10-30 ms, below which one sound becomes indistinguishable from another; it is of little interest here [7,8]. More interesting is the pre-attentive window at approximately 300-500 ms, which marks the boundary between two distinct modes of perception [3]. While shorter sounds seem to reverberate through the auditory system (echoic memory?), some of the properties of longer sounds are sustained in working memory, possibly via re-entry loops.

Of even more interest is the window of the subjective present, sometimes called the three second window [7,8,11]. Inside this window, the unstructured flow of temporal information is organized as a single temporal event. Such events cannot be shorter than half a second, and are seldom longer than 6-8 sec. Normally they range from 2-5 sec. The mechanism seems to work in a uniform way across modalities: for language, motor events, visual events, and music. Most likely, this organization of temporal events in chunks (in the present context called musical gestures [5]) of a certain size is necessary for the brain to avoid overflow and chaos.
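To indicate how such temporal constraints might enter a computational modelization, a minimal sketch in Python is given below. It segments a toy stream of note onsets into gesture-sized chunks bounded by the three second window; the function name, the pause threshold and the toy data are hypothetical illustrations, not empirical claims or a model proposed here.

# Illustrative sketch: segment note onsets (in seconds) into gesture-sized
# chunks, using the rough temporal constraints of table 2. All names and
# threshold values are hypothetical choices.

MAX_CHUNK_SEC = 3.0   # approximate span of the subjective present
GAP_SEC = 1.0         # a pause of about one second is taken as a boundary cue

def chunk_onsets(onsets):
    """Group note onsets into chunks no longer than the subjective present."""
    chunks, current = [], []
    for t in onsets:
        if current and (t - current[0] > MAX_CHUNK_SEC or t - current[-1] > GAP_SEC):
            chunks.append(current)
            current = []
        current.append(t)
    if current:
        chunks.append(current)
    return chunks

if __name__ == "__main__":
    # A toy stream of onsets: two phrase-like groups separated by a pause.
    onsets = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0, 3.1, 3.5, 3.9, 4.3, 4.7]
    for i, chunk in enumerate(chunk_onsets(onsets), 1):
        print(f"chunk {i}: {chunk[0]:.1f}-{chunk[-1]:.1f} sec, {len(chunk)} onsets")

In such a sketch, the three second bound plays the role of the subjective present; a richer model would of course derive chunk boundaries from the musical surface itself rather than from a fixed pause threshold.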
This uniform chunking of the temporal flow indicates a deeper, amodal level of perception, as claimed by Daniel Stern and others [9], and, as it has been shown to be functional in neonates, it can be seen as an innate property of the brain [11]. One more time-window should be mentioned here, namely that of the extended present [8,11]. It concerns our ability to keep a limited number of events in view at the same time, and to conceptualize them as units at a higher level of organization. Such formations are comparable to structures like sentences, groups of motor actions with a common purpose, or simple songs.

5. CHUNKING

Both the expression and the perception of music are ruled by the same biological constraints. Perhaps the most striking feature in the evidence presented above is that auditory perception organizes the auditory stream in chunks of a certain size. This process is largely brought about through a top-down projection of culturally derived schemas: we hear (more or less) what we expect to hear, and we organize our sonic perception according to culturally derived rules.

There is evidence of chunk formation from several sources. The gestalt quality of grouping found in primitive perception, as discussed above, is one [1]. Another set of evidence derives from the mechanism of binding, first found in visual perception but later thought to be of a more general nature and called selective binding [2]. Yet another set of evidence can be deduced from the peculiar quality of projection or transposition of temporal events from one modality to another, primarily the mapping of movement patterns (gestures) onto sound patterns [5,9]. Finally, we see that most music - like language and other temporal event series - is organized in melodic or rhythmic phrases of a certain size [8].

The musical chunk - or gesture - represents a mesolevel of cognitive organization, wedged between the microlevel and the macrolevel. The mesolevel seems to be the basic or generic level at which we interact with the world. When we access the world at this level of more or less pre-organized chunks, the brain is saved the huge burden of work necessary to process millions of tiny bits of information, and can move directly to the operational modus. At the mesolevel, notes, rhythmic information, and sound patterns are organized in a single gestalt, a musical phrase or gesture. From the level of the chunk we can 'look down' towards the microlevel of subchunks, single items of information on pitch, timing and sound. And we can 'look up' towards the macrolevel of superchunks, where chunks are grouped at a higher level of cognitive organization, primarily as melodies [14].

6. THE SONG

Itsy Bitsy spider climbing up the spout
Down came the rain and washed the spider out
Out came the sun and dried up all the rain
Now Itsy Bitsy spider went up the spout again!

As a working hypothesis we can say that chunks segment the auditory stream and compress several types of information, such as melodic curve, intensity, dynamics, and timbre, into a single gestalt. At a higher level in the temporal hierarchy these chunks are organized in groups or sequences, such as sentences and melodies. It is of interest to note that nursery rhymes and children's songs from all cultures are organized according to a simple schema, in compliance with the chunking hypothesis (see fig. 1 and 2): a four-line stanza in moderate tempo, where each line lasts about three seconds and the whole stanza about twelve seconds [11]. In fig. 1, we see a well-known children's song. Each line consists of paired gestures, as shown in fig. 2. In the sharing with others of activities like this, infants train and exercise their basic brain capacities of pattern extraction, chunking and grouping. This seems to be a basic form of organizing our cognition of events in the world, in which the micro-, the meso- and the macrolevel are integrated in purposeful action.

7. CONCLUSION

In discussing the biological constraints on musical perception, the most noteworthy feature may be the looseness of the constraints. Our basic biological equipment for music does not bring us anywhere near the notions we have of what music is: important features like harmony and musical form are absent at this level of primary adaptation. But when we think about the great variety of music in the world's cultures, it clearly must be so. Music is a cultural artifact, and most of its properties take on their particular value inside its cultural setting. At the generic level, the temporal constraints on perception become important factors to be considered. Not only nursery rhymes, but all kinds of songs and functional music are organized according to the temporal properties of our auditory system: the division into chunks, subchunks and superchunks is ubiquitous. This leads us to the notion of musical gesture.
The idea of the musical gesture includes 'a strong sensorimotor component and a tight coupling between perception and action' [12]. The sequential chunking of the auditory stream of sound is a vital built-in necessity of the perceptual system. The chunk, however, is not a gesture - it is merely a segmentation of time, demonstrating the limitations of our memory systems. The musical gesture, as an inner simulation of movement, arises in response to certain qualities of the patterns in the auditory stream (perceived for listeners, projected for players). But the importance of this phenomenal action should not be overlooked in attempts to build modelizations of musical behavior that are truly musical.

Figure 1. Structure of a nursery rhyme.

Figure 2. 'Chunking' a prototypical song.
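As a rough illustration of the chunk hierarchy of section 5 and the song schema of figures 1 and 2, the following minimal sketch in Python represents subchunks, chunks and a superchunk for a prototypical four-line song. The class names, the note representation and the durations are hypothetical; they serve only to make the levels concrete for modelization purposes.

# Illustrative sketch: a hypothetical data structure for the chunk hierarchy
# (subchunks -> chunks/gestures -> superchunk), instantiated for a prototypical
# four-line song of roughly three seconds per line.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubChunk:             # microlevel: a single note event
    pitch: str
    duration: float         # seconds

@dataclass
class Chunk:                # mesolevel: a musical gesture or phrase line
    subchunks: List[SubChunk] = field(default_factory=list)
    def duration(self) -> float:
        return sum(s.duration for s in self.subchunks)

@dataclass
class SuperChunk:           # macrolevel: a stanza or simple melody
    chunks: List[Chunk] = field(default_factory=list)
    def duration(self) -> float:
        return sum(c.duration() for c in self.chunks)

if __name__ == "__main__":
    # Four lines of about 3 sec each yield a stanza of about 12 sec.
    line = Chunk([SubChunk("C", 0.375) for _ in range(8)])
    stanza = SuperChunk([line] * 4)
    print(f"line: {line.duration():.1f} s, stanza: {stanza.duration():.1f} s")

Four gesture-sized chunks of about three seconds each add up to a superchunk of about twelve seconds, in agreement with the stanza schema described in section 6.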

8. REFERENCES

[1] Bregman, A. Auditory Scene Analysis. The MIT Press, Cambridge, Mass., 1991.
[2] Donald, M. A Mind So Rare. W. W. Norton & Co., New York, 2001.
[3] Fraisse, P. "Rhythm and Tempo", in Deutsch, D. (ed.), The Psychology of Music. Academic Press, New York, 1982.
[4] Justus, T. and Hutsler, J. "Fundamental Issues in the Evolutionary Psychology of Music: Assessing Innateness and Domain Specificity", Music Perception 23/1, 2005.
[5] Kühl, O. Musical Semantics. Peter Lang, Bern, 2007.
[6] Lakoff, G. and Johnson, M. Philosophy in the Flesh. Basic Books, New York, 1999.
[7] London, J. Hearing in Time. Oxford University Press, Oxford, 2004.
[8] Snyder, B. Music and Memory. The MIT Press, Cambridge, Mass., 2000.
[9] Stern, D. The Interpersonal World of the Infant. Karnac, London, 1998 (1985).
[10] Tomasello, M. Constructing a Language. Harvard University Press, Cambridge, Mass., 2003.
[11] Trevarthen, C. "Musicality and the Intrinsic Motive Pulse", Musicae Scientiae, Special Issue, 2000.
[12] Leman, M. and Camurri, A. "Understanding Musical Expressiveness Using Interactive Multimedia Platforms", Musicae Scientiae, Special Issue, 2006.
[13] McDermott, J. and Hauser, M. "The Origins of Music: Innateness, Uniqueness, and Evolution", Music Perception 23/1, 2005.
[14] Godøy, R. I. "Gestural-Sonorous Objects: Embodied Extensions of Schaeffer's Conceptual Apparatus", Organised Sound 11, 2006.
[15] Gallese, V. "The Inner Sense of Action: Agency and Motor Representations", Journal of Consciousness Studies 7/10, 2000.
[16] Edelman, G. Bright Air, Brilliant Fire. Basic Books, New York, 1992.