CONTROLLED REFRACTIONS: A TWO-LEVEL CODING OF MUSICAL GESTURES FOR INTERACTIVE LIVE PERFORMANCES

Nicola Orio, Carlo De Pirro
CSC - DEI, Universita' di Padova, Via S. Francesco 11, 35121 Padova (Italy)
orio@dei.unipd.it, cdp@csc.unipd.it

ABSTRACT

Each musical gesture can be performed with different nuances. Starting from this observation, we developed a model for a two-level coding of gestures: the mapping of gestures into given classes, and the coding of the way each gesture is performed. Using this model, we built a system for man-machine interaction in a live performance context. The interaction is based on the concept of "controlled refractions": a set of musical multiplications of the musician's gestures, performed in real time by the system. The same two-level model is also applied in the creation of the synthetic performance: each refraction has many possible nuances, and the system chooses among them in order to offer new musical ideas to the musician.

1. INTRODUCTION

In human interaction each gesture can have different meanings depending on the context in which it occurs and on the way it is executed. Each gesture can be performed with different nuances, which carry additional information about the feelings and intentions of the performer. This is particularly true in musical performance: the player communicates with the audience through gestures on a musical instrument, which transforms them into sounds. Hence the control of each gesture and of its nuances has a central role in musical communication. The gesture of playing a short musical phrase can give different impressions to the audience depending on many performance parameters such as timing, loudness, articulation, and timbre. An instrumentalist continuously changes her performance parameters in order to convey different expressive intentions, depending on her feelings and on the musical context.
In this regard, many studies have addressed the communication of emotions in music performance [Gabrielsson and Juslin 1996], some of them focusing on the communication between performers and listeners [Senju and Ohgushi 1987; Canazza and Orio 1997]. In this scenario, an understanding of the relations between the nuances of musical gestures and the player's intentions would help clarify how musical communication is achieved. Moreover, understanding the meaning of gestures is also relevant in the more general field of man-machine interaction. The need for a deeper interaction between user and computer has become significant in recent years: computers are used by an increasing number of people who may have no specific knowledge of interfaces. If an agent could understand the user's nuances in the gestures needed to drive the interface, it would be easier for her to interact with the machine.

2. A TWO-LEVEL APPROACH

The basic idea of this work is that, in most cases, human gestures can be coded at two different levels. First, a gesture can be recognized as belonging to a particular class: moving an object with the mouse; shaking hands; repeatedly playing a note on the piano. This level usually carries most of the information about the intentions of the performer. But the additional information carried by the nuances of the gesture, that is, the way it is performed, can also be analyzed: moving the object randomly on the screen may show that the user is not sure where to place it; a distracted handshake signals that the persons are not interested in the meeting; a slow repetition of a tone communicates an impression different from a fast tremolo.

2.1 Coding Gestures in Live Performances

This two-level approach has been used to develop a model for the coding of musical gestures. The model was then applied in a system for interactive live performances on a piano with a MIDI interface.
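The two-level coding can be illustrated with a minimal data structure (a Python sketch, not the C++ of the actual tool; all names here are illustrative): a gesture carries a discrete class label, plus a dictionary of continuous descriptors for the way it was performed.

```python
from dataclasses import dataclass, field

@dataclass
class Gesture:
    """Two-level coding: a class label plus continuous nuance descriptors."""
    gesture_class: str                            # level 1: which kind of gesture
    nuances: dict = field(default_factory=dict)   # level 2: how it was performed

# Two instances of the same class, distinguished only by their nuances:
slow_repeat = Gesture("note_repetition", {"rate_hz": 2.0, "velocity": 45})
tremolo     = Gesture("note_repetition", {"rate_hz": 12.0, "velocity": 90})

assert slow_repeat.gesture_class == tremolo.gesture_class
assert slow_repeat.nuances["rate_hz"] < tremolo.nuances["rate_hz"]
```

The first level alone would treat the two gestures above as identical; it is the second level that distinguishes a slow repetition from a fast tremolo.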
Musical live performance is a good test-bed for a model of man-machine interaction: many interfaces exist that capture musical gestures, and musicians develop particular skill in finely controlling their instruments. According to the model, the musician's gestures on the keyboard are classified into a number of predetermined groups. Once this classification is performed, the nuances are analyzed. Given the inherent complexity of parametrizing the nuances, we chose to describe them with semantic labels. Hence each musical gesture can have a different degree of "tension", "urgency", or "gloom". To this end, a mixed approach based on Fuzzy Logic and backpropagation neural networks seemed promising.

2.2 Man-Machine Interaction Through Music

The system replies to the pianist's gestures by playing a synthetic performance on the same instrument. The model for

musical gestures is also applied to the computer performance: each musical gesture performed by the computer can be divided into two levels. One level represents the class to which the automatic gesture belongs. These classes depend on compositional choices; in this paper we present some of the possible musical gestures that the system can perform. The second level represents the different nuances with which the musical gestures are performed, that is, all the small variations that can be applied to the synthetic gestures. Fig. 1 shows the basic model of the system and emphasizes the bidirectionality of the communication. The musician interacts with the system through gestures, which are coded in MIDI format. The system plays on the same piano, giving new musical ideas to the performer or reinforcing her intentions; to strengthen this bidirectionality, we chose to leave some degrees of freedom in the choice of the nuances of the synthetic gestures. The figure also highlights that a musical performance makes sense only if there is also communication with the audience, which has to share the musical language used by the human and the synthetic performer.

[Fig. 1: The interactive performance system. Block diagram: the musician's MIDI gestures pass through gesture classification and nuance coding to the automatic performer, whose synthetic gestures return on the same piano; both performers communicate with the audience.]

Hence the system can be considered a musical duet between a pianist and a computer, as in [Risset and Van Duyne 1996] and [Orio and De Pirro 1997]. However, in this paper we prefer to stress another characteristic of the system, based on the concept of musical multiplication. By replying to the musician's gestures with music, the system acts as a set of changing musical mirrors.
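As a minimal illustration of such a musical mirror (a Python sketch, not the authors' C++/MidiShare implementation; the parameter values are hypothetical), a refraction might answer each incoming note with delayed, transposed copies whose density the system chooses on its own:

```python
import random

def mirror_refraction(note, onset, max_copies=3, transposition=7, delay=0.25):
    """Answer one played note (MIDI pitch) with delayed, transposed echoes.

    The number of copies (the "density" of the multiplication) is the
    system's own choice, so the same pianist gesture can be multiplied
    differently at every occurrence.
    """
    copies = random.randint(1, max_copies)  # nondeterministic density
    return [(note + i * transposition, onset + i * delay)
            for i in range(1, copies + 1)]

echoes = mirror_refraction(60, onset=0.0)   # refract middle C
for pitch, when in echoes:
    print(pitch, when)
```

The first echo is always a fifth above the played note, a quarter of a second later; how many further echoes follow is left to the system, which is the degree of freedom that makes the mirror "changing".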
Also in this case the two-level approach is used: the kind of multiplication depends on the class to which the gesture belongs, while the way the multiplication is performed (that is, its nuances) depends on the nuances of the pianist's musical gesture. The model for musical multiplication is based on the principle of "controlled refractions".

3. CONTROLLED REFRACTIONS

The basic principle of controlled refractions is the natural harmonic complexity of sound. In piano performance this complexity increases when the damper pedal is pressed: the memory of the played tones becomes a veil of sound, enriching the color of the performance. In this sense, the use of the damper pedal is the simplest kind of harmonic refraction. Treating the presence of a number of frequencies as the capability to change the "harmonic color" of a given melody applies this concept: harmony is related to melody as the adjective is related to the noun. The same noun (melody) can change its meaning depending on the adjective (harmony) that accompanies it. The principle of controlled refractions applies this concept. The melodic gestures of the pianist are coded and expanded into different melodies using musical multiplication with variable density. The performance is then enriched with synthetic melodic lines that, like the rotation of a prism, change the color of the melos. The kind of musical refraction and the way it is performed are nondeterministic. In this way the nature of the global musical performance (i.e., the sum of pianist and synthetic gestures) can continuously change, depending on the nuances of the synthetic performance. Refractions are thus controlled, because they depend on the way the musician is playing, but they

have also a certain degree of freedom, which creates a new, complex musical instrument that the musician can always explore in new directions. The kinds of refraction depend on compositional choices. In our work we explored a number of different refractions, but we would like to stress that the system is open to an increasing number of effects, all related to the model of musical gesture classification and communication through nuances. It is possible to perform, for instance, canon figures on the melodic lines played by the pianist, or to introduce a bordone at particular moments of the performance. The number of voices in the canon, the transposition in frequency, the time interval between two subsequent voices, or the intensity of the bordone are all choices made by the automatic system among a number of possible ones. It is also possible to create melodic movements in opposition to the ones played by the pianist; for instance, a repeatedly played tone can be multiplied along the keyboard, or an ascending melodic line can be opposed by a steady one. Also in these cases, once a synthetic musical gesture is created, the way it is performed, its nuances, depends on the system's choices. Furthermore, the system can create harmonic accents on the pianist's performance by building a chord over some notes. The kind of chord and the rhythmic complexity of the performance are, again, choices made by the system. All the different choices made by the system allow the controlled refractions to create different "open compositions". In fact, even if the number of classes into which gestures are coded (as well as the description of their nuances and of the nuances in the refractions) reflects the musical ideas of the composer, the performance can never be completely predictable.

[Fig. 2: Rules for the fuzzification of Tempo, with membership functions for "very fast", "fast", "slow", and "very slow".]

[Fig. 3: Rules for the fuzzification of Loudness, with membership functions for "soft", "medium", and "loud".]

4. AUTOMATIC CODING OF GESTURES

We developed a tool for the coding of musical gestures played by a pianist on a MIDI interface. The tool is written in C++ using the MidiShare toolkit [Orlarey and Lequay 1989] developed at Grame. According to the two-level model of interaction, gestures are mapped into a number of different classes. A classification algorithm maps the musician's gestures into a particular class. First, the pianist's performance is segmented into elementary gestures, such as the repetition of a single note, a short phrase followed by pauses, two phrases separated by a change of register, or variations in Tempo or rhythmic complexity. Then, using a system of thresholds, the algorithm classifies the particular gesture among the ones proposed by the composer. The second step is the coding of the nuances of the gesture, for which two systems are proposed.

4.1 Fuzzy Logic Approach

As highlighted above, a semantic description seems more suitable for the coding of nuances. Fuzzy Logic [Pedrycz 1995; Li and Lau 1989] seems to be the most suitable tool for this kind of description, especially because it permits merging the composer's experience in terms of musical language with the engineering problem of extracting information from the musician's nuances. Hence, depending on the kind of refraction to be performed, several fuzzy controllers have been developed. For instance, since the musician knows the possible effects of mirroring, it was decided to extract an index that depends on the musician's urgency in having a particular refraction. Figs. 2 and 3 plot the rules of fuzzification of the two main parameters that describe the way a musician could play a phrase depending on the urgency of her performance.
These are the Tempo and the loudness with which the musician is playing; changes in these two parameters, such as accelerando or crescendo, are also taken into account. A fuzzy controller then makes the decision about the level of urgency of the musician. Based on this index, the chosen refraction is then performed with

different nuances, in order to reinforce (or to reduce) the sense of urgency of the pianist.

4.2 A Neural Net Approach

An alternative way to extract information from the musician's nuances is the use of backpropagation neural networks [Kosko 1992]. Fig. 4 shows a network used to extract an index of the musician's tension. It has five input nodes, three hidden nodes, and one output node, which gives the index of tension ranging from 0 (no tension at all) to 1 (maximum tension). Many different geometries were tried for the net; the one proposed gave the best results during training and generalizes well. It was trained with 100 different musical phrases played by one of the authors, each one scored with a particular degree of tension. The generalization ability of the net was tested on 20 further performances, with satisfactory results.

[Fig. 4: Neural net for the recognition of an index of tension. Input parameters: kvel = average key velocity; Δkvel = variance in key velocity; dur = average inter-onset interval; Δdur = variance in inter-onset interval; art = articulation. Three hidden nodes; the single output is the tension degree.]

4.3 Freedom in the Synthetic Performance

These indexes, as well as any others a composer may choose to extract, do not impose a particular nuance. On the contrary, each index dynamically changes the probability with which a particular nuance is applied. Given that each synthetic gesture may have many nuances (in timing, in intensity, in rhythm) and that many indexes can be extracted at the same time, the system is completely open to deep exploration by composers and musicians. Moreover, the system is able to perform refractions also on the synthetic performance. Each kind of refraction is built as a single block, and the system is able to drive all blocks independently, using both the musician's performance and the performances created by the other blocks.
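The fuzzification of Tempo and Loudness described in Section 4.1 can be sketched as follows (an illustrative Python fragment; the breakpoints of the membership functions and the single rule are invented here, since the paper does not report the actual values):

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b, zero outside (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def urgency_index(tempo_bpm, loudness):
    """Toy fuzzy controller: urgency grows with 'fast' Tempo and 'loud' playing.

    Breakpoints are made up for illustration; loudness is MIDI velocity 0-127.
    """
    fast = tri(tempo_bpm, 80, 140, 200)   # membership in "fast"
    loud = tri(loudness, 60, 100, 140)    # membership in "loud"
    # One fuzzy rule: IF Tempo is fast AND playing is loud THEN urgency is high.
    return min(fast, loud)                # AND as minimum (Mamdani style)

print(urgency_index(140, 100))  # both memberships peak -> 1.0
print(urgency_index(80, 100))   # "fast" membership is 0 -> 0.0
```

A real controller would combine several such rules (including ones on accelerando and crescendo) and defuzzify their aggregate, but the principle of turning semantic labels into a continuous index is the same.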
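A forward pass through the 5-3-1 network of Fig. 4 can be sketched as follows (Python for illustration; the weights below are random placeholders, not the trained values, which the paper does not report):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def tension_index(features, w_hidden, w_out):
    """5-3-1 feedforward pass: five nuance features -> tension degree in (0, 1).

    features = [kvel, d_kvel, dur, d_dur, art], assumed normalized to [0, 1].
    """
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)))
              for row in w_hidden]                     # 3 hidden activations
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

random.seed(42)  # placeholder weights standing in for the trained ones
w_hidden = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(3)]
w_out = [random.uniform(-1, 1) for _ in range(3)]

phrase = [0.8, 0.3, 0.2, 0.1, 0.6]   # hypothetical normalized nuance features
t = tension_index(phrase, w_hidden, w_out)
assert 0.0 < t < 1.0                  # output is always a valid tension degree
```

With a sigmoid output node the index naturally lies in the open interval (0, 1), matching the paper's range from "no tension at all" to "maximum tension".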
5. CONCLUSIONS

We have proposed a model for the coding of gestures using a two-level approach. Using this model, a system for interactive live performances has been developed. The system creates a synthetic performance in response to the pianist's performance, following the concept of controlled refractions. The system can be used to create a number of open compositions in which the resulting performance is shaped both by the choice of the different refractions (depending on the composer's choices) and by the interaction between the human and the synthetic performer (depending on the performer's gestures and nuances).

REFERENCES

Canazza S., Orio N. 1997. "How Are Player's Ideas Perceived by Listeners: Analysis of 'How High the Moon' Theme". Proc. of KANSEI - The Technology of Emotions, Genova, pp. 134-139.
Gabrielsson A., Juslin P. 1996. "Emotional expression in music performance". Psychology of Music, 24: 68-91.
Kosko B. 1992. Neural Networks for Signal Processing. Englewood Cliffs: Prentice Hall.
Li Y. F., Lau C. C. 1989. "Development of Fuzzy Algorithms for Servo Systems". IEEE Control Systems Magazine, April, pp. 65-71.
Orio N., De Pirro C. 1997. "Performance with Refraction: Understanding Musical Gestures for Interactive Live Performance". Proc. of KANSEI - The Technology of Emotions, Genova, pp. 42-47.
Orlarey Y., Lequay H. 1989. "MidiShare: a Real Time Multi-Tasks Software Module for MIDI Applications". Proc. of the International Computer Music Conference, San Francisco, pp. 234-237.
Pedrycz W. 1995. Fuzzy Sets Engineering. Boca Raton: CRC Press.
Risset J. C., Van Duyne S. 1996. "Real-Time Performance Interaction with a Computer Controlled Acoustic Piano". Computer Music Journal, 20(1): 62-75.
Senju M., Ohgushi K. 1987. "How are the player's ideas conveyed to the audience?". Music Perception, 4: 311-324.