Feel the Music: Narration in Touch and Sound

Maura Sile O'Modhrain
Center for Computer Research in Music and Acoustics
Stanford University
sile@ccrma.stanford.edu
http://www.ccrma.stanford.edu/~sile

Abstract

We describe the development of tools which allow us to track the relationship between multiple performances of a piece. Based on certain premises derived from Gestalt thinking, we develop a model for the relationship between music and haptics which allows us to explore the idea of a "narrative" element for haptic display. We report on preliminary work and suggest future directions.

1 Introduction

Any study which attempts to analyse musical performance must begin by tackling one knotty problem: how to define what is performance and what is piece. Mechanical instruments and computers have enabled us to render, more accurately than is humanly possible, the element of music that is the piece. What we seek now is a way of rendering the not-the-piece, the element which we shall hereafter refer to as "the performance."

Why are we interested in separating the two? The ability to independently control those elements which define performance opens up several exciting possibilities. Firstly, we can build a new class of instruments which already know how to read and play scores and which simply allow a player to control performance [4]. Secondly, we can make performance-related handles available during the process of digital music editing [2]. Thirdly, we can use this data to drive prediction algorithms to compensate for communication link timing lags and drop-outs in distributed rehearsal situations [1].

This paper introduces an altogether new approach to interpreting the element of not-the-piece, proposing for performance a model built on the idea of "narrative." Based on premises derived from Gestalt thinking, we propose a new modality for the presentation of performance-related data: the sense of touch and motion (called haptics). Imagine your hand being guided by a puck through a space as you listen to a performance of a piece. In that space, compass points represent eight past performances. As you hear the new performance, its closeness to each of these past performances is represented by the puck pulling your hand toward that compass point. Below we describe our development of just such a display. We can provide haptic feedback or interactive forces concurrently with audio playback, thereby creating a tool for the analysis of musical performance and performance measures.

The question in designing this or any virtual environment which uses haptic display to convey time-varying information is how to glue objects or sequences of guided movements together. What we seek is a haptic equivalent to telling a story, a haptic narrative. Consistent with ideas in Gestalt psychology, especially those put forward by Gibson [3], impressions of the outside world can be essentially amodal, not associated with a particular sense. Most objects in the environment give rise to multimodal experiences. Our memory of a story is not dependent upon whether we read or heard it first. What we are endeavoring to discover in the present study is to what extent we can exploit our ability to abstract information from its mode of representation. Can we create a new haptic interaction with a piece that has nothing to do with how it is played physically but instead tells us something else about its performance?
At the heart of our work, then, is the hypothesis that, like real objects, pieces of music can have an internal representation which is independent of the senses by which they are first perceived. Further, since both haptic and audio information are gathered sequentially, they must share cognitive processes for constructing hypotheses based on the perception of events which unfold over a period of time.
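As a minimal sketch of the compass-point pull described above (the structure, names, and gain below are our own illustrative assumptions written in C++, not code from the system itself), the force on the puck can be computed as a sum of pulls toward points on a circle, each weighted by the closeness of the new performance to the past performance located at that point:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct Vec2 { double x, y; };

    const double kTwoPi = 6.283185307179586;

    // Place n past performances at evenly spaced "compass points" on a circle.
    std::vector<Vec2> compassPoints(int n, double radius) {
        std::vector<Vec2> pts(n);
        for (int i = 0; i < n; ++i) {
            double theta = kTwoPi * i / n;
            pts[i] = { radius * std::cos(theta), radius * std::sin(theta) };
        }
        return pts;
    }

    // Pull the puck toward each compass point in proportion to how close the
    // new performance is to the past performance located there.
    Vec2 puckForce(const Vec2& puck, const std::vector<Vec2>& points,
                   const std::vector<double>& closeness, double gain) {
        Vec2 force = { 0.0, 0.0 };
        for (std::size_t i = 0; i < points.size(); ++i) {
            force.x += gain * closeness[i] * (points[i].x - puck.x);
            force.y += gain * closeness[i] * (points[i].y - puck.y);
        }
        return force;
    }

In the scenario above, eight past performances would correspond to n = 8, and the closeness values would be updated continuously as the new performance unfolds.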

In the sections which follow, we will briefly discuss previous work which has enabled us to develop our present hypotheses. Some theories from the field of cognitive psychology will provide us with a framework within which our concept of narrative can be developed. Finally, we will show that this concept of narrative can form the basis of a haptic display which will allow us to design experiments to test the hypothesis using data about past performances.

2 Background

In a previous paper [2], we describe a method for displaying music to the haptic senses, the senses of taction and kinesthesia, which projects timing and amplitude information about two performances onto a virtual wall. The wall's apparent stiffness is modulated by a parameter derived from note-onset and note-velocity data obtained from two performers playing the same piece. By pressing against this wall, it is possible to build up an impression of the way in which local note groups, and even larger-scale phrases, were articulated by each performer. The success of this experiment lay primarily in the discovery that we could directly map the performer's manipulation of musical tension to the tension or stiffness of our virtual wall. The building and relaxation of musical tension was directly related to the changes in compliance of the wall over time.

Building on this work, we pursued two discoveries. Firstly, we had found that, for piano music at least, we could control some elements of performance which are independent of the score but which are consistent for each player. Secondly, we realized that what we had essentially done was to substitute one haptic narrative for another. We had replaced the haptic feedback from the piano with feedback from another instrument: one whose feel had no direct correlation to piano technique. We further realized that we now had a way of displaying phrase articulation to the haptic senses which required no knowledge of the feel of the instrument playing the music, and we began to develop the hypothesis that there could exist a thread of haptic narrative which could be exploited in the design of a new instrument. Like Max Mathews' Radio Baton [4], it would require no knowledge of instrumental technique. But unlike the Radio Baton, which allows you to control the narrative element of MIDI playback by translating gestures into timing and loudness controls, it would take your hand on a tour of past performances, turning timing and loudness variations back into physical gestures.

3 Internal Representation

Central to our hypothesis is the idea put forward by Gibson that we build up an impression of the world around us by combining information gleaned from many senses [3]. A flower, for instance, is not simply defined by how it looks - it has a texture and a scent which are as much a part of its identity as its colour. White [5] took this one step further. He claimed that we build from abstract perceptual cues a model of our environment and, even though this model is a stylized and simplified abstraction, we believe it represents the truth. White called this tendency "distal attribution."

Now suppose we take these ideas and apply them to music. We have a single musical object, the piece. We can come into contact with a piece in many ways: we can hear it, look at a score, learn to play it, and so on. It is always the same piece, even when it is being whistled by a passer-by on the street. It possesses an integrity independent of its representation.
We have built an internal mental model which may or may not correspond to how we first experienced the piece. We can also build a hierarchical model with the piece in its most abstracted form at its top. Below this are various representations: the score, a recording, the musician's internal model of the piece as they learned to play it, and the listener's impression of a performance. Even further down this tree, branching from the performer, there are the various internal representations they have used to enable them to recall the music: visual memory (memory for the appearance of the score), kinesthetic memory (the memory of how the piece should feel to play), musical memory (an internal audio representation of the music), and perhaps some theoretical framework which has enabled the performer to come to an understanding of the piece's structure. When performing, the musician must draw on some form of internal representation of the music, but which representation they use is not clear; probably even they do not know. Whatever representation they use, their aim in performance is the same: to tell the story of the music based on their understanding of the piece to date.

What this hierarchy illustrates is that, no matter on which level you experience a piece of music, you take away an impression of the musical object that is independent of your point of contact with the piece. Listeners and players both build internal models which tell a story and it is, we propose, this narrative path through the piece which forms the basis for the independent representation of the musical object.
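Purely as an illustration, the hierarchy just described could be written down as a small tree with the abstracted piece at the root; the structure and labels below are our own sketch in C++, not a representation used by any system described here:

    #include <string>
    #include <vector>

    // Illustrative only: the hierarchy of representations described above,
    // with the abstracted piece at the root and the performer's internal
    // representations as deeper branches.
    struct Representation {
        std::string label;
        std::vector<Representation> children;
    };

    Representation pieceHierarchy() {
        return {
            "the piece (most abstract form)",
            {
                { "the score", {} },
                { "a recording", {} },
                { "the listener's impression of a performance", {} },
                { "the performer's internal model",
                  {
                      { "visual memory of the score", {} },
                      { "kinesthetic memory (how the piece feels to play)", {} },
                      { "musical memory (internal audio representation)", {} },
                      { "theoretical understanding of the structure", {} },
                  } },
            }
        };
    }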

4 Displaying the Narrative

In using haptic display to convey information about performance, we take advantage of a pre-existing connection between haptics and music: the performer/instrument interaction. Most instruments require the player to come into physical contact with a mechanism of some sort; thus the translation of gesture into sound involves two sensory modalities: the musician perceives the sound of the instrument but also relies on its mechanical response for information regarding the results of their actions. An important component of playing music, therefore, is the continuous interaction between player and instrument via the haptic senses.

One way to think of a person's interaction with our display is that, while they are performing tasks, they are acting as performers and should be provided with the kinds of sensory cues that pertain to performing. When they are exploring a space, however, they need different cues for remembering where they have been, just as the listener to a performance takes away a very different map of the piece they have heard from the one the performer has built for themselves. Both models rely heavily on past experience and both have strong narrative elements - the performer tells a story, the listener interprets it. How much the listener remembers, and what in particular they are able to recall, is of great interest here because it will determine how successful the performer has been in telling their tale.

How does this narrative element relate to our display? Imagine three performances of a piece. The first is a playback of a MIDI file containing exact note and timing data with constant key velocity. The second is an "in your face" interpretation in which the performer has one idea which they forcefully project, such as bringing out the top note of each arpeggiated figure in the piece by playing it louder. In the third performance, the performer wishes to engage the listener by telling a story: they articulate phrase structure and add tiny amounts of temporal variation which continuously force the listener to revise their hypothesis about where the piece is going and to become involved in the story the performer is telling.

What is the equivalent in designing a haptic space for exploration? One could say that the computerized rendering of the piece is like dropping someone in a haptic space and leaving them to search it and build up a mental model of its features for themselves, somewhat like exploring a sculpture gallery where the art works have no labels and there are no guidebooks. The "in your face" performance might be like taking the user around the space along a predetermined path which leaves them no room for exploratory interaction, somewhat like being rapidly shepherded through the art gallery by a tour guide. The engaging performance might be like bringing someone into a haptic space and guiding them to particular sites of interest, but then allowing them to explore these at their own speed, always free to integrate them with the rest of the haptic environment - providing them with a guidebook to explore the art gallery and labeling things clearly.

In other words, we would ideally seek to extract the narrative-defining elements from a performance and convert them into guided motions of our haptic puck. If we are successful, the audience, in this case the person holding the puck, should come away with a story that tells them, at any point in time, which of a set of past performances the one they are listening to most resembles.
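One hypothetical way to make "narrative-defining elements" concrete is as per-note deviations of a performance from a deadpan rendering of the score. The sketch below is our own illustration, with assumed structures and a naive one-to-one note alignment, not the analysis used in the system; it extracts the timing and loudness deviations from which guided motions could then be derived:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct Note {
        double onsetSec;   // when the note sounds
        int velocity;      // MIDI velocity, 0-127
    };

    struct Deviation {
        double timingSec;  // negative = early, positive = late
        double loudness;   // velocity difference from the deadpan rendering
    };

    // Compare a performance against a deadpan (constant-velocity, exact-timing)
    // rendering of the same score, note by note.
    std::vector<Deviation> narrativeFeatures(const std::vector<Note>& performance,
                                             const std::vector<Note>& deadpan) {
        std::vector<Deviation> features;
        std::size_t n = std::min(performance.size(), deadpan.size());
        for (std::size_t i = 0; i < n; ++i) {
            features.push_back({
                performance[i].onsetSec - deadpan[i].onsetSec,
                static_cast<double>(performance[i].velocity - deadpan[i].velocity)
            });
        }
        return features;
    }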
What we offer is a programmable relationship between haptic objects and musical objects. This relationship is possible because both share a cognitive process: the creation of a narrative to connect sequential events and to form hypotheses about their relationship to each other.

5 Our Haptic Display

Our haptic display device is the Moose, a two-degree-of-freedom planar device developed at CCRMA by Brent Gillespie. The Moose comprises two linear voice-coil motors connected to a puck, or manipulandum, by two perpendicularly oriented double flexures. The puck's position is tracked by two linear encoders and the whole is interfaced with a Pentium via a simple digital I/O card. As the user moves the puck, forces can be exerted on their hand by the motors. The playback of music is achieved through MIDI: MIDI data is output via a MIDI interface to a Yamaha Disklavier in real time. It is worth noting that, by using haptic display to convey information about past performances, we take advantage of a second channel of information exchange, leaving the auditory channel available to monitor the new performance being played through the system.

6 Software

Our haptic environment runs under DOS and is interrupt-driven, which ensures that position sensing and force output occur at a constant rate. We are also able to exploit this accurate timekeeping to precisely coordinate the output of forces to the Moose with the output of MIDI data to the Disklavier.
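The sketch below shows, under stated assumptions, how a fixed-rate servo routine might coordinate force output with MIDI output; the 1 kHz rate and all hardware-access functions are placeholders standing in for the Moose's digital I/O and the MIDI interface, not the actual driver calls:

    struct Vec2 { double x, y; };

    // Placeholder hardware access: the real system reads encoders and drives
    // voice-coil motors through a digital I/O card, and sends MIDI to the
    // Disklavier through a MIDI interface.
    Vec2 readEncoders()              { return { 0.0, 0.0 }; }
    void writeMotors(const Vec2&)    { /* drive the voice-coil motors */ }
    Vec2 computeForce(const Vec2&)   { return { 0.0, 0.0 }; } // e.g. pulls toward past performances
    bool midiEventPending(double)    { return false; }        // is an event due by this time?
    void sendNextMidiEvent()         { /* write the next event to the MIDI interface */ }

    const double kServoRateHz = 1000.0;  // assumed servo rate
    double elapsedSec = 0.0;

    // Called at a fixed rate (interrupt-driven under DOS in the real system).
    // Handling MIDI scheduling in the same tick keeps Disklavier playback
    // locked to the haptic display.
    void servoTick() {
        elapsedSec += 1.0 / kServoRateHz;

        Vec2 puck  = readEncoders();      // current puck position
        Vec2 force = computeForce(puck);  // force to display this tick
        writeMotors(force);

        while (midiEventPending(elapsedSec)) {
            sendNextMidiEvent();
        }
    }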

Our software, written in C++, draws upon a previously developed library of haptic objects to represent components of our data. Our environment has two distinct modes:

1) Tracking mode: Here a new performance is tracked continuously as it plays. Musical information obtained via MIDI [1] is transmitted via the MIDI interface to the Disklavier. At the same time, vectors obtained from the statistical classifier, which determine the closeness of this performance to any previous performance, are displayed as forces on the Moose puck which drag the user's hand toward that performance.

2) Exploration mode: Here the user's hand is no longer guided, but is free to explore the two-dimensional workspace of the Moose and to feel where each performance resides. Past performances are represented by "poles", virtual pillars located on the circumference of a circle. By touching one of the pillars it is possible to audition its associated performance. Furthermore, the virtual springs which link each past performance to the new performance, and which cause the puck to be pulled around the workspace in tracking mode, are tangible as grooves running from each performance location to the place where the puck stopped. These links can be broken, if desired, preventing their associated performance from influencing the puck.

The transition between modes is a simple toggle which acts like a pause control, resuming where it left off when the mode is switched back to tracking.

A further advantage of using haptic output is our ability to exploit certain binarisms which are common to both our data representation and our haptic senses. The vectors which we receive as the output of our classifier describe how close to or distant from each other two performances are within their parameter space. Since closeness and distance have direct correlates in haptic perception, we are able to take advantage of a binarism which exists in both modalities: we can move toward or away from a performance in our haptic workspace.

7 Conclusions and Future Directions

We have derived from the principles of Gestalt thinking, from the principles of amodal representation and distal attribution, the concept of a piece of music that can exist as an object independent of its mode of representation. Using this concept, we have developed the hypothesis that pieces can exist as representation-independent objects because our interpretation and understanding of how they work is based upon constructing a narrative to connect their components as sequences of events in time. The performance, the not-the-piece element of music, is, we have proposed, the primary narrative-bearing element in realising a musical score. Analysing performance, therefore, depends upon being able to access this narrative element.

One way of accessing narrative is to display performance data to a sensory modality which shares with music cognitive processes for interpreting sequences of events in time. The modality we have presented here which meets these conditions is haptic display, and we have presented preliminary work on a system for performance analysis based on this technology. Our primary objective now is to design some simple experiments which will allow us to discover to what extent musical and haptic narrative can interact. We are interested in discovering, for example, how we memorise music: in particular, what is the cross-modal interaction which allows us to recall both what a piece sounded like and what it felt like to play?
We feel that understanding how musicians learn may provide some insight into haptic memory which will one day be of use to instrument designers and designers of virtual environments.

8 Acknowledgments

The author wishes to acknowledge the contributions of Brent Gillespie to this work.

References

[1] Chafe, Chris. 1997. "Statistical Pattern Recognition for Prediction of Solo Piano Performance." Proceedings of the ICMC, Thessaloniki.

[2] Chafe, Chris, and O'Modhrain, Sile. 1996. "Musical Muscle Memory and the Haptic Display of Performance Nuance." Proceedings of the ICMC, Hong Kong, pp. 428-431.

[3] Gibson, J. J. 1962. "Observations on Active Touch." Psychological Review, vol. 69, pp. 477-491.

[4] Mathews, Max, and Pierce, John, eds. 1989. Current Directions in Computer Music Research. MIT Press, Cambridge, Massachusetts.

[5] White, B. W., Saunders, F. A., Scadden, L., Bach-y-Rita, P., and Collins, C. C. 1970. "Seeing with the Skin." Perception and Psychophysics, vol. 7, pp. 23-27.