THE PHASE PROJECT: HAPTIC AND VISUAL INTERACTION FOR MUSIC EXPLORATION

Xavier Rodet, Jean-Philippe Lambert, Roland Cahen
Ircam, 1, pl. I. Stravinsky, 75004 Paris, France
Xavier.Rodet@ircam.fr

Florian Gosselin
CEA-LIST, Route du Panorama, BP6, 92265 Fontenay aux Roses Cedex, France
florian.gosselin@cea.fr

Pascal Mobuchon
Ondim, 14 rue Soleillet, 75020 Paris
mobuchon@ondim.fr

ABSTRACT

PHASE is a research project concerning multi-modal systems for the generation, handling and control of sound and music. Some of the research results have been implemented in an interactive installation, a musical game integrating various metaphors. This installation, open to the public at the G. Pompidou museum for three months, was an extraordinary success, showing the validity of such a device and revealing many possibilities of gestural music control.

1. INTRODUCTION

The objectives of the PHASE project are to study interactive systems in which haptic, sound and visual modalities cooperate. The integration of these three interactive modalities offers innovative capabilities for combining gesture with force and tactile feedback, 3D visualization, and 3D sound. The objectives are scientific, cultural and educational:
-- To study the cognitive processes involved in musical exploration through the perception of sound, gesture precision, instinctive gestures and the visual domain,
-- To study new forms of interactivity (haptic interaction, 3D sound and 3D visualization in real time),
-- To propose new sorts of musical gestures, and to enrich the methodology for training technical gestures,
-- To improve the characterization of sound in relation to music and to offer a new kind of listening,
-- To evaluate new methods of sound generation with an audience of both amateurs and professionals,
-- To propose a new form of musical awareness for a large audience, specialized or not.
A final objective was to test such a system, including its haptic arm, in real conditions with the general public and over a long period, in particular to assess its interest for users. A demonstration of some of the PHASE results was thus presented and evaluated at the G. Pompidou museum. It is an interactive installation offering the public a musical game. Unlike a video game, its aim is to play music and to encourage musical awareness.

The fine and delicate nature of musical composition or instrument playing tends to favor sensitivity and dexterity over maximum force capabilities or a large workspace [6]-[9]. Therefore, a haptic interface (Virtuose 3D 15-25, Fig. 4) has been designed and built by the Haption firm to allow natural and intuitive manipulation and exploration of sound and music. Its motion range, effort, position resolution, force resolution and global stiffness are around 200 mm, 15 N, 0.1 mm, 0.5 N and 1500 N/m respectively.

2. GESTURE INTERACTION WITH MUSIC

2.1. Gesture, music and metaphors

Studies were carried out to measure the relevance and the properties of some gestures in the context of musical control. Playing a traditional instrument is one example but, more generally, we explored which freedoms could be left to players to express themselves in a given musical space, in order to produce music at a satisfactory expressive level. Expressive parameters include rate, rhythm, timbre, tonality, etc.

The user's gesture is carried out in real space (by the hand). Feedback from the system (haptic, sound and visual) influences the player's gesture. For example, haptic feedback can be used as a guide to help the user play a musical instrument, which otherwise requires long practice. A scenario for the course of play, as in the installation's game, can also facilitate training. Unlike vision and hearing, the haptic feedback loop is direct, because gesture and force feedback are both localized in the hand of the user.
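The idea of haptic feedback acting as a guide can be illustrated with a minimal sketch. It assumes a simple spring model that pulls the hand toward the closest point of a straight guide segment; this control law, the function name `guide_force` and the segment representation are illustrative assumptions, not the controller used in PHASE. Only the stiffness (1500 N/m) and the maximum effort (15 N) are taken from the figures quoted above.

```python
import math

STIFFNESS_N_PER_M = 1500.0  # global stiffness quoted in the text
MAX_FORCE_N = 15.0          # maximum effort quoted in the text


def guide_force(hand, a, b, stiffness=STIFFNESS_N_PER_M, max_force=MAX_FORCE_N):
    """Hypothetical guide: spring force (N) pulling `hand` toward segment a-b.

    `hand`, `a` and `b` are 3D points in metres. The force is clamped so
    that its magnitude never exceeds `max_force`.
    """
    ab = [bi - ai for ai, bi in zip(a, b)]
    ah = [hi - ai for ai, hi in zip(a, hand)]
    ab_len2 = sum(c * c for c in ab)
    # Parameter of the orthogonal projection of the hand onto the segment.
    t = 0.0 if ab_len2 == 0.0 else sum(x * y for x, y in zip(ah, ab)) / ab_len2
    t = min(1.0, max(0.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    # Hooke's law: force proportional to the distance from the guide.
    force = [stiffness * (ci - hi) for ci, hi in zip(closest, hand)]
    norm = math.sqrt(sum(f * f for f in force))
    if norm > max_force:
        force = [f * max_force / norm for f in force]
    return force


if __name__ == "__main__":
    # Hand 5 mm above a 20 cm guide lying along the x axis.
    print(guide_force((0.0, 0.005, 0.0), (-0.1, 0.0, 0.0), (0.1, 0.0, 0.0)))
```

Under these assumptions, a hand 5 mm away from the guide receives a restoring force of 7.5 N, and the force saturates at 15 N once the hand is more than 10 mm away.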
Coherence of the various modalities between the real and virtual worlds must be guaranteed. Therefore, we defined a metaphor (Fig. 1) as the link between the real world, where the hand acts, and the virtual music world, where the avatar acts. Several types of metaphors were tested (Section 4).

Figure 1. Interaction metaphor

2.2. Experiments on music interaction

Let us give two examples of metaphors combined in the installation. The first one describes a needle inside a vinyl disc groove. Depending on whether the disc is spinning or not, different temporal controls are obtained. The temporal pace is controlled directly when one holds the needle and moves it along the groove. If time itself moves along in space, one controls the relative reading speed, as if the vinyl disc were spinning: one controls the speed of the disc, i.e. the course of the music recorded on it. Music could be recorded not

only in the groove itself but also in its vicinity.

In another metaphor, objects defined in the physics engine have fixed characteristics (size, mass, etc.) and particular temporalities and dynamics. Objects are given behaviours, and visual and sound appearances are associated with them. This mapping strategy [1] uses several layers to connect physical objects, visual objects and sound objects (Fig. 2).

[Figure 2 diagram. Physics engine outputs: speed and acceleration in the global reference; new contact; pressure; penetration; relative position, speed and acceleration of the object in contact; contact point position, speed and acceleration in the object reference; end of contact. Behaviours: shock, pressure, friction, movement, loss of contact. Sound engine: resonance models, granular synthesis, combination of sinusoids, samples, formants.]

Figure 2. Mapping physical objects, visual objects and sound objects with a behaviour layer.

2.3. Playing music

Composer R. Cahen has been deeply involved in generating and handling sound and music [2]. Different ways of using gesture for playing music have been studied: the player can have access to rhythm, timbre, tonality, etc., and different gestures can be used for this control. Different modes of playing have been identified: positioning oneself in music, browsing music, conducting music, playing music as with an instrument, etc.

Different kinds of music can be played. Well-known music can easily be identified even when it is transformed, and user control of a temporal position is easy, but such fixed music is not adjustable. On the other hand, specific music can be composed so as to be modified by players. Furthermore, music processes and generators can be designed so as to produce music which interacts with players.
Different interactive musical tools have thus been built, allowing one to:
-- Manipulate musical phrases, specifically created and easy to transform.
-- Change musical parameters in "play" time and/or real time, to create different variations.
-- Generate a staccato flow enabling numerous programmed and interactive variations close to real time.

Some work on recorded or generated sound material has also been done in PHASE using granular synthesis, which offers access to time control, allows the music to be re-played in different ways, and makes transposition independent of time position and speed control. For example, one can directly control the temporal development, or control a relative tempo, with gestures. One can manipulate short-time musical structures by modifying a time window, in loops or as a stream. Another approach consists of constructing a virtual physical world and listening to it in different ways.

2.4. Sound Navigation

Sound Navigation consists of browsing through different sound sources placed in a scene, to compose a musical form from the mix produced by movements and interaction with the sources. The player can freely choose between a simple visit, playing with sound behaviours, or performing a personal expressive path. The choice of a musical implementation of the sound sources, related to their interaction metaphor, is determined by the musical functions attributed to them in the composition. As the player can choose to trigger different sound behaviours in any order he or she chooses, the horizontal construction is determined by the interaction setup of the environment. For example, two sounding objects in different places cannot be hit at the same time, and there is a speed limit when hitting the two objects successively.

2.5. Gesture and haptic feedback

With the help of F.
Bevilacqua, a study has been done to compare a graphic tablet and a force-feedback arm, to identify significant parameters for real-time sound and music generation, and to evaluate the limitations of the graphic tablet. Subjects listen to musical phrases (recorded from a marimba) three times and then trace a representation of the musical movement, three times with each device. They can freely choose a representation but have to stay coherent across the six traces. A simple horizontal plane is simulated with the arm. Representations were recorded and informally analyzed.

As expected, the absolute position and, to a lesser extent, the direction of gestures vary widely in the absence of a haptic guide. Averaged absolute speeds and accelerations appear relatively similar for the two devices, for the same subject and the same phrase. The arm designed in the project proved to be much better than previous ones because of its lower inertia. Rather good coherence between the various phrases is observed for the same subject, as well as between subjects in the case of short phrases. Representations are well correlated with phrase structure (as in [3]).

Furthermore, Fitts' law [4] and its derivative [5] have been verified on the representations. This leads to the idea of using the various characteristics bound by the law (briefly, speed and radius of curvature) independently, i.e. in contradiction with the law, for purposes of expressivity: the user can choose to violate the law by using a smaller or a larger radius of curvature than the speed of his or her hand would imply. The expressive possibilities of this principle proved to be very interesting in some musical sketches.

3. MULTI-MODAL INSTALLATION

The developments made in the PHASE project led to a multi-modal installation intended for the public in real conditions, presented in the Centre Georges Pompidou for three months and used by thousands of children and

elderly people. While the music of video games is put at the service of the visuals or of the interactive scenario, our research was devoted to the use of the three modalities for musical awareness: how to make the general public aware of new modes of listening and of musical creation. To answer this question, a simple scenario with a very intuitive beginning has been designed, based on the metaphor of a vinyl disc groove [10], [11]. It allows for many possibilities of interaction between the two heads (writing and reading, Section 4.4) and the human player (Section 4).

The installation is made up of (Fig. 3): a computer (RTAI-Linux) for the haptic arm; a computer (Windows XP) to run physical interaction simulation (Vortex), scenario and graphics software (Virtools) and Ethernet communications (OSC [12] for all audio); two Mac OS X computers for sound and music generation (MAX/MSP) and 3D sound (Spatialisateur); eight loudspeakers and two video projectors for 3D sound and 3D visuals (Fig. 4).

Since the installation was to be used by the general public for a long period, particular care was given to its safety and reliability. To allow for elbow-supported manipulation, an ergonomic desk was designed at CEA-LIST (Fig. 4). It enables precise and comfortable use without fatigue, and an immediate understanding of the workspace of the arm.

4. MUSICAL CREATION IN THE INSTALLATION

4.1. Metaphors

In the PHASE installation, we have mainly developed three metaphors: obstacles, zones, and music groove, which can play together or separately. We decided that each should have a good sound behaviour on its own and a complementary orchestral function when they play together. Moreover, their sound behaviour has not only aesthetic functions but also informative functions, such as signalling whether the player's avatar is touching the bottom of the groove or the ground, informing the player that a visually hidden obstacle is coming soon, indicating excessive pressure on an object, etc.

4.2.
Obstacles: direct interaction

Direct interaction is close to reality. Physical objects are also sound sources. The interaction between the user and the objects is similar to the playing of a musical instrument. Obstacles are objects of various shapes in the game space. They are separated into two types, fixed and mobile obstacles, and they have several different sounds.

Musical choice: Each kind of obstacle has its own sound, but the sound behaviour varies widely with the type of interaction:
-- Approach uses groups of sinusoids with close frequencies, and filtered noise for a wind effect.
-- Hitting depends on the size of the objects, their hardness, etc.
-- Pressing on objects indicates elasticity, granularity, etc.
-- Scrubbing resonates in the same way as hitting.

Musical result: The main goal is to create non-realistic but credible sound effects, which would fit imaginary objects. Unlike movie sound effects, these sounds are always changing, which implies a complex model and complex results (unfortunately, physical models are still too expensive). Most signal models would become repetitive after three hits, but in our implementation, variations make it possible to play with each object for a much longer time.

Figure 3. Architecture

Figure 4. Haptic arm in the installation

4.3. Zones: divertimento variations

Zones are musical playing spaces on the sides of the groove, represented by grass that moves when touched. They allow real-time musical control of animated sound processes according to the position inside the zone, the speed of entry into the zone, and the pressure and velocity of movement. The musical choice consists of short sample loops that are shaped in relative duration, position, pitch, etc., according to interactive user controls.
Technical implementation: Using a cluster of groove~ objects (footnote 1), controls are mapped to change continuously the low and high limits of the group for each parameter. All the other sample instances are interpolated between these limits. Spatialization is point to point, each instance corresponding to a specific track.

The musical result of the zones metaphor is probably the most interesting one. One can play with the various zones, taking advantage of the precision of the haptic arm. Each zone is different, giving a wide range of sounds and musical behaviours.

1. Max/MSP object for playing and looping samples at variable speed.
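The interpolation of sample instances between the low and high limits of the group can be sketched as follows. This is a minimal illustration assuming linear interpolation and evenly spaced instances; the text only states that the instances are interpolated, so the interpolation law, the function name `spread_parameter` and the example values are assumptions.

```python
def spread_parameter(low, high, n_instances):
    """Hypothetical helper: one parameter value per sample instance.

    The first instance takes the low limit, the last takes the high limit,
    and the others are linearly interpolated between the two (the linear
    law is an assumption, not taken from the installation).
    """
    if n_instances < 1:
        raise ValueError("n_instances must be at least 1")
    if n_instances == 1:
        return [low]
    step = (high - low) / (n_instances - 1)
    return [low + i * step for i in range(n_instances)]


if __name__ == "__main__":
    # Playback speed of four looping instances, limits set by the gesture.
    print(spread_parameter(0.5, 2.0, 4))
```

In such a scheme the gesture only has to drive two values per parameter (the limits), while the whole cluster of instances follows continuously.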

4.4. Music groove: main clue

The Writing Head (WH) is a bright blob of light, running in the groove at some distance in front of the player. It generates and plays music according to the player's actions and position in the scenario. It writes the music into the groove, like that of a vinyl disc, in the form of a visual and haptic trace made of colour splashes corresponding to notes.

The Reading Head (RH): The player holds the RH, another blob of light, which pursues the WH when placed inside the groove. The RH replays the music after a delay that is a function of the distance between the WH and the RH. The groove scrolls when the RH is inside it, so that the player indirectly controls the speed of the re-played music. To follow or even catch up with the WH, the player must follow the music as exactly as possible along the trace.

The choice of this metaphor is justified by:
-- An analogy with the real world: a vinyl disc groove.
-- A well-known game mode: a race.
-- A musical principle of re-play, close to a canon.
-- Ease of gesture: the groove is a guide for the player.

Musical choice: The player listens to and plays with the WH, a standard musical situation, except that the tempo varies continuously with the relative speeds of the two heads and the delay between the two voices keeps changing, so that there is neither harmonic nor rhythmic synchronization. The result is like an imitation in echo, more or less catching up. It was found important that the music could be identified as a continuous, omnipresent character and that the two voices could be distinguished. Two kinds of musical approaches appeared interesting:
-- Very expressive lines, like singing voices.
-- A rhythmic interplay, like a master and a student.

The musical implementation is an ostinato generator with three voices: attacks, tone and bass. The tone follows a modal constraint; the attacks vary according to irregular velocity hits among chosen tempo divisions.
The bass either precedes the tone or follows it when the pattern changes. The number of tempo steps changes from one period to another. The generator (fluidsynth~, footnote 2) plays instrumental sounds (steel drum, prepared piano, cithara...) and other short samples. The WH sound then passes through a gabor~ (footnote 3) delay line and is replayed with granular synthesis, changing the reading speed without changing the pitch. The result is a rich material with many nuances, coupled with gestural control. As an example, it is particularly interesting when the player (RH) is suddenly stopped by an obstacle just before catching up with the WH: in this case, the runaway of the WH sounds very clear and jubilant.

4.5. Game scenario and music

Music and player actions are strongly tied together in the scenario, i.e. the score is the game or its rules, and the interpretation comes out of the player's involvement. The musical result is then dependent on the scenario and on the physical world. The mode of competitive play with the WH is engaging for the player but is not imposed. At any time, it is possible to stop, leave the groove and play with sounding objects or zones of interaction. This helps immersion and sound and musical awareness, through the richness of sonorities and interaction. When the player stops, the WH waits in silence and starts again as soon as the player comes back into the groove. There is no true "scoring" of the player's performance, but his or her progression is evaluated in order to adjust the behaviour of the system: a player having difficulties and a contemplative one are different, but in both cases the system slows down, thus making the system easier to use.

2. Soft sampler by Peter Hanappe and Max/MSP object by Norbert Schnell.
3. Granular synthesis Max/MSP object working in a delay line, by Norbert Schnell.

5. CONCLUSION

The installation open to the public at the G. Pompidou museum for three months was an extraordinary success (around 20,000 visitors).
While a user was playing, "interpreting", or skimming through R. Cahen's composition, the performance was recorded on a CD given to him or her at the end of the session. The public was enthusiastic, underscoring the interest, the innovation and the pedagogical aspects of the project.

The PHASE research project encompasses the use of gestural control with haptic, visual and sound feedback, with the aim of playing music. Hardware, software and methodological tools were built to allow for the development and the realization of metaphors having a musical aim. These developments are now available and usable. PHASE opens the way to many possibilities of gestural music control and opens a new direction in music composition.

6. REFERENCES

[1] A. Hunt, M. Wanderley, R. Kirk, "Toward a Model for Instrumental Mapping in Expert Musical Interaction", Proc. of the Int. Comp. Music Conf., Beijing, Oct. 1999.
[2] R. Cahen, "Générativité et interactivité en musique et en art électroacoustique", 2000, http://perso.wanadoo.fr/roland.cahen/Textes/CoursSON.html/MusiqueInteractive.htm
[3] C. Cadoz, M. M. Wanderley, "Gesture - Music", in M. M. Wanderley and M. Battier, eds, Trends in Gestural Control of Music, Ircam - Centre Pompidou, 2000.
[4] P. M. Fitts, "The information capacity of the human motor system in controlling the amplitude of the movement", Jour. Exper. Psych., 47, 1954, pp. 381-391.
[5] J. Accot, S. Zhai, "Beyond Fitts' law: Models for trajectory-based HCI tasks", in ACM/SIGCHI: Conference on Human Factors in Computing Systems (CHI), 1997.
[6] T. H. Massie, J. K. Salisbury, "The PHANToM haptic interface: a device for probing virtual objects", Proc. ASME Winter Annual Meeting, Symp. Haptic Interf. Virtual Env. and Teleoperator Syst., Chicago, Nov. 1994.
[7] D. A. McAffee, P. Fiorini, "Hand Controller Design Requirements and Performance Issues in Telerobotics", ICAR 91, Robots in Unstructured Environments, Pisa, Italy, June 1991, pp. 186-192.
[8] G. Burdea, P.
Coiffet, "La réalité virtuelle" (in French), Hermes Publishing, Paris, 1993.
[9] F. Gosselin, A. Riwan, D. Ponsort, J. P. Friconneau, P. Gravez, "Design of a new input device for telesurgery", World Automation Congress 2004, Proc. ISORA04 10th Int. Symp. on Robotics and Applic., June 28-July 1, 2004, Seville, Spain, Paper ISORA 120.
[10] R. Cahen, "Navigation sonore située", 2002, http://perso.wanadoo.fr/roland.cahen/Textes/CoursSON.html/MusiqueInteractive.htm
[11] V. Gal, C. Le Prado, S. Natkin, L. Vega, "Writing for video games", Proc. Virtual Reality Int. Conference (VRIC), Laval, France, June 2002.
[12] M. Wright, A. Freed, "Open Sound Control: A New Protocol for Communicating with Sound Synthesizers", Proc. Int. Computer Music Conference (ICMC), Thessaloniki, Greece, September 1997.