EXTRACTION OF CONDUCTING GESTURES IN 3D SPACE

Forrest Tobey, Ichiro Fujinaga
Peabody Conservatory of Music
Baltimore, MD 21202
ICMC Proceedings 1996

ABSTRACT

The authors' conductor-following system, previously designed to provide rubato control of overall tempo through continuous baton tracking in the x-y plane, has been expanded to extract control data along the z-axis as well. This data is used to interpret a conductor's expressive intentions, allowing the system to effect real-time control over such musical parameters as dynamic fluctuations, styles of articulation, and timbral balances. Thus the addition of data from the z-axis (movements away from and towards the body) has the potential for capturing aspects of a conductor's emotional interpretation of a work of music. Furthermore, the system's learning capability, that is, its ability to adapt to different conducting styles, is also greatly enhanced. 3D conductor tracking is the next step in developing a fully integrated system for the recognition and interpretation of a conductor's gestures. The applications for such an enhanced system are numerous: student conductor training, conducting style analysis, general expression analysis, conducting hybrid human/computer ensembles, and performance of compositions written for the 3D system. Data input for the system currently employs two Buchla Lightning cameras and a pair of wands.

1 Capturing Musical Expression by Tracking a Conductor's Gestures

An important trend in the field of interactive computer music performance is the tracking of human musical performance gesture and the use of the resultant data as control sources for the triggering and shaping of electronically generated sounds. This trend encompasses the tracking of both traditional performance gesture on acoustic instruments (the shakuhachi flute, for example, in [Katayose, 1994]) as well as the tracking of performance gestures that are specific to a variety of new electronic performance devices (e.g.
the Buchla Lightning [Rich, 1991], the Radio Drum [Mathews, 1989], the Video Harp [Rubine, 1988] and the Hands [Waisvitz, 1985]). What links the tracking of traditional and nontraditional performance gesture is the emphasis placed on larger physical motions, that is, on motions that are not specific to the generation of individual pitches. In the shakuhachi example, the motions of the head that form a central aspect of the performance practice are tracked in addition to the placement of the fingers on the sound holes. All the electronic instruments cited above, while capable of generating individual pitches in the manner of acoustic instruments, are best employed when larger gestures of the hands and arms are used to control musical processes.

The traditional role of the conductor performing with an orchestra can, in this regard, be viewed as similar in practice to these new electronic control instruments. The conductor does not play the actual notes that are in the score. Instead, he controls the timing and shaping of the score through an established grammar of human arm gesture. This gesture grammar is predictive in nature, and communicates to the player four essential aspects of musical performance: when he is to play, the type of attack he is to use, how he is to continue playing after his entrance, and when he is to relinquish control [Prausnitz, 1983]. In tracking the gestures of a conductor, we are tracking motions that communicate the expressive and interpretive aspects of a musical performance, resulting in data that digitizes a specific type of musical expression.

2 The Communicative Nature of the Conductor's Baton

The conductor communicates to the orchestra in three essential ways: with the baton, with the left hand, and with the eyes. Of these three, the use of the baton is by far the easiest to codify, and the authors have endeavored to extract as much information as possible solely through tracking the motion of the baton.
Through the use of the baton alone, the experienced conductor can effect subtle variations in expressive performance. By varying the velocity and acceleration of the baton between beat points, the conductor can achieve a fluctuating rhythmic flow and also indicate specific styles of attack. By changing the size of the lift and fall of the baton and by controlling its relative distance away from the body, the conductor can indicate distinct levels of dynamic contrast. By pointing the baton to various parts of the orchestra, the conductor can effect continuous adjustments to the timbral content of the orchestral balance.

We have based our conductor-following system on the knowledge gained from a detailed study of baton technique. The result is a system for gestural control that reflects an established performance practice developed over centuries. In analyzing the conductor's craft, it was recognized that the baton communicates to the orchestra not just through its movement in the vertical and horizontal dimensions, but also through its movement away from and towards the conductor's body. For the field of conductor-following to truly realize its potential, it has therefore proved beneficial to expand the field of vision by viewing the motions of the conductor's baton along the z-axis as well.

3 Acquisition of Data in Three Dimensions

Four previously presented systems for conductor following [Mathews, 1989; Keane, 1990; Morita, 1989; Carosi, 1993] have emphasized the identification of the beat points of the conductor's baton (changes in vertical direction) and the resultant control over the timing of a sequential MIDI file. Other researchers have emphasized a continuous tracking of the conductor's baton, either through the use of neural nets [Garnett, 1992; Zeungnam, 1992] or by the mapping of fluctuating wand velocities to resultant shifts in tempo [Tobey, 1995].
These researchers have collectively demonstrated the success of using a variety of specialized conductor's batons interacting with custom-designed software in order to control the basic musical parameters of tempo and, to a lesser extent, dynamic levels through the use of a two-dimensional conductor-following system.

Conducting, however, is a three-dimensional art. We have therefore added a second Buchla Lightning camera to our system, resulting in a simple but immediately effective method for conductor-following in three dimensions. One camera, facing the conductor, reads in continuous controller data on the horizontal and vertical axes. A second camera, positioned to the side of the conductor, tracks the movements of the wand away from and towards the conductor's body. The three sets of data are dynamically linked in software. In fact, this adaptability of the Buchla Lightning for working in three dimensions has much to recommend it as the "baton of choice" for future conductor-following development, especially since the device is subject to periodic upgrades and its users can rely on good technical support from its creator. (The authors are currently using the Lightning II.) Also, we have found that the position of the wand can be reliably sampled at twenty-millisecond intervals when each zone is set to a distinct continuous controller output (resulting in a "grid" of 256 x 512 points), a sample rate equal to or better than that of systems that employ video acquisition.

4 Interpretation of Data in Three Dimensions

In order to test the conducting system, we have modeled the real-life situation of a conductor working with an orchestra. Common orchestral literature, converted to standard MIDI files with enhancements to indicate the beginnings and endings of phrases, has been employed as an effective method for testing the interpretive abilities of the conductor-following algorithms.
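
As an illustration of how the two camera streams described in Section 3 might be linked in software, the following is a minimal sketch and not the authors' implementation. It assumes that the front camera delivers timestamped (x, y) readings and the side camera timestamped z readings, and it pairs each front reading with the nearest side reading in time. The names `Sample3D`, `fuse` and `max_skew_ms` are hypothetical; only the 20 ms sampling interval comes from the text.

```python
from bisect import bisect_left
from dataclasses import dataclass

SAMPLE_PERIOD_MS = 20  # sampling interval reported in the text


@dataclass(frozen=True)
class Sample3D:
    """One fused wand position (hypothetical structure, for illustration)."""
    t_ms: int
    x: int
    y: int
    z: int


def fuse(front, side, max_skew_ms=SAMPLE_PERIOD_MS // 2):
    """Pair front-camera (t_ms, x, y) readings with side-camera (t_ms, z) readings.

    Both lists are assumed to be sorted by time. A front reading is kept only
    if a side reading exists within max_skew_ms of it (an assumed tolerance).
    """
    side_times = [t for t, _ in side]
    fused = []
    for t, x, y in front:
        i = bisect_left(side_times, t)
        # Candidates: the side readings just before and just after time t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(side)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(side_times[k] - t))
        if abs(side_times[j] - t) <= max_skew_ms:
            fused.append(Sample3D(t, x, y, side[j][1]))
    return fused


if __name__ == "__main__":
    front = [(0, 10, 20), (20, 12, 25), (40, 14, 30)]
    side = [(2, 5), (21, 6), (70, 9)]
    for sample in fuse(front, side):
        print(sample)
```

In this sketch a front reading with no sufficiently close side reading is dropped rather than interpolated, which keeps the example short; a real system might prefer to hold the last known z value instead.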
Testing with this material has resulted in the following levels of real-time control:

Tempo Control

Through continuous baton tracking, changes in the acceleration curves between beat points are recognized as indications for fluctuations in overall tempo. A sudden decrease or increase in acceleration during the lift and fall of the baton, when compared to the previous series of beats, will result in smooth changes in the tempo of the performance. We have found, in fact, that there is no need to place in the score file itself any indications as to tempo or tempo changes. The file needs only to reflect the relative note values in relation to the conductor's beat, in emulation of the ensemble player who begins a performance knowing how many beats the conductor will give in any particular measure. The system will adjust the speed of the output depending on the tracking of the conductor's baton, without any need to refer to a tempo map or a predetermined tempo indication.

Dynamic Control

Tracking in three dimensions has allowed for a much more realistic control over the dynamic levels of the performance. While there is no absolute method for indicating dynamics to the orchestra, we have chosen to use the spatial location of the beat points in relation to the distance from the conductor's body as an indication for overall changes in dynamic levels. In fact, just as in tempo control, there is no need to indicate dynamic levels
in the score. The preparation of an orchestral MIDI file will necessarily include note-on velocities, which will be specific to each orchestral sample chosen, but once the overall velocities are established, the general control over orchestral dynamic levels is left completely to the conductor's discretion, as controlled by the placement of his beat points in relation to his body.

Beat Pattern Recognition

Data from the x-y plane is used in our system to find the basic beat pattern at any given point in the score, with special attention paid to recognizing the downbeat of each measure. This information can be of use in performance, as clearly placed downbeats can serve to confirm to both human and computer performers the current place in the score.

Beat Style Recognition

An important aspect of conducting lies in the "style" of the conductor's beat, best defined as the relative angle of the wand as it moves towards and away from the beat point. A smooth curve generally results in a legato response from the orchestra, while a more angular beat indicates a staccato or marcato style of performance. We have found that beat style recognition can be mapped to various real-time control aspects of the sampled or synthesized sound, resulting in continuous control over a given sound's attack onset and evolution.

Accentuation Control

Tracking in three dimensions has also increased the potential for achieving control over individual attack velocities of notes at any given point in a performance. A common practice among orchestral conductors is to prepare a sforzando attack by bringing the baton close to the body, followed by a sudden fall downwards and away. The use of the z-axis for orchestral attacks makes this level of dynamic control quite reproducible.
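
The beat-point-driven controls described in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the system's actual algorithm. It assumes samples of the form (t_ms, x, y, z) with y increasing upward, a z range of 0 to 127, and louder dynamics for beat points placed farther from the body (the text does not state the direction of the mapping). Tempo is estimated here from the interval between successive beat points, a simpler stand-in for the acceleration-curve analysis described under Tempo Control. All function names and parameter values are hypothetical.

```python
def find_beat_points(samples):
    """Return the samples where the wand stops falling and begins to lift.

    samples: list of (t_ms, x, y, z) tuples in time order, with y assumed
    to increase upward. A beat point is taken to be a local minimum of y.
    """
    beats = []
    for prev, cur, nxt in zip(samples, samples[1:], samples[2:]):
        if cur[2] < prev[2] and cur[2] <= nxt[2]:
            beats.append(cur)
    return beats


def dynamic_scale(z, z_near=0, z_far=127, min_scale=0.4, max_scale=1.0):
    """Map a beat point's distance from the body to a loudness factor.

    Assumption: farther from the body means louder. The range limits and
    scale bounds are illustrative values, not taken from the paper.
    """
    frac = (z - z_near) / (z_far - z_near)
    frac = max(0.0, min(1.0, frac))  # clamp to the calibrated range
    return min_scale + frac * (max_scale - min_scale)


def scale_velocity(velocity, scale):
    """Apply the loudness factor to a stored MIDI note-on velocity (1..127)."""
    return max(1, min(127, round(velocity * scale)))


def tempo_bpm(beats):
    """Estimate tempo from the last two beat points (simplified stand-in)."""
    if len(beats) < 2:
        return None
    interval_ms = beats[-1][0] - beats[-2][0]
    return 60000.0 / interval_ms


if __name__ == "__main__":
    samples = [
        (0, 0, 50, 10), (100, 0, 30, 20), (200, 0, 10, 0), (300, 0, 30, 30),
        (400, 0, 50, 20), (500, 0, 30, 60), (700, 0, 5, 127), (800, 0, 25, 100),
    ]
    beats = find_beat_points(samples)
    print("beat times (ms):", [b[0] for b in beats])
    print("tempo (BPM):", tempo_bpm(beats))
    print("scaled velocity:", scale_velocity(100, dynamic_scale(beats[-1][3])))
```

Because the score file keeps its own note-on velocities, the sketch only scales them; this mirrors the statement above that the stored velocities are fixed per sample while the overall level is left to the conductor.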

Timbral Balances

Finally, a certain level of timbral control over the orchestral sound can be achieved by tracking the angle at which the baton is pointing and mapping that angle to certain orchestral instruments. While this has achieved limited success, it is clearly at the level of timbral balance and orchestral cueing that left-hand input and control are needed. Adding left-hand control through some type of data glove will be the next topic for our research and development.

5 Applications

We have found our system for following the baton of a conductor to be robust both for virtual orchestral conducting and for conducting hybrid computer-human ensembles in live performance. The virtual conducting module could be of great benefit to the student conductor, and we are investigating ways to incorporate the system into an educational setting as an aid for the novice conductor. The system has been used on numerous occasions in performance with human ensembles, and it is hoped that composers will begin to take an interest in writing works for live ensemble and conducted electronics.

REFERENCES

[Carosi, 1993] P. Carosi and G. Bertini. Light Baton: A System for Conducting Computer Music Performance. Proceedings of the 1992 International Computer Music Conference, 1992.
[Garnett, 1992] G. Garnett, M. Lee, and D. Wessel. An Adaptive Conductor Follower. Proceedings of the 1992 International Computer Music Conference, 1992.
[Katayose, 1994] H. Katayose. Demonstration of Gesture Sensors for the Shakuhachi. Proceedings of the 1994 International Computer Music Conference, 1994.
[Keane, 1990] D. Keane, G. Smecca, and K. Wood. The MIDI Baton II. Proceedings of the 1990 International Computer Music Conference, 1990.
[Mathews, 1989] M. Mathews. The Conductor Program and Mechanical Baton. Current Directions in Computer Music Research, Cambridge, MA: MIT Press, 1989.
[Morita, 1989] H. Morita and S. Ohteru.
Computer Music System which Follows a Human Conductor. Proceedings of the 1989 International Computer Music Conference, 1989.
[Prausnitz, 1983] F. Prausnitz. Score and Podium. New York: W. W. Norton and Company, 1983.
[Rich, 1991] R. Rich. Buchla Lightning MIDI Controller. Electronic Musician 7(10), 1991.
[Rubine, 1988] D. Rubine. The Video Harp. Proceedings of the 1988 International Computer Music Conference, 1988.
[Tobey, 1995] F. Tobey. The Ensemble Member and the Conducted Computer. Proceedings of the 1995 International Computer Music Conference, 1995.
[Waisvitz, 1985] M. Waisvitz. The Hands. Proceedings of the 1985 International Computer Music Conference, 1985.
[Zeungnam, 1992] B. Zeungnam. On-Line Analysis of Music Conductor's Two-Dimensional Motion. 1992 IEEE Conference on Fuzzy Logic, 1992.