Jam'aa - A Middle Eastern Percussion Ensemble for Human and Robotic Players

Gil Weinberg, Scott Driscoll, and Travis Thatcher
Music Department, Georgia Institute of Technology
gilw, scott.driscoll, travis.thatcher@gatech.edu

Abstract

We describe an interactive performance piece titled Jam'aa for two human percussionists and a robotic drummer. Our robot, named Haile, is designed to listen to live players, analyze their drumming in real time, and use the product of this analysis to play back in an improvisational manner. It is designed to combine the benefits of computational power and algorithmic music with the richness, visual interactivity, and expression of acoustic playing. We believe that when collaborating with live players, Haile can facilitate a musical experience that is not possible by any other means, inspiring players to interact with it in novel expressive manners. In Jam'aa, Haile listens to and interacts with two humans playing Darbukas - Middle Eastern goblet-shaped hand drums. Haile receives audio input from each drum and detects musical aspects such as note onset, pitch, amplitude, beat, rhythmic stability, and rhythmic density. Based on these detected features, Haile utilizes six interaction modes that are designed to address the unique improvisatory aesthetics of the Middle Eastern percussion ensemble. Haile responds physically by operating its mechanical arms, adjusting the sound of its hits in two manners: pitch and timbre variety are achieved by striking the drumhead in different locations, while volume variety is achieved by hitting harder or softer. After a short introduction and a discussion of related work, we present an overview of the Middle Eastern percussion tradition. This leads to a presentation of Haile's mechanical, perceptual, and interaction modules and their implementation in the composition Jam'aa. We conclude with information about Jam'aa's premiere and the future work we plan to conduct.

1 Introduction

Most computer-supported interactive music systems are hampered by their inanimate nature, which does not provide players and audiences with the physical and visual cues that are essential for creating expressive and intuitive musical interactions. Such systems are also limited by the electronic reproduction and amplification of sound through speakers, which cannot fully capture the richness of acoustic sound. Our approach to addressing these limitations is to utilize a mechanical apparatus that converts digital musical instructions into acoustic and physical sound generation. We believe that such a musical robot can bring together the unique advantages of computational power with the expression and richness of creating acoustic sound through physical and visual gestures. We chose the Middle Eastern percussion genre as a preliminary test bed for such novel human-machine interactions due to its rhythmic, improvisatory, and collaborative nature. These elements fit with our plan to address rhythmic human-machine collaboration before expanding our research to melodic and harmonic interactions. Our robotic prototype, named Haile, can listen to and analyze traditional Middle Eastern rhythms in real time. Based on machine listening, algorithmic variations, and versatile mechanical actions, Haile can then respond in a rich and acoustic manner, in an attempt to inspire human players to interact with it in novel ways.
We hope that such collaborative musical interactions among humans and machines can lead to the creation of new music that cannot be conceived by traditional means.

2 Related Work

Our work is informed by the field of Machine Musicianship, which utilizes computerized systems to analyze, perform, and compose music based on theoretical foundations in fields such as music theory, computer music, music cognition, and artificial intelligence (Rowe 1992). Several effective approaches for the design of such interactive musical systems have been explored over the years by researchers such as Lewis (2000), Cope (1996), Pachet (2002), and others. We also study current research directions in musical robotics, which often focus on sound production more than on social musical aspects such as listening, analysis, group improvisation, and collaboration. Both "robotic instruments" - mechanically automated devices that can be played by live musicians or triggered by pre-recorded sequences (for example Singer, Feddersen et al. 2004; Dannenberg, Brown et al. 2005) - and "anthropomorphic robots" - humanoid robots that attempt to imitate the actions of human musicians (for example Takanishi and Maeda 1998; Toyota 2004) - function mostly as mechanical apparatuses that follow deterministic rules.

Only a few attempts have been made to develop perceptual robots that are controlled by neural networks (Baginsky 2004) or robotic compositions that are based on listening and response, such as GuitarBotana (Singer, Feddersen et al. 2004). In addition, our work is informed by studies in computational modeling of human perception of rhythm. Low-level cognitive rhythmic modeling addresses aspects such as onset, tempo, and beat detection, using audio sources (Paulus and Klapuri 2002; Foote and Uchihashi 2001) or MIDI (Winkler 2001). Higher-level rhythmic perceptual modeling addresses more subjective percepts such as rhythmic stability, similarity, and tension (Desain and Honing 2002). We implemented some of these concepts in Haile's preliminary perceptual modules, as described in (Weinberg, Driscoll S. et al. 2005).

3 The Darbuka in the Middle Eastern Percussion Tradition

In an effort to design interaction modes that would enrich the collaborative drumming experience of the Middle Eastern percussion ensemble, we investigated this tradition both in the literature and in practice. Middle Eastern percussion ensembles serve as a popular form of entertainment in ceremonies and festivities, both as accompaniment for dancing and as pure instrumental music, where virtuoso percussionists can play together with novices. The Darbuka (a goblet-shaped hand drum also known as the Dumbek) is one of the most popular drums in this tradition, and is often played by drummers with a wide variety of skill levels, in both recreational and professional settings. Other instruments in the Middle Eastern ensemble are the Duff (a wide wooden frame drum also known as the Bendir or Tar) and the Riqq (a hand-held drum surrounded by metal cymbals, similar to the Tambourine). Popular playing techniques for the Darbuka are the open bass tone in the middle of the drum (the 'Dum' sound), the high-pitched rim sound (the 'Ta' and 'Ka' sounds), the 'Pop' sound that is made by playing next to the rim while pressing down on the drumhead with the other hand, and the 'Slap' stroke, which creates a loud damped sound. A more advanced technique involves fast rolls using the fingers of both hands (Göçmez 2001). Darbuka players commonly use a variety of traditional rhythms that originated across the Middle East, from the Maghreb to Persia. Most of these rhythms are based on 4/4 meters (such as the Masmudi, Saidi, Malfuf, Çiftetelli, and Maqsum). Others are based on odd meters, such as the Karsilama (9/8), Sha'bia (12/8), or Darge (6/8). The skill of the players in an ensemble is often judged by how well they manipulate the original rhythm through the use of dynamics, syncopation, accents, and the overlay of other rhythmic variations. Beats may be added or left out, divided, or slightly syncopated, and compatible rhythms can be superimposed or exchanged. Percussive Middle Eastern compositions often include a number of different rhythms that are introduced sequentially, separated by bridges where players play in unison. Most compositions include sections where variation is encouraged and places where players are expected to follow pre-composed and rehearsed routines. These aspects of the genre fit well with our goal of creating a sensitive robotic listener and improviser that can also serve as an accurate executor of precise instructions.
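To ground the algorithmic descriptions in the following sections, the sketch below shows one way such a traditional rhythm could be encoded as a list of stroke events. The representation (onset position, stroke type, velocity) and the specific Maqsum pattern shown are our illustrative assumptions for exposition, not the encoding actually used inside Haile.

```python
# Illustrative encoding of a Darbuka rhythm as stroke events.
# This representation is an assumption for exposition; Haile's internal
# representation (built in Max/MSP) is not specified in the paper.
from dataclasses import dataclass
from typing import List

@dataclass
class Stroke:
    onset: float      # position in the cycle, in beats (4/4 -> 0.0 .. 4.0)
    kind: str         # "dum", "tek" (Ta/Ka), "slap", "pop"
    velocity: float   # 0.0 (silent) .. 1.0 (loudest)

# One common form of the 4/4 Maqsum: Dum Tek - Tek Dum - Tek -
MAQSUM: List[Stroke] = [
    Stroke(0.0, "dum", 0.9), Stroke(0.5, "tek", 0.7),
    Stroke(1.5, "tek", 0.6), Stroke(2.0, "dum", 0.9),
    Stroke(3.0, "tek", 0.7),
]
```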
4 The Haile System

4.1 Mechanics

In order to support familiar and expressive interactions with human players, Haile's design is anthropomorphic, utilizing two percussive arms that can move to different locations on the drum and strike with varying velocities. The first one-armed prototype (Weinberg, Driscoll S. et al. 2005) incorporated the USB-based Teleo System (MakingThings 2005) as the main interface between Max/MSP and Haile's sensors and motors. The new design utilizes low-latency communication and control circuitry, consisting of two Microchip PIC microprocessors that are responsible for low-level operations and an Ethernet interface board that receives higher-level UDP communications from the central computer. The new system therefore allows for faster and more complex two-arm operations, which better capture Middle Eastern Darbuka rhythms and techniques. Both of Haile's arms can adjust the sound of their hits in two manners: different pitches are achieved by striking the drumhead in different locations, while volume is adjusted by hitting with varying velocities. The right arm's solenoid operation allows for fast hits (up to 20 Hz) but cannot provide a wide dynamic range or large visual cues. The left arm utilizes a linear motor that enables full control of height in addition to distance from the rim, at a speed of up to 12 Hz. This arm is designed to produce larger and more visible motions, and can provide a wider dynamic range and more intuitive visual cues. The PIC microprocessor that is responsible for the left arm's operation is capable of controlling position and velocity, executing trajectories, and pressing on the skin with precise amounts of force. Feedback regarding the arm's radial movement is provided by a potentiometer, while an encoder provides the system with feedback about the arm's vertical movement (see Figure 1).

Figure 1. Haile's left arm mechanism
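The paper does not specify the UDP message format used between the central computer and Haile's Ethernet interface board, so the following sketch only illustrates the overall control path: a host process sends high-level strike commands (arm, radial position, strike velocity) as UDP datagrams, leaving low-level timing and trajectory control to the PIC microprocessors. The address, packet layout, and field names are hypothetical.

```python
# Hypothetical sketch of the host side of Haile's control path:
# high-level strike commands go out over UDP; the PICs handle low-level control.
import socket
import struct

HAILE_ADDR = ("192.168.1.50", 9000)   # assumed address/port of the Ethernet board

def send_strike(sock: socket.socket, arm: int, position: float, velocity: float) -> None:
    """Send one strike command.

    arm      -- 0 = right (solenoid, fast), 1 = left (linear motor, wide dynamics)
    position -- normalized distance from rim (0.0) to drum center (1.0)
    velocity -- normalized strike velocity, 0.0 .. 1.0
    """
    # Hypothetical packet layout: one byte for the arm, two floats for
    # position and velocity. The real firmware protocol is not documented here.
    packet = struct.pack("!Bff", arm, position, velocity)
    sock.sendto(packet, HAILE_ADDR)

if __name__ == "__main__":
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_strike(sock, arm=1, position=0.8, velocity=0.6)  # low "Dum"-like hit near the center
    send_strike(sock, arm=0, position=0.1, velocity=0.4)  # soft hit near the rim
```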

4.2 Perception

Haile is designed to listen to audio input via directional microphones installed on each Darbuka. Its low-level perceptual analysis algorithms address aspects such as note onset detection, pitch detection, and amplitude detection. For detecting hit onsets we use the Max/MSP object bonk~, which detects sharp changes in amplitude and spectral composition. Pitch information is obtained by analyzing the amplitudes of spectral peaks every time a hit is detected. Through this mechanism we can distinguish different Darbuka strokes such as the low-pitched "Dum", the high-pitched "Ta" and "Ka", the loud "Slap", and the pitch-variant "Pop". Other relatively low-level perceptual modules that we implemented in Haile are simple tempo and beat detection, as well as density detection, in which we compute the number of note onsets per time unit. Haile then utilizes the output from these modules to transform and modify its drumming in an improvisatory manner based on the dynamic input from the human players.

4.3 Interaction

Our main goal in designing the user interaction in Jam'aa is to preserve the important traditional elements of the Middle Eastern percussion genre while introducing a new computational approach that can inspire human players to collaborate with Haile, and with each other, in novel manners. We do not attempt to develop robotic actions that imitate or compete with human virtuosity; rather, we use the perceptual, algorithmic nature of the interaction to create novel musical experiences that may lead to novel musical outcomes. Our main challenge, therefore, is to combine our perceptual modules with Haile's mechanical abilities in a manner that leads to inspiring human-machine collaborations. The approach we use for addressing this challenge is based on our model for interdependent group interaction in interconnected musical networks (Weinberg 2005). At the core of this model is a categorization of collaborative musical interactions between artificial and human players based on sequential and synchronous operations with centralized and decentralized control schemes. Informed by this theory, we developed six different modes of interaction for Haile: Imitation, Stochastic Transformation, Algorithmic Morphing, Beat Detection, Synchronized Sequencing, and Perceptual Accompaniment. Each interaction mode utilizes one or more of our perceptual modules and can be embedded, in combination with other interaction modes, in interactive compositions and educational activities. The first three modes are sequential in nature. In the first mode, Imitation, Haile merely repeats what it hears, based on its low-level onset, pitch, and amplitude perception modules. Players play a rhythm, and after two seconds of silence Haile imitates their input in a call-and-response manner. The robot adjusts pitch and timbre by playing closer to the rim (higher pitch) or closer to the center (lower pitch), and adjusts the loudness of its hits by controlling its strikers' velocity. In the second mode, Stochastic Transformation, Haile improvises based on the human players' input. Informed by the manner in which humans transform rhythms in the Middle Eastern percussion genre, Haile stochastically divides, multiplies, or skips beats in the input rhythm, creating variations that preserve the original feel of the beat.
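As a concrete illustration of the Stochastic Transformation idea, the sketch below keeps, divides, or skips each detected onset with fixed probabilities (omitting the multiplication of beats for brevity). The probabilities, the event representation, and the function name are our assumptions for exposition rather than Haile's actual Max/MSP implementation.

```python
# Illustrative sketch of a stochastic rhythm transformation: each onset is
# kept, divided into two shorter hits, or skipped at random, so the output
# preserves the feel of the input rhythm while varying it.
# Probabilities and representation are assumptions, not Haile's actual values.
import random
from typing import List, Tuple

Onset = Tuple[float, float]  # (time in beats, velocity 0..1)

def stochastic_transform(rhythm: List[Onset],
                         p_divide: float = 0.2,
                         p_skip: float = 0.15) -> List[Onset]:
    out: List[Onset] = []
    for i, (t, vel) in enumerate(rhythm):
        # duration until the next onset (or to the end of an assumed 4-beat cycle)
        nxt = rhythm[i + 1][0] if i + 1 < len(rhythm) else 4.0
        r = random.random()
        if r < p_skip:
            continue                                       # skip this beat
        elif r < p_skip + p_divide:
            out.append((t, vel))                           # divide: original hit ...
            out.append((t + (nxt - t) / 2, vel * 0.8))     # ... plus an extra hit halfway
        else:
            out.append((t, vel))                           # keep the beat unchanged
    return out

# Example: vary a simple Dum-Tek pattern
print(stochastic_transform([(0.0, 0.9), (0.5, 0.7), (1.5, 0.6), (2.0, 0.9), (3.0, 0.7)]))
```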
The third sequential mode is Algorithmic Morphing. Here, Haile records rhythms from different players, transforms and morphs them together through cross-segmentation of beats and measures, and introduces them later in the piece. We also implement a simple sequential Beat Detection mode, often used in bridges between sections. Here, a human player introduces a new tempo and beat, and Haile computes the new values by averaging delta times and testing for strong beat periodicity. It then joins the human player by adjusting its recorded rhythms to the new tempo and beat. In addition to these four sequential modes, we also developed and implemented two synchronous interaction modes - Synchronized Sequencing and Perceptual Accompaniment - in which humans play simultaneously with Haile. In Synchronized Sequencing mode, Haile simply plays pre-recorded MIDI files, allowing humans to play with it in unison or in accompaniment. This synchronous, centralized mode allows composers to feature their structured compositions in a manner that is not susceptible to algorithmic transformation or significant user input. A more advanced interaction mode is Perceptual Accompaniment, in which Haile plays simultaneously with the human players while listening to and analyzing their input. It then modifies its drumming in real time based on the perceived density and loudness of the other players. The higher the perceived rhythmic density, the sparser Haile's playing becomes. The loudness response, on the other hand, is based on a direct relationship: Haile increases the velocity of its drumming when the perceived human playing becomes louder, and vice versa. In this mode Haile can also create local call-and-response "conversations" with different players, based on its perceptual analysis. This leads to a multi-player collaboration, as the robot responds directly to specific players based on their individual perceptual coefficients (see examples of some of these interaction modes at http://coa.gatech.edu/~gil/Jam'aaJerusalem.mov).

Figure 2. Jam'aa in Rehearsal
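The Beat Detection mode is described above only as "averaging delta times and testing for strong beat periodicity"; the following sketch shows one plausible reading of that description, estimating a tempo from inter-onset intervals and accepting it only when the intervals are sufficiently regular. The regularity threshold and function name are our assumptions.

```python
# Plausible sketch of the tempo estimation described above: average the
# inter-onset intervals of the human player's hits and accept the estimate
# only if the intervals are regular enough to imply a strong beat.
# The regularity threshold is an assumption, not a documented parameter.
from statistics import mean, pstdev
from typing import List, Optional

def estimate_tempo(onset_times: List[float],
                   max_cv: float = 0.15) -> Optional[float]:
    """Return an estimated tempo in BPM, or None if no stable beat is found."""
    if len(onset_times) < 4:
        return None
    deltas = [b - a for a, b in zip(onset_times, onset_times[1:])]
    avg = mean(deltas)
    cv = pstdev(deltas) / avg          # coefficient of variation of the deltas
    if cv > max_cv:                    # too irregular: no strong periodicity
        return None
    return 60.0 / avg                  # seconds per beat -> beats per minute

# Example: hits roughly every 0.5 s -> about 120 BPM
print(estimate_tempo([0.00, 0.51, 1.00, 1.52, 2.01, 2.50]))
```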

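Similarly, the Perceptual Accompaniment mappings (sparser robotic playing against denser human input, louder playing against louder input) can be summarized in a small sketch. The window length, scaling constants, and clamping below are illustrative assumptions rather than Haile's actual parameters.

```python
# Sketch of the Perceptual Accompaniment mappings: the rhythmic density of the
# human input is mapped inversely to how much Haile plays, while perceived
# loudness is mapped directly to Haile's strike velocity.
# Window length, scaling constants, and clamping are illustrative assumptions.
from typing import List, Tuple

def clamp(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))

def accompaniment_params(onset_times: List[float],
                         amplitudes: List[float],
                         window_s: float = 2.0,
                         max_density: float = 8.0) -> Tuple[float, float]:
    """Return (note_probability, strike_velocity) for Haile's next window."""
    density = len(onset_times) / window_s                    # human onsets per second
    note_probability = clamp(1.0 - density / max_density)    # denser input -> sparser robot
    loudness = sum(amplitudes) / len(amplitudes) if amplitudes else 0.0
    strike_velocity = clamp(loudness)                        # louder input -> louder robot
    return note_probability, strike_velocity

# Example: a busy, loud human passage makes Haile play sparsely but firmly
print(accompaniment_params([0.1, 0.3, 0.5, 0.7, 0.9, 1.2, 1.5],
                           [0.8, 0.7, 0.9, 0.8, 0.85, 0.9, 0.8]))
```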
5 The Composition

The composition Jam'aa ("gathering" in Arabic) builds on the unique communal nature of the Middle Eastern percussion ensemble, attempting to enrich its improvisational character, call-and-response routines, and virtuosic solos with algorithmic transformation and human-robotic interactions. The piece, commissioned by the Hamaabada Art Center and premiered on March 25, 2006 in Jerusalem, is written for two professional players, a robot, and an unspecified number of accompaniment players (who only support the beat and do not interact directly with the robot). Jam'aa begins with the Imitation and Stochastic Transformation modes. Here, the human players take turns introducing free rubato rhythmic motifs. Haile repeats and transforms these motifs in a call-and-response manner. Each player then plays a metrical traditional Middle Eastern rhythm, to which Haile responds with imitation and transformation in turn. In this section Haile also records the different motifs played by the human players and stores them for use later in the piece. In the third section, players take turns listening to Haile's transformations, attempting to repeat the algorithmically modified motifs and to transform them further. As part of this reciprocal interaction, Haile listens to the players' transformations, repeats the rhythm, and generates a new algorithmic transformation that can be picked up by the next player. This interaction allows for real-time variation of familiar rhythms, using stochastic and perceptual transformations that are not likely to be performed by humans. After a short bridge (featuring the Synchronized Sequencing mode, in which humans play a short pre-programmed sequence in unison with Haile), the piece progresses into synchronous interactions. Here, Haile randomly chooses one of the rhythms it recorded earlier and plays it in a loop at a new tempo, while the human players slowly join in to play and improvise together. In this section Haile introduces the Perceptual Accompaniment mode, in which it modifies its drumming in real time based on the perceived density and loudness of the human playing. The section ends with a Synchronized Sequencing bridge and a tempo change. In the following section, Haile introduces a modified motif that was created off-line using the Algorithmic Morphing mode, combining elements from two or more of the motifs that were recorded earlier. Here, too, Haile features the Perceptual Accompaniment mode. After a short unison bridge, the Beat Detection mode is introduced: one of the human players plays in a new tempo and beat, which Haile attempts to detect. The rest of the ensemble then joins in to provide accompaniment, as Haile continues to listen only to the human soloist, forming a metrical call-and-response duo while the other players continue to provide the accompaniment beat. The piece ends with a solo performance in which Haile uses rhythmic materials from the first free section, showcasing its mechanical abilities.

6 Future Work

In future work we plan to improve Haile's mechanical, perceptual, and interaction modules. Mechanically, we intend to broaden Haile's timbral range through experimentation with drumhead damping and new striker designs. Perceptually, we would like to better differentiate between drum strokes, possibly using neural network and machine learning approaches. We also plan to conduct user studies in an effort to improve and extend Haile's interaction modules.
In the long term, we hope to expand into melodic and harmonic human-computer interactions.

References

Baginsky, N. A. 2004. The Three Sirens: A Self-Learning Robotic Rock Band.
Cope, D. 1996. Experiments in Musical Intelligence. Madison, WI: A-R Editions.
Cycling74 2005. Max/MSP.
Dannenberg, R. B., B. Brown, et al. 2005. McBlare: A Robotic Bagpipe Player. In Proceedings of the International Conference on New Interfaces for Musical Expression, Vancouver, Canada.
Desain, P. and H. J. Honing 2002. Rhythmic Stability as Explanation of Category Size. In Proceedings of the International Conference on Music Perception and Cognition, Sydney, Australia.
Foote, J. and S. Uchihashi 2001. The Beat Spectrum: A New Approach to Rhythmic Analysis. In Proceedings of the International Conference on Multimedia and Expo.
Göçmez, B. 2001. Darbuka Method - Advanced Darbuka Technique. Pacific, MO: Mel Bay Publications.
Lewis, G. 2000. "Too Many Notes: Computers, Complexity and Culture in Voyager." Leonardo Music Journal 10: 33-39.
MakingThings 2005. Teleo.
Pachet, F. 2002. The Continuator: Musical Interaction with Style. In Proceedings of the International Computer Music Conference, Göteborg, Sweden.
Paulus, J. and A. Klapuri 2002. Measuring the Similarity of Rhythmic Patterns. In Proceedings of the International Conference on Music Information Retrieval, Paris, France.
Rowe, R. 1992. Interactive Music Systems: Machine Listening and Composing. Cambridge, MA: MIT Press.
Singer, E., J. Feddersen, et al. 2004. LEMUR's Musical Robots. In Proceedings of the International Conference on New Interfaces for Musical Expression, Hamamatsu, Japan.
Takanishi, A. and M. Maeda 1998. Development of the Anthropomorphic Flutist Robot WF-3RIV. In Proceedings of the International Computer Music Conference.
Toyota 2004. Trumpet Robot.
Weinberg, G. 2005. "Interconnected Musical Networks - Toward a Theoretical Framework." Computer Music Journal 29(2): 23-39.
Weinberg, G., S. Driscoll, et al. 2005. Musical Interactions with a Perceptual Robotic Percussionist. In Proceedings of the IEEE International Workshop on Robot and Human Interactive Communication, Nashville, TN.
Winkler, T. 2001. Composing Interactive Music: Techniques and Ideas Using Max. Cambridge, MA: MIT Press.