Interactive Performance with Wireless PDAs

Graham McAllister, Michael Alcorn and Philip Strain
Sonic Arts Research Centre, Queen's University of Belfast
g.mcallister@qub.ac.uk, m.alcorn@qub.ac.uk

Abstract

This paper describes a novel method of providing concert audiences with the ability to interact with a live musical performance. It details a technique which uses wireless PDAs to capture and transmit gestures from individual audience members to live performers on stage. The paper discusses the software and interface design and presents a clear rationale for each component in the architecture. In addition to the technical issues, the paper discusses the creative implications of such technology and concludes with future work in the area.

1 Introduction

One area of interactive music that remains relatively unexplored is that of creating performance environments which enable members of the audience to participate in the music-making process in real-time. Empowering audiences with the ability to shape, influence and change the musical discourse is unusual in most cultures, but finds most relevance where improvisation plays a significant part in performance, e.g. free jazz and experimental music.

Currently, the role of the audience in interactive environments is typically concerned with the physical movement of all audience members (Maynes-Aminzade, Pausch, and Seitz 2002; Winkler 2002). Such systems are mainly used to make binary 'on/off' decisions and therefore cannot capture fine-grained gestures. The interactive relationship between live performers and computer-based systems has also been investigated by Jehan, Machover, and Fabio (2002), albeit from the reverse perspective of acoustic instruments providing input to computer systems. Research has also been completed in group collaboration using wireless (Bryan-Kinns, Healey, and Thirlwell 2003) and Internet-based (Bargar et al. 1998) communications; however, these approaches are not aimed at audience involvement. One project which does involve both music performance and the audience is Dialtones (A Telesymphony) (Levin 2001), which dials phones in the audience at specific times to create a musical performance. In that context, however, the audience does not directly control the music being generated.

This paper details a system which allows detailed gestural control data to be transmitted from individual audience members to live performers. A live demonstration of the system was performed in November 2003 for BBC Television and is due to be broadcast in March 2004.

2 Rationale of System

In most circumstances, audience feedback takes the form of applause at key moments (for example, the end of a work or after solos in jazz performance), or feedback in less tangible ways, where musicians often have a sense of how well an audience is engaging with the performance through subtle cues in the audience's body language. In the case of this experiment, the link is more direct and immediate: the audience member has the ability to suggest, guide and dictate the performance via real-time graphic communication with the performer. The procedure explores the basic correlation between graphic gestures and their possible interpretation as musical gestures and events. It also borrows ideas from the graphic notation schemes of the 1950s and 60s (Penderecki 1959).
For both the audience member and the performer, the screen is a blank canvas which both parties must explore together. A complex feedback loop of graphic gestures and sound gestures builds up over the duration of the performance. Gesture, tempo, tessitura, density of events, and performance techniques are all features which can be relayed in the two-dimensional graphic space.

3 System Design

Before finalizing the design of the system architecture and hardware platforms, several key constraints were imposed on the device used by the audience participants.

Firstly, the device must be small and portable enough to be used where audience members are currently seated. Secondly, the device must be capable of eliciting gestural data via an intuitive interface, i.e. the user should not require special training in order to use it. Finally, the device must be capable of either Bluetooth or Wi-Fi wireless communications; infra-red was not considered suitable as it requires an unobstructed line-of-sight between the device and the receiver. Once gathered from the audience participant, the data is transmitted (ultimately) to a display where it can be visualized and interpreted by a live performer. The following sections detail how this was designed and implemented.

3.1 Architecture

Based on the constraints for the gestural device, it was decided that HP iPAQ 5450 PDAs would be most appropriate. This PDA has a large 16-bit colour display which is capable of touch (but not pressure) control, and has both Bluetooth and Wi-Fi 802.11b wireless communications built in. iPAQs have also been used in interactive environments by researchers working in the education sector (Fulp and Fulp 2002).

In order to achieve the conceptual link between the audience and the live performers, Wi-Fi 802.11b was considered more appropriate than Bluetooth for several reasons. Firstly, Wi-Fi uses the standard TCP/IP protocol for data transmission, therefore requiring no special software development kits. Secondly, Wi-Fi has greater bandwidth available for data transmission than Bluetooth (802.11b allows for transmission rates of up to 11 Mbps). This allows more complex data to be transmitted from the iPAQ to the performer's display. It was also decided that the number of simultaneous audience participants should be extensible; a minimum of 4 was required for this particular performance.

Apple iMacs with 17" displays were used to provide visualization of the gestural data. These not only provided a large display area for the visualization, but also allowed the display position to be easily moved thanks to the pivoting arm. This latter feature was essential as different performers have different viewing constraints (drummers, saxophonists, etc.).

To provide the communication between the end-user interfaces, a server was located between these two components. The main function of the server is to accept the data streams from the iPAQ PDAs, process the commands, and forward the resulting data in an understandable format to the associated performer's iMac display. The system architecture can be seen in Figure 1.

Figure 1: System architecture components (performer displays, wireless Airport access point, and audience gestural devices).

3.2 Software

With the tasks of hardware selection and architecture complete, the more complex matter of software options was evaluated. This was a non-trivial task, as software development options had to be chosen for (1) the iPAQ, (2) the server and (3) the performer display, together with the design and implementation of a data structure to enable effective communications between all three layers. The following sections describe the software design at each tier of the architecture.

iPAQ. The key requirements for the iPAQ development language were that it should be capable of providing TCP/IP communications and, more importantly, be able to display a novel GUI. Examining the software development options available for the iPAQ, the main options are limited to eVC++, eVB or Java.
On further examination it was decided that Macromedia Flash MX could provide the TCP/IP communications as well as offer more versatile user interface design options. As this software was considered a preliminary prototype of an ongoing extensible project, it was deemed appropriate to keep the transmitted data to a minimum. This allowed the overall architecture to be evaluated while ensuring minimal latency. The gestural information that the user could enter therefore took the form of a freeform sketch pad, with diminishing concentric circles indicating the direction of motion of the cursor (see Figure 2). Upon connection to the server, the iPAQ client is given a unique ID, which is used to associate a particular iPAQ with a specific iMac display. Data is transferred from iPAQ to server upon a 'cursor down' event (touching the screen), in the following XML format:

    <event type="0"><position shapeID="shape0" x="xn" y="yn"/></event>

where xn and yn are integers representing the current spatial cursor position.
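To make the wire protocol concrete, the sketch below shows how a client could open the TCP connection and emit one such event per touch sample. It is illustrative only: the concert client was written in Flash MX, and the Java class and method names (GestureClient, sendPosition), server address and port number are our own assumptions rather than details of the original implementation.

    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.Socket;

    // Illustrative sketch only: the concert client was written in Flash MX.
    // Names, address and port are hypothetical.
    public class GestureClient {
        private final Writer out;

        public GestureClient(String serverHost, int port) throws Exception {
            // One persistent TCP connection per PDA; the server
            // associates an ID with the client on connect.
            Socket socket = new Socket(serverHost, port);
            out = new OutputStreamWriter(socket.getOutputStream(), "UTF-8");
        }

        // Called for each 'cursor down' sample: wraps the (x, y) screen
        // position in the XML event format described above and sends it.
        public void sendPosition(String shapeId, int x, int y) throws Exception {
            String msg = "<event type=\"0\"><position shapeID=\"" + shapeId
                    + "\" x=\"" + x + "\" y=\"" + y + "\"/></event>";
            out.write(msg);
            out.flush(); // flush per sample to keep gesture latency low
        }

        public static void main(String[] args) throws Exception {
            GestureClient client = new GestureClient("192.168.0.10", 9000);
            client.sendPosition("shape0", 120, 96); // one sample on a 320x240 screen
        }
    }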

Server. The key requirement for the server software was that it should be platform independent, making Java or Flash the evident options. Suitable software was available in the form of Fortress (Fortress 2004) and Flosc (Flosc 2004); however, it was decided that in order to have control at the source-code level, a custom-developed Java server would be required. The primary function of the server is to manage the socket connections between the iPAQs and each corresponding iMac. This is achieved by storing the IP addresses of both the connecting iPAQ and iMac, and ensuring that the data stream coming in on the iPAQ socket is relayed to the iMac socket. The existence of the central server also allows for future extensibility: the data stream can be processed as it is transmitted from audience to performer, i.e. essentially modified along its transmission path.
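The following minimal Java sketch illustrates this relay pattern, assuming one thread per iPAQ connection and a pre-configured pairing table. The port numbers, class names and pairing policy (and the server dialling out to the iMac) are illustrative assumptions, not details of the actual server.

    import java.io.InputStream;
    import java.io.OutputStream;
    import java.net.ServerSocket;
    import java.net.Socket;
    import java.util.HashMap;
    import java.util.Map;

    // Minimal sketch of the relay idea; pairing policy and ports are assumptions.
    public class RelayServer {
        public static void main(String[] args) throws Exception {
            // Hypothetical static pairing: iPAQ IP address -> performer's iMac IP.
            Map<String, String> pairing = new HashMap<>();
            pairing.put("192.168.0.21", "192.168.0.31");

            try (ServerSocket listener = new ServerSocket(9000)) {
                while (true) {
                    // Assumes every connecting device is in the pairing table.
                    Socket ipaq = listener.accept();
                    String imacHost = pairing.get(ipaq.getInetAddress().getHostAddress());
                    Socket imac = new Socket(imacHost, 9001);
                    // One thread per pair: copy the iPAQ's XML stream to its iMac.
                    new Thread(() -> relay(ipaq, imac)).start();
                }
            }
        }

        static void relay(Socket from, Socket to) {
            try (InputStream in = from.getInputStream();
                 OutputStream out = to.getOutputStream()) {
                byte[] buf = new byte[1024];
                int n;
                while ((n = in.read(buf)) != -1) {
                    // Pass-through today; this is the point where the stream
                    // could be transformed in future versions of the system.
                    out.write(buf, 0, n);
                    out.flush();
                }
            } catch (Exception e) {
                // A dropped connection ends this pair's relay.
            }
        }
    }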
Performer Display. The primary requirement for the performer's display software was that it should mirror the gestures created by the associated audience member; hence, Macromedia Flash MX was again used for the visualization. In order to display the gestures, the software receives the XML string from the server (which in turn was acquired from the iPAQ) and renders the original gesture on the iMac display. The fundamental difference between the transmitted and the received gesture is the spatial resolution: the iPAQ has a screen resolution of 320x240, the iMac 1440x900. In order to ensure accurate scaling of the gestures, the software on the iMac must apply a scaling transform to the received spatial co-ordinates and also interpolate between the received cursor positions to smooth the gestural animation. Figure 2 shows the received gestural visualization and a possible musical interpretation (inset).

Figure 2: Gestural visualization.
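A minimal sketch of this display-side transform is given below. The 320x240 and 1440x900 resolutions are those quoted above; the choice of linear interpolation and its step count are assumptions, as the paper does not specify the smoothing method, and the actual renderer was written in Flash MX rather than Java.

    import java.awt.geom.Point2D;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the display-side coordinate handling; resolutions are
    // those quoted in the paper, the smoothing method is an assumption.
    public class GestureScaler {
        static final double SCALE_X = 1440.0 / 320.0; // iPAQ width  -> iMac width
        static final double SCALE_Y = 900.0 / 240.0;  // iPAQ height -> iMac height

        // Map a received iPAQ cursor position into iMac display coordinates.
        static Point2D.Double scale(int x, int y) {
            return new Point2D.Double(x * SCALE_X, y * SCALE_Y);
        }

        // Linearly interpolate between successive scaled positions so the
        // animated trace looks smooth despite the coarse input resolution.
        // The number of in-between points (steps) is an illustrative choice.
        static List<Point2D.Double> interpolate(Point2D.Double a, Point2D.Double b,
                                                int steps) {
            List<Point2D.Double> points = new ArrayList<>();
            for (int i = 1; i <= steps; i++) {
                double t = (double) i / steps;
                points.add(new Point2D.Double(a.x + t * (b.x - a.x),
                                              a.y + t * (b.y - a.y)));
            }
            return points;
        }
    }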

4 In Performance

Before the concert, the audience members were given an overview of the use of the iPAQ, essentially being told that their gestures would be mirrored to a specific performer's iMac display. In use, the audience members had no difficulty understanding or using the graphical user interface; however, the performers reported that the visualization would only remain on the screen for several seconds. The visualization was initially designed to display the gestured data for 5 seconds before fading, allowing for a dynamic relationship. The performers, however, would have preferred a longer trace from the visualization, giving them the potential to create an overall form for their musical interpretation.

The system was tested on an improvisation ensemble comprising the following instruments: drums, string bass, guitar and live electronics, and bass clarinet and live electronics. These four players were very experienced performers with considerable skill in improvisation and in extended instrumental techniques. The audience performers were chosen at random, and none of them had prior experience of improvising with experimental musicians.

Each of the performances began with the audience performers sketching on the PDAs. Within a short time, some correlation emerged between the graphic gestures, their physical location on the PDA screen, and the sounds being performed by the musicians. This process, a closed loop between the musician and the audience performer, became a training procedure whereby each party started to understand the meaning of both visual and sound gestures. From this, a two-way vocabulary emerged which enabled both parties to communicate gesture, tempo, density, pitch and duration. Based on the feedback received, the system is currently being re-designed to allow for both an overall structure and dynamic gestures (see Further Work). Users also found the system to be responsive, reporting that they received an instantaneous musical response from their associated performer as a result of gesturing (the measured latency was around 2 ms).

Interestingly, all PDA performers noticed that the two-way communication with their musician gave way to a more significant performance experience: they were jamming with other people in the audience via the performers on stage. This triangulation of the performance experience was one of the most fascinating and musically unique outcomes of the whole experiment. In doing so, it has opened up methods for enabling audience involvement in performance and for helping to shape the shared musical experience. Obviously this will not work in all musical environments, but it certainly opens up new possibilities in improvisation and in open-ended, installation-type performances.

5 Conclusions and Further Work

The system detailed in this paper was devised as a concept prototype; however, further work is currently underway to extend both the initial system and the user experience offered. In particular, the expanded system adds the ability to view a sketch of the overall performance (indicated by an envelope gestured by the audience user) in addition to the dynamic gestures that are transmitted and interpreted in real-time. This feature indicates the duration of the performance and allows all involved to structure their future gestures or musical interpretations. Figure 3 shows the preliminary user interface for the extended system.

Figure 3: Future concept prototype.

The upper half of the screenshot in Figure 3 is constantly in motion (right to left): the user drops symbols onto the page and they are scrolled towards the vertical dotted line along the left-hand edge. Upon reaching the vertical line, the performer will interpret the symbol according to varying criteria such as the symbol type, its position on the screen and how fast it was moving. The performer will also need to consider the current status of the envelope indicator (located in the lower half of Figure 3); for example, the performer may interpret the envelope as volume or pitch.
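One possible realization of this scrolling timeline is sketched below. It is a speculative illustration of the concept in Figure 3: the symbol representation, scroll rate and trigger behaviour are our own assumptions, not parameters of the redesigned system.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Hypothetical sketch of the scrolling symbol timeline from Figure 3.
    // Symbol types, scroll speed and trigger threshold are assumptions.
    public class SymbolTimeline {
        static class Symbol {
            final String type; // e.g. a glyph chosen by the audience member
            double x;          // horizontal position on the scrolling upper half
            final double y;    // vertical placement, another interpretation cue

            Symbol(String type, double x, double y) {
                this.type = type; this.x = x; this.y = y;
            }
        }

        private final List<Symbol> symbols = new ArrayList<>();
        private final double triggerX = 0.0; // the dotted line at the left edge

        void drop(String type, double x, double y) {
            symbols.add(new Symbol(type, x, y)); // audience member drops a symbol
        }

        // Called once per animation frame: scroll right-to-left, and hand any
        // symbol reaching the line to the performer's display for interpretation.
        void tick(double pixelsPerFrame) {
            Iterator<Symbol> it = symbols.iterator();
            while (it.hasNext()) {
                Symbol s = it.next();
                s.x -= pixelsPerFrame;
                if (s.x <= triggerX) {
                    System.out.println("interpret: " + s.type + " at y=" + s.y);
                    it.remove();
                }
            }
        }
    }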
There is also scope to use this system in the electronic music domain, whereby any member of the audience could contribute control and gestural information to a performance created through synthesis and signal processing. This unique experiment has enabled new modes of audience interaction with music performance. To further reinforce this experience, we plan to use a data projector to present the graphic and gestural information to audience members.

References

Bargar, R., S. Church, A. Fukuda, and J. Grunke (1998). Networking audio and music using Internet2 and next-generation Internet capabilities. AES Technology Report TCNAS 98/1. AES.

Bryan-Kinns, N., P. G. T. Healey, and M. Thirlwell (2003). Graphical representations for group music improvisation. In Second International Workshop on Interactive Graphical Communication. IGC.

Flosc (2004). Flash Open Sound Control. www.benchun.net/flosc/.

Fortress (2004). Fortress interactive entertainment platform. www.xadra.com.

Fulp, C. D. and E. W. Fulp (2002). A wireless handheld system for interactive multimedia-enhanced instruction. In Proc. IEEE International Conference on Multimodal Interfaces. IEEE.

Jehan, T., T. Machover, and M. Fabio (2002). Sparkler: An audio-driven interactive live computer performance for symphony orchestra. In 32nd ASEE/IEEE Frontiers in Education Conference, pp. 1-6. IEEE.

Levin, G. (2001). Dialtones (A Telesymphony). http://www.flong.com/telesymphony.

Maynes-Aminzade, D., R. Pausch, and S. Seitz (2002). Techniques for interactive audience participation. In Proc. IEEE International Conference on Multimodal Interfaces, pp. 1-6. IEEE.

Penderecki, K. (1959). Threnody (To the Victims of Hiroshima). London: Chester Music and Novello and Co.

Winkler, T. (2002). Audience participation and response in movement-sensing installations. In International Computer Music Conference. ICMC.