SOFTWARE AGENTS AND CREATING MUSIC/SOUND ART: FRAMES, DIRECTIONS, AND WHERE TO FROM HERE?

Ian Whalley
University of Waikato
Department of Music

ABSTRACT

Practitioners have begun to use software agents to compose music and sound art in linear and non-linear idioms: 'non-linear' being defined here as demonstrated when a composer does not set the form/structure and/or content of a work, so there are varying realisations of it. Agent technology is first introduced, then a conceptual framework for its use in composition is given. Recent creative work using the technology is summarized, and possible directions for continuing research are put forward. The conclusion suggests a hybrid approach based on non-linear generative improvisation, coupled with a conversational model of human/computer interaction, taking into account music/sound as an affective language.

1. INTRODUCTION

Early adoption of technological innovation is a part of new music research: the emergence of genetic algorithms and fuzzy logic, for example, has led practitioners to experiment and see where these tools might fit to expand techniques of artistic expression. As part of this process, trial and error often leads to dead ends until a technically efficient and aesthetically satisfying trend emerges. A downside of the process is that technologists, theorists and artists often look to see where their area of knowledge might be applied, rather than seek a solution that may only come from multiple perspectives. Moreover, composers occasionally get lost in learning technology at the expense of artistic output, and technologists often build systems that are intellectually absorbing but of limited aesthetic value.

Software-based agent technology presents a new approach to modelling non-linear events: for example, where structure is created by the dynamic interplay of the behaviour of participants, rather than being predetermined by scores or set forms. Composers have begun to explore agent technology as part of an interest in 'interactive' systems, along with complementary techniques from evolutionary art [2] and A-Life [14].

A part of the new technology discovery process is the opportunity to reflect on the balance between technical input and creative outputs, and to raise issues that might be useful in developing further work: to ask 'where are we up to and where are we going?'. For example, see Paine [17] on interactivity and Weinberg [21] on internet-based systems. Hopefully, these brief steps aside provide the opportunity to synthesize various perspectives, or at least provide a structure for healthy debate as part of the discovery process.

2. SOFTWARE AGENTS

An agent is somebody or something that acts on behalf of another. Software agents may exhibit varying degrees of persistence, independence, communication and collaboration with other software agents or people. 'Intelligent' agents might include decision-making capabilities, the capacity to learn in an environment, and mobility over networks. More 'intelligent' agents monitor environments, glean information, make decisions to react or not, and modify their behaviour according to the results received [1] [24].

Various IDEs for developing agent systems are available, including IBM's Aglets, Grasshopper, Jade and Zeus: see Dang & Nguyen [6] and Detlor & Serenko [7]. The technology is widely applied academically in a number of disciplines (see the annual AAMAS conference) and is increasingly being used in real-time industrial applications. Useful introductory texts include Bigus & Bigus [1] and Weiss [22]. Three main approaches to applying the technology are represented in Multi-Agent Systems (MAS), Distributed Artificial Intelligence (DAI), and Multi-Agent Based Simulations (MABS), although there is debate about the extent to which MABS use agents [9].
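To ground the terminology, the following minimal sketch (written in Java, in the spirit of the toolkits above but taken from none of them; all class names are hypothetical) models an agent as a monitor-decide-act-adapt loop:

```java
// Minimal, hypothetical sketch of a software agent as a
// perceive-decide-act loop. Not drawn from Aglets, Grasshopper,
// Jade or Zeus; names are illustrative only.
import java.util.Random;

interface Environment {
    double sense();          // glean information from the world
    void act(double value);  // modify the world
}

class ReactiveAgent {
    private double threshold;          // internal goal: react above this level
    private final Random random = new Random();

    ReactiveAgent(double threshold) { this.threshold = threshold; }

    /** One cycle: monitor, decide whether to react, adapt behaviour. */
    void step(Environment env) {
        double observation = env.sense();
        if (observation > threshold) {       // decide to react or not
            env.act(observation * random.nextDouble());
            threshold += 0.05;               // adapt: become less sensitive
        } else {
            threshold -= 0.01;               // adapt: become more sensitive
        }
    }
}

public class AgentDemo {
    public static void main(String[] args) {
        Environment env = new Environment() {
            private double state = 0.5;
            public double sense() { return state; }
            public void act(double v) { state = v; System.out.println("state=" + state); }
        };
        ReactiveAgent agent = new ReactiveAgent(0.3);
        for (int i = 0; i < 10; i++) agent.step(env);
    }
}
```

The toolkits cited add communication, persistence and mobility layers on top of this basic cycle; the loop itself is the common core.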
3. FRAME

Four approaches to using software agents in music/sound art seem apparent, based on the technical/artistic use of similar techniques.

Agent technology is well suited to simulating linear acoustic tonal music creation/realisation, because it deals with real-time data and multi-causal situations where the parts ('players') constantly adapt to each other in the process of making a structural whole. Although a considerable technical challenge, there are obvious commercial and educative applications. The linear principle/approach could also be used in part to make acousmatic music or sound art [26], where the structure and/or form might be predetermined but the content decided by the agents. A simulation approach, then, involves an agent-based system replicating known styles without human interaction.

In contrast, a generative approach [8] [14], applied to either pitch/duration or sound art composition, allows agent technology to create content and even structure based on the dynamic interplay of parts. This is a non-linear perspective in which the composer creates a range of agent behaviours rather than predetermining form, and many possible outcomes and durations of works may result. Process then becomes the artistic focus, rather than a set product.
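By way of a concrete, hypothetical sketch of the generative principle (modelling no particular system): each agent below chooses its next pitch partly in response to the ensemble's current state, so the overall shape emerges from the interplay of parts rather than from a predetermined score, and each run produces a different realisation.

```java
// Generative sketch: global shape emerges from agents adapting to one
// another, not from a predetermined form. Names are illustrative only.
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class PitchAgent {
    private int pitch;                      // current MIDI-style pitch
    private final Random random = new Random();

    PitchAgent(int start) { this.pitch = start; }

    /** Move a step toward the ensemble's mean pitch, with some noise. */
    int respond(double ensembleMean) {
        int drift = (int) Math.signum(ensembleMean - pitch);
        pitch += drift + random.nextInt(5) - 2;  // adapt plus variation
        pitch = Math.max(36, Math.min(84, pitch));
        return pitch;
    }
}

public class GenerativeDemo {
    public static void main(String[] args) {
        List<PitchAgent> agents = new ArrayList<>();
        Random random = new Random();
        for (int i = 0; i < 4; i++) agents.add(new PitchAgent(48 + random.nextInt(24)));

        double mean = 60;
        for (int step = 0; step < 16; step++) {  // no set duration is implied
            int sum = 0;
            StringBuilder line = new StringBuilder("step " + step + ":");
            for (PitchAgent a : agents) {
                int p = a.respond(mean);
                sum += p;
                line.append(' ').append(p);
            }
            mean = sum / (double) agents.size();
            System.out.println(line);
        }
    }
}
```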

A performance/reactive approach is represented in the application of MAX/MSP to live performance [18] [25], in MAX/MSP's deployment in many human/machine interactive installations, and in other software-based systems used to create various types of net-based music with human input [21]. The term 'reactive' is used here in deference to Paine's [17] concept of 'interaction' (see below).

A generative/improvisation approach (see Brown [2] on evolutionary systems) allows improvised human input, but rather than the machine agent simply reacting, both human and machine adapt to the generative output of the machine and to the input of human agency. If the adaptive relationship between human and machine demonstrates balanced listening/dialogue, and little of the final output is prescribed, a conversational model [17] of interaction is possible. This contrasts with current performance models of machine/human interaction, where the machine largely reacts to human input and cognition/listening involves human rather than machine agency, or a balance of the two.

The approach a composer adopts to creating agent-based works is likely to depend on their artistic view of music/sound art, the purposes (audience) for which the output is intended, and the accumulated knowledge, techniques and conventions of one's artistic or technical training. Consciously or unconsciously, practitioners will make decisions based on a matrix, selecting from continua between:

- the language of pitch/duration music, or sound art;
- the affective outcomes of sound/music, or treating sound/music as having little or no affective dimension;
- a view of the relationship between people and machine;
- a linear predetermined structure with prescribed behaviour, or a non-linear system that evolves a structure out of the interplay of parts; and
- starting from tools and looking for an application, or beginning from a philosophical/artistic perspective and seeking a technological solution to implement it.

By way of illustration, a composer taking a non-linear view of the world and sound art, rather than an instrumental view of music with linear structures, would produce a very different outcome from that of a technologist focused on using multi-agent technology to replicate tonal music structures. While each perspective has its place and 'audience', the hope is that technical and artistic perspectives might occasionally meet to produce both good art and good computer science.

4. DIRECTIONS

Considering the range and amount of academic work in multi-agent systems, it is surprising how little attention the tools have attracted from composers or computer scientists interested in music; what is not surprising is the divergence of approaches. This brief survey looks at the technical/aesthetic push and balance in emerging work.

A simulation approach to tonal music, a common area of MAS research for testing out scenarios in software, is found in Nakayama, Wulfhorst & Vicari's [16] 'A Multi-Agent Approach for Musical Interactive System'. This describes a community of agents that 'interact through musical events (MIDI), simulating the behaviours of a musical group'. The outcome is an intelligent accompaniment system, where agents 'listen' and interact with each other. Here, agents are ascribed a basic level of knowledge to play their instruments synchronously and satisfy their internal goals. The resulting Virtual Musician Multi-Agent System is analogous to instructing a beginning jazz band to accompany a singer based on an agreed structure and set of rules, into which individuals make musical choices as they interact with other players. While technically interesting, the work has limitations artistically: first in its narrow MIDI implementation (an expressive limitation), and secondly in that its structure is based on a logical/rule-based approach to music as language (see GTTM and Camurri's [3] classification), in contrast to a view of music as a medium for expression, or as a language that can be expanded as part of the creative process.
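Wulfhorst et al.'s implementation is not reproduced here; the sketch below is only a hypothetical illustration of the 'listening' principle described above, with accompanying agents nudging their beat period toward a soloist's so that the ensemble synchronises.

```java
// Hypothetical sketch of the 'listening' principle: accompanying agents
// adapt their beat period toward the soloist's, a simple model of
// ensemble synchronisation. Not Wulfhorst et al.'s actual implementation.
class ListeningPlayer {
    private double beatMillis;            // current beat period in ms
    private final double adaptRate;       // how strongly it follows others

    ListeningPlayer(double beatMillis, double adaptRate) {
        this.beatMillis = beatMillis;
        this.adaptRate = adaptRate;
    }

    /** 'Listen' to a heard beat period and adapt toward it. */
    void hear(double heardBeatMillis) {
        beatMillis += adaptRate * (heardBeatMillis - beatMillis);
    }

    double beat() { return beatMillis; }
}

public class EnsembleDemo {
    public static void main(String[] args) {
        ListeningPlayer soloist = new ListeningPlayer(500, 0.0);  // leads, never adapts
        ListeningPlayer bass = new ListeningPlayer(520, 0.3);
        ListeningPlayer drums = new ListeningPlayer(480, 0.5);

        for (int bar = 0; bar < 8; bar++) {
            bass.hear(soloist.beat());
            drums.hear(soloist.beat());
            System.out.printf("bar %d: solo=%.1f bass=%.1f drums=%.1f%n",
                    bar, soloist.beat(), bass.beat(), drums.beat());
        }
    }
}
```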
Extending the simulation approach is Gang et al. [10]. Here, a system was developed for generating two-part counterpoint, integrating sub-symbolic (machine learning, states/connections) and symbolic music processing (see Cope [5] on recombinicity). The approach encodes musical knowledge, intuitions and aesthetic taste into different modules, captured by applying rules, fuzzy concepts and learning. Their work illustrates a hybrid approach made up of a connectionist module and an agent-based module. Again, the work is of technical interest, but with all due respect to postmodernism, questions will always remain about the relative aesthetic merits of stylistic variation as good art.

A largely generative systems approach is seen in the A-Life/evolutionary-system-informed 'Khorwa: a musical experience with autonomous agents' by Malt [12]. This installation also provides the opportunity to enter voice recordings. It is programmed in MAX/MSP rather than in a common package for the development of agent systems. The software used for implementation therefore limits agent applications to an extent, and Drogoul [9] might debate whether there are any agents involved. At least, the system represents the integration of both a recent aesthetic and recent technology.

It is difficult to know where the mobile agent systems represented by Kon and Ueda's work [11] might best fit at present, because their platform remains largely aesthetically speculative. Mobile agents, in contrast to the MAS systems that most practitioners implement in music/sound art, are active autonomous objects that can execute computation in a computer network, migrating from one node to another. Kon and Ueda's Andante is a prototype for the 'construction of distributed applications for music composition and performance based on mobile musical agents'. Implemented in Java, two recent applications are Noise Weaver and Maestro. Both are based on agents that generate and play stochastic music in real time, controlled either by a GUI or by script. Kon and Ueda's infrastructure provides a shell for using DAI technology, but its current limitation is one of computational formalism (i.e. aiming to sonify the results of a successful technological system), and it lacks significant artistic uptake. Kon and Ueda's ideas of moving away from using MIDI, and of adapting the system for use by non-musicians, seem well worth pursuing.
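Andante's source is not reproduced here; as a loose illustration of 'agents that generate and play stochastic music in real time', the sketch below plays a bounded random walk through the standard javax.sound.midi synthesizer (assuming a default synthesizer is available) rather than through Kon and Ueda's mobile-agent infrastructure.

```java
// Rough illustration in the spirit of Andante's Noise Weaver/Maestro
// (not their actual code): a stochastic random walk over pitches,
// played in real time via the standard Java synthesizer.
import javax.sound.midi.MidiChannel;
import javax.sound.midi.MidiSystem;
import javax.sound.midi.Synthesizer;
import java.util.Random;

public class StochasticAgent {
    public static void main(String[] args) throws Exception {
        Synthesizer synth = MidiSystem.getSynthesizer();  // assumes one exists
        synth.open();
        MidiChannel channel = synth.getChannels()[0];
        Random random = new Random();

        int pitch = 60;                          // start at middle C
        for (int i = 0; i < 32; i++) {
            pitch += random.nextInt(7) - 3;      // stochastic step
            pitch = Math.max(40, Math.min(80, pitch));
            channel.noteOn(pitch, 70);
            Thread.sleep(250);                   // real-time pacing
            channel.noteOff(pitch);
        }
        synth.close();
    }
}
```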

A performance-based reactive approach includes Spicer, Tan & Tan's [19] 'The Learning Agent Based Interactive Performance System', which applies the idea of an artificial performer to aid in managing a complex system. Using a tonal MIDI/pitch-duration paradigm, musical agents communicate and collaborate to produce compositions, often in conjunction with human input, where the agents make up an ensemble of virtual performers. The approach allows a human performer to 'play the system like an instrument'. The performer can alter target values so that the program will change to arrive at the user's desired state; and the system incorporates a delta-learning rule that allows agents to respond and adapt to user input. The machine agent is based on an expert system model. This system is technically advanced, but the language/structure used (pitch/duration tonal music generation) limits the output to a small range of music/sound art possibilities.

Recently, Spicer [20] has extended this performance approach in 'AALIVENET: An agent based distributed interactive composition environment'. This allows human performers in different locations to interact with each other via agents that they control. The system uses a client/server autonomous agent structure, with client machines generating local variations of music based on updated information from human performers. The server acts as an information hub that passes on performers' intentions to the client machines. The client machines converge on a state that reflects all performers' intentions, but the music made at each site differs in detail. This structure is of technological interest, and the output of educative value in teaching tonal music, but the results are aesthetically restricted both by the musical language used and by the limited application of MIDI/synthesis playback.
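Spicer, Tan & Tan's update rule is not reproduced in a quotable form; as a generic illustration only, a delta rule of the Widrow-Hoff kind adjusts an agent parameter in proportion to the error between the performer's target and the agent's current output, so the system converges on the user's desired state.

```java
// Generic Widrow-Hoff style delta rule, as a stand-in for the adaptation
// Spicer, Tan & Tan describe (their actual update is not published here):
// the agent's output converges on the performer-set target value.
public class DeltaRuleDemo {
    public static void main(String[] args) {
        double weight = 0.1;          // agent's adjustable parameter
        double input = 1.0;           // constant input feature
        double target = 0.8;          // performer-set desired state
        double learningRate = 0.2;    // eta in the delta rule

        for (int step = 0; step < 12; step++) {
            double output = weight * input;
            double error = target - output;
            weight += learningRate * error * input;   // delta rule update
            System.out.printf("step %d: output=%.3f error=%.3f%n", step, output, error);
        }
    }
}
```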
Two recent works have explored the generative/improvisation approach, and lean heavily on a sonic art perspective. Both use improvised human input and an embedded multi-agent system that adapts in an ongoing dialectic, either net-based or as a stand-alone installation.

Chen and Kiss [4] used a multi-agent system as part of an installation in Quorum Sensing. This work metaphorically reconstitutes an ecosystem by means of synthesized sounds and images. It makes use of a multi-agent system that responds to the input of visitors to the site, the source images and sounds containing objects that can be perceived, created, destroyed and modified by the agents.

Whalley also uses an embedded multi-agent system in the PIWeCS project [24]: a Public Interactive Web-based Composition System. This is an internet-based system that increases the sense of dialogue between human and machine agency by integrating intelligent agent programming with MAX/MSP. Human input is through a web interface. The 'conversation' is initiated and continued by participants through arrangements and compositions based on short performed samples, and the system allows the extension of a composition through the electroacoustic manipulation of source material by human or machine. A limitation of the method is its use of predetermined material, rather than letting the machine decide on some of the content (see Weinberg's shaper approach [21]).

Both of these generative/improvisation approaches enhance machine agency, attempting to add a sense of machine cognition to interactive works. However, the degree of autonomy of the agent systems, and the extent of interaction, differ between them. Chen and Kiss [4] allow greater autonomy over agent decision-making regarding content, but their system has no memory to allow an evolving dialogue. Whalley's [24] MAS is built on an expert system, and its memory function allows for the machine accumulation of user patterns, facilitating a conversation that is unique to each session, and a user/machine dialogue that develops by mutual understanding over time (see Paine's [17] conversational model).

5. WHERE TO FROM HERE

In recent research, agent technology has not been applied to the affective output of pitch/duration music's dynamic emotional codes as the basis for developing a composition system (see Whalley [23], or Nakamura et al.'s [15] CAUI system). As a result, a formalist proposition of music as language dominates pitch/duration agent applications. Moreover, a balanced relationship between creative and innovative approaches to the use of pitch/duration language (see Milicevic [13]) remains to be explored.

In the sonic art domain, Paine's 2002 tabula rasa based notion of a conversational model of human/machine interaction, realised through Wishart's [26] perspective, remains aesthetically speculative because it has yet to be demonstrated in artistic work. Besides, we all bring something to a conversation, and usually proceed on the basis of a common means of communication, or have something to talk about. Paine's conversational model therefore requires tempering with some sense of a shared language/meaning, even if forms or structures are not to be predetermined.

I began by outlining a matrix of choices to be made when deciding how agent technology is to be applied to music/sound art. Any decision will of course reflect one's political, intellectual or cultural interests, or the limits of one's knowledge. However, the evolution of music has historically involved an ongoing dialectic between technological innovation, the desire for human expression, and collective cultural adaptation - as the advent of the piano, electric guitar or over-driven amplifier illustrates. Perhaps a sense of this provides clues for future directions.

The most successful approaches to date, artistically and technically, seem to be generative/improvisation ones. These allow an artistic response to recent philosophical shifts in the western world-view, from a deterministic/mechanistic one to a multi-causal, self-organising/organic one. What is lacking in the application of agent technology to music/sound is a means to balance the interests of a conversational model of human/computer interaction with a model of music/sound as a language that communicates affectively; and a common platform for the distribution of works, such as interactive gaming, that may allow ideas to be tested and integrated with wider communities outside the academy or media art circles.
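As one loose illustration of what such an affective layer might involve (a hypothetical mapping, not a method proposed in any of the works cited), a two-dimensional valence/arousal state could be mapped onto musical parameters such as tempo, register and dynamics:

```java
// Loose, hypothetical illustration only: mapping a two-dimensional
// affective state (valence, arousal) onto musical parameters, as one way
// an agent might treat music as an affective language. Values invented.
public class AffectiveMapDemo {
    /** Map valence/arousal in [-1, 1] to tempo, centre pitch and dynamics. */
    static int[] map(double valence, double arousal) {
        int tempo = (int) (90 + 50 * arousal);        // faster when aroused
        int centrePitch = (int) (60 + 12 * valence);  // higher when positive
        int velocity = (int) (64 + 40 * arousal);     // louder when aroused
        return new int[] { tempo, centrePitch, velocity };
    }

    public static void main(String[] args) {
        double[][] states = { { 0.8, 0.7 }, { -0.6, -0.4 }, { -0.7, 0.8 } };
        for (double[] s : states) {
            int[] m = map(s[0], s[1]);
            System.out.printf("valence=%.1f arousal=%.1f -> tempo=%d pitch=%d vel=%d%n",
                    s[0], s[1], m[0], m[1], m[2]);
        }
    }
}
```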

A hybrid system based on generative/improvisation that includes an affective conversational model presents opportunities for future multi-disciplinary work, given that agent technology is only beginning to be explored.

6. REFERENCES

[1] Bigus, J.P., Bigus, J. 2001. Constructing Intelligent Agents in Java (Second Edition). John Wiley & Sons.
[2] Brown, A. 2002. "Opportunities for Evolutionary Music Composition". Proceedings of the Australasian Computer Music Association Conference, Melbourne: 27-34.
[3] Camurri, A. 1993. "Applications of Artificial Intelligence Methodologies and Tools for Music Description and Processing". In Haus, G. (Ed.) Music Processing. A-R Editions: 233-266.
[4] Chen, C., Kiss, J. 2003. "Setting Up a Self-Organized Multi-Agent System for the Creation of Sound and Visual Virtual Environments within the Framework of a Collective Interactivity". International Computer Music Conference, Singapore: 11-14.
[5] Cope, D. 1991. Computers and Musical Style. A-R Editions.
[6] Dang, T.T., Nguyen, T.G. 2002. Agent Platforms Evaluation and Comparison. Pellucid 5FP IST-2001-34519.
[7] Detlor, B., Serenko, A. 2002. "Agent Toolkits: A General Overview of the Market and an Assessment of Instructor Satisfaction with Utilizing Toolkits in the Classroom". School of Business, McMaster University, Ontario. Working Paper #455.
[8] Dorin, A. 2001. "Generative Processes and the Electronic Arts". Organised Sound 6(1): 47-53.
[9] Drogoul, A., Meurisse, T., Vanbergue, D. 2002. "Multi-Agent Based Simulation: Where Are the Agents?". Proceedings of MABS'02 (Multi-Agent Based Simulation), Bologna, Italy. LNCS, Springer-Verlag.
[10] Gang, D., Goldman, C., Rosenschein, J. 1999. "NegNet: A Connectionist-Agent Integrated System for Representing Musical Knowledge". Annals of Mathematics and Artificial Intelligence 25: 69-90.
[11] Kon, F., Ueda, L. 2004. "Andante: Composition and Performance with Mobile Musical Agents". International Computer Music Conference, Miami, Florida, 1-6 November: 604-607.
[12] Malt, M. 2004. "Khorwa: A Musical Experience with Autonomous Agents". International Computer Music Conference, Miami, Florida, 1-6 November: 59-62.
[13] Milicevic, M. 1998. "Deconstructing Musical Structure". Organised Sound 3(1): 27-34.
[14] Miranda, E.R. 2003. "A-Life and Musical Composition: A Brief Survey". IX Brazilian Symposium on Computer Music, Campinas, Brazil.
[15] Nakamura, K., Numao, M., Takagi, S. 2002. "CAUI Demonstration - Composing Music Based on Human Feelings". Proceedings of the American Association for Artificial Intelligence. www.aaai.org
[16] Nakayama, L., Wulfhorst, R., Vicari, R. 2003. "A Multi-Agent Approach for Musical Interactive System". AAMAS'03, Australia: 584-591.
[17] Paine, G. 2002. "Interactivity, Where to From Here?". Organised Sound 7(3): 295-304.
[18] Rowe, R. 1994. Interactive Music Systems: Machine Listening and Composing. London: MIT Press.
[19] Spicer, M., Tan, B.T.G., Tan, C. 2003. "The Learning Agent Based Interactive Performance System". International Computer Music Conference, Singapore: 95-99.
[20] Spicer, M. 2004. "AALIVENET: An Agent Based Distributed Interactive Composition Environment". International Computer Music Conference, Miami, Florida, 1-6 November: 211-214.
[21] Weinberg, G. 2002. "The Aesthetics, History, and Future Challenges of Interconnected Music Networks". International Computer Music Conference, Göteborg, Sweden: 349-356.
[22] Weiss, G. (Ed.) 2000. Multi-Agent Systems: A Modern Approach to Distributed Artificial Intelligence. MIT Press.
[23] Whalley, I. 2002. "Kansei in Non-linear Automated Composition: Semiotic Structure and Recombinicity". 11th International Symposium on Electronic Arts, Nagoya, Japan, 27-31 October: 62-65.
[24] Whalley, I. 2004. "Adding Machine Cognition to a Web-Based Interactive Composition". International Computer Music Conference, Miami, Florida, 1-6 November: 197-200.
[25] Winkler, T. 1998. Composing Interactive Music: Techniques and Ideas Using MAX. MIT Press.
[26] Wishart, T. (Emmerson, S., Ed.) 2002. On Sonic Art. London: Routledge.