ROBOSER - AN AUTONOMOUS INTERACTIVE MUSICAL COMPOSITION SYSTEM

Klaus C. Wassermann, Mark Blanchard, Ulysses Bernardet, Jônatas Manzolli2, Paul F.M.J. Verschure1

1 ETH and University of Zurich, Institute of Neuroinformatics, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
klausw@, jmb@, ulysses@,
2 Interdisciplinary Nucleus for Studies on Sound Communication, Rua da Reitoria, 165, 13036-300 University of Campinas, Brazil
jonatas

ABSTRACT
This paper describes Roboser, an autonomous interactive music composition system. The core of the system comprises two components: a program for simulating large-scale neural networks, and an algorithmic composition system. Both components operate in real-time. Data from sensors such as cameras, microphones and pressure sensors enter the simulated neural system, which is also used to actively control motor devices such as pan-tilt cameras and robots. The neural system relays data representing its current operational state to the algorithmic composition system. The composition system in turn generates musical expressions of these neural states within an a priori stylistic framework. The result is a real-time system controlled by a brain-like structure that behaves and interacts within a given environment and expresses its internal states through music.

INTRODUCTION
Algorithmic composition systems have evolved side by side with the development of Western music (Loy, 1988; Rowe, 1993). Ada, Lady Lovelace, who is considered one of the first computer scientists (Kurzweil, 1992), envisioned as early as the mid-19th century how the Analytical Engine designed by Charles Babbage could be applied to musical composition. However, it was not possible to explore the power of algorithmic composition systems on a broad basis until the advent of modern computers (Roads, 1996).

At present, two main trends can be observed in the development and application of novel technology to generate and compose music.
One is the shift from the notion of computer-assisted composition to that of interactive music systems (Krefeld, 1990; Ryan, 1991; Rowe, 1993; Paradiso, 1997; Rokeby, 1998; Paradiso, 1999). In these approaches the emphasis lies on using novel technology to create more intuitively accessible human-machine interfaces to physical or virtual musical instruments. A complementary approach aims at developing more advanced systems for the specification and generation of sound material (Dechelle et al., 1998; Wanderley et al., 1998). The goals of this approach are the synthesis of sonic events, the transformation of sound material produced by human musicians, and its integration with computer-generated material.

The Roboser project presented here represents an approach in the realm of autonomous music systems. Many recent algorithmic composition systems use random motion as a source of variation in their musical output. The goal of this project is to move away from random input by using a device that behaves in the real world as the input source to an algorithmic composition system. The system should compose and perform novel and appealing music in real-time, using sequential input that is generated by real-world behaviour. Roboser allows both sensory input and internal states of the control system to be expressed in an algorithmic composition process. It is controlled by a simulated nervous system and autonomously acts in and interacts with the world around it to produce musical compositions. These compositions express the way Roboser experiences its world.

SYSTEM ARCHITECTURE
The heart of Roboser comprises two software components (figure 1): the distributed real-time neural simulation environment IQR421 (Verschure, 1997), and CurvaSom, an algorithmic musical composition system,

which generates MIDI events in real-time (Manzolli and Maia, 1998).

Figure 1: Scheme of Roboser's architecture.

IQR421: Roboser's brain
Roboser's central control system is implemented as a simulated nervous system in the IQR421 environment. Using IQR421 to simulate neural network control structures has several advantages. The program provides real-time performance for processing sensory signals and motor commands in a neurally distributed way. It has a graphical user interface that facilitates programming complex neural architectures. Using the TCP/IP protocol, the computational load of neuronal subprocesses can be distributed over a network of standard PCs. User-defined modules make it possible to connect a large variety of input devices (e.g. CCD cameras, microphones, and pressure sensors) and output devices (e.g. video graphics, pan-tilt units, and robots) to a given neural control structure. The dynamics of neural activity are computed in real-time, a feature that is crucial for Roboser's performance and that most current neuronal simulation environments do not offer.

CurvaSom: Roboser's voice
CurvaSom aims at synthesizing a complex musical composition using a selection of predefined sound sets, taking advantage of the MIDI protocol. As opposed to a micro-level approach, where the emphasis lies mainly on synthesizing sound events from a variety of waveforms, MIDI allows us to adopt a macro-level approach to musical structure and expression. In addition, using MIDI as the protocol for sonic output provides us with compact representations of sound primitives, which greatly facilitates the system's real-time performance. CurvaSom is organized around a number of internal heuristics for sound generation, so-called sound functors (Manzolli and Maia, 1998), and a set of parametric sound specifications with which the system is seeded. For Roboser, CurvaSom was adapted to accept IQR421 neural input data.
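To illustrate how compact a MIDI sound primitive is, the following minimal Python sketch builds a three-byte note-on message and derives its pitch from a normalized activity value. Only the note-on byte layout (status byte 0x90 plus channel, then pitch and velocity) comes from the MIDI standard. The function names, the C major scale table, and the activity-to-pitch mapping are hypothetical and are not taken from CurvaSom or IQR421.

```python
# Hypothetical sketch: none of these names come from CurvaSom or IQR421.
# MIDI note numbers for one octave of C major (60 is middle C).
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]


def note_on(channel: int, pitch: int, velocity: int) -> bytes:
    """Build a 3-byte MIDI note-on message (status byte 0x90 | channel)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not (0 <= pitch <= 127 and 0 <= velocity <= 127):
        raise ValueError("pitch and velocity must be in 0..127")
    return bytes([0x90 | channel, pitch, velocity])


def activity_to_pitch(activity: float, scale=C_MAJOR) -> int:
    """Map an assumed activity value in [0, 1] to a note of the scale."""
    clamped = min(max(activity, 0.0), 1.0)
    # Clamp the index so that activity == 1.0 selects the last scale note.
    index = min(int(clamped * len(scale)), len(scale) - 1)
    return scale[index]


# One sound primitive: three bytes, whatever instrument renders it.
message = note_on(channel=0, pitch=activity_to_pitch(0.5), velocity=100)
print(message.hex())  # prints "904364"
```

Because each event is only a few bytes, a composition process can emit many of them per second at negligible cost, leaving the actual sound rendering to the attached MIDI devices.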
Figure 2: Schematic diagram of CurvaSom (diagram labels: IQR input, sonic domain, MIDI devices). f, sound functor.

The notion of sound functors (figure 2) is based on the mathematical theory of categories (Manzolli and Maia, 1998). In this approach a compositional process is divided into two components or domains: the control domain and the sonic domain. These domains are defined as mathematical representations of acoustic phenomena. The control domain is a parametric set, containing subsets specifying the numerical control of sound features such as tempo, scale or key. The sonic domain contains the set of specific sound features such as instrument, pitch, velocity, volume or pan, which can be controlled by the parameters of the control domain. The compositional process consists of translations from the control domain to

the sonic domain, called morphisms (Manzolli and Maia, 1998).

THE DEMO
The demonstration of Roboser includes the small mobile robot Khepera (K-team, Lausanne, Switzerland). It is equipped with eight infrared (IR) send-receive sensors that can also passively sense ambient light, two driven wheels and a small color CCD camera. The robot explores a small arena, avoiding collisions with objects and walls and approaching light sources (figure 3). All robot behaviours are controlled by a simulated nervous system within the IQR421 software environment (Verschure and Voegtlin, 1998). Besides neural representations of sensory processing stages, receptive fields and motor maps, there are also neural representations of global states like attraction and aversion. Visitors can actively influence the performance of the system by moving small objects within the robot's arena, or by attracting the robot with small torchlights. The dynamics of the neural control structure are translated into data that is fed into the CurvaSom composition software to generate musical expressions. Thus, Roboser communicates its current internal state to the audience (table 1).

Figure 3: Example of a path taken by the robot strolling within the arena. The robot steers away from walls and other obstacles, whereas it stays near a light source as long as it is lit. White circles, light sources; black squares, obstacles; *, start position; R, robot. Roboser's internal states: a, exploration state; b, aversion state; c, attraction state.

Table 1: Examples of musical features representing Roboser's internal states. a, exploration state; b, aversion state; c, attraction state.

state   musical features
a       simple melodies in a major scale
b       strong rhythmic accents, arpeggios of augmented chords
c       major chords, including maj.7th, 9th

DISCUSSION
Using the MIDI protocol provides us with a compact and computationally inexpensive way to generate sound. However, in principle, the neuronal activation states in IQR421 could also be used to synthesize sound structures by granular synthesis (Xenakis, 1960). One way to achieve this could be to map the activation of single neurons to single sound granules. We intend to explore strategies of sound synthesis using Roboser in the future.

Roboser is a flexible system that allows a variety of different setups. Since 1998 we have shown the system to the general public in various configurations. At the computer technology fair ORBIT '98 (Basel, Switzerland) we had a setup similar to the one described above. Visitors watched the robot moving and tried to attract the system's attention by shining light at the robot or making it avoid obstacles. The effect of the robot's behaviour upon the music seemed to motivate people to continue playing with the system. Later we also used completely different setups, including cameras mounted on the ceiling of dance halls, to produce interactive dance music (FRED '99, Zurich, Switzerland; CYBORG FRICTIONS '99, Bern, Switzerland, cyborgfrictions), and an interactive dance performance in collaboration with choreographer Malika Lum (BRAIN FAIR 2000, Zurich, Switzerland).

Rapid advances have been made in the technology to synthesize sonic events, and to define, influence, and algorithmically generate musical compositions. The most challenging question at present, however, is whether we can use this technology to autonomously generate musical compositions. In this context it has been argued that there is a critical barrier beyond which machines cannot go. For instance, Roads (1996) proposes that the human role in the compositional process is and will remain crucial. His argument is essentially an empirical one: if machines able to display creativity and virtuosity in musical composition could be constructed, they would already have been built and would have replaced humans. An additional argument is based on the observation that it is still a human being that defines the algorithms that perform the composition. This last problem, however, is

not specific to the area of algorithmic composition and haunts most traditional and contemporary research in artificial intelligence (Searle, 1982; Dreyfus, 1992; Verschure, 1993). In essence, it concerns the problem of defining artificial systems that can develop beyond their a priori specification (Verschure and Voegtlin, 1998).

Roboser represents a new approach to autonomous composition by generating musical structure out of the internal states of a real-time brain-like control system that operates in the real world. Instead of random motion or feedforward sensory input, the variation in the musical performance is provided by the operational states of the system. In turn, these operational states depend on the sensory input as well as on the behavioural performance of the system, forming a real-world feedback loop. By taking such a perspective and a synthetic approach it might be possible to develop an alternative view on the notion of creativity in the future.

REFERENCES
Dechelle, F., Borghesi, R., DeCecco, M., Maggi, E., Rovan, B. and Schnell, N. 1998. "jMax: a new JAVA-based editing and control system for real-time musical applications." Proceedings of ICMC: International Computer Music Conference, Ann Arbor, USA, October 1998.
Dreyfus, H.L. 1992. What Computers Still Can't Do. A Critique of Artificial Reason. Cambridge, MA: MIT Press.
Krefeld, V. 1990. "The Hand in the Web: An interview with Michel Waisvisz." Computer Music Journal 14(2): 28-33.
Kurzweil, R. 1992. The Age of Intelligent Machines. Cambridge, MA: MIT Press.
Loy, G. 1988. "Composing with computers - a survey of some compositional formalisms and music programming languages." Computer Music Research, ed. Mathews, M.V. and J.R. Pierce. Cambridge, MA: MIT Press.
Manzolli, J. and Maia Jr., A. 1998. "Sound Functors Applications." In Proceedings of the Fifth Brazilian Symposium on Computer Music, XVII Congress of the SBC, UFMG, Belo Horizonte, Brazil, pp. 115-120.
Paradiso, J. 1997. "Electronic Music Interfaces: New Ways to Play." IEEE Spectrum Magazine 34(12): 18-30.
Paradiso, J. 1999. "The Brain Opera Technology: New instruments and gestural sensors for musical interaction and performance." Journal of New Music Research 28(2): 130-149.
Roads, C. 1996. The Computer Music Tutorial. Cambridge, MA: MIT Press.
Rokeby, B. 1998. "The construction of experience: Interface as content." Published in Digital Illusions: Entertaining the future with high technology, ed. Association for Computing Machinery. Reading, MA: Addison-Wesley.
Rowe, R. 1993. Interactive Music Systems: Machine Listening and Composing. Cambridge, MA: MIT Press.
Ryan, J. 1991. "Some remarks on musical instrument design at STEIM." Contemporary Music Review 6(1): 3-17.
Searle, J. 1982. "Minds, brains and programs." Behavioral and Brain Sciences 3: 417-424.
Verschure, P.F.M.J. 1993. "Formal minds and biological brains." IEEE Expert 8(5): 66-75.
Verschure, P.F.M.J. 1997. "Xmorph: A software tool for the synthesis and analysis of neural systems." Technical report, Institute of Neuroinformatics, ETH-UZ.
Verschure, P.F.M.J. and Voegtlin, T. 1998. "A bottom-up approach towards the acquisition, retention, and expression of sequential representations: Distributed Adaptive Control III." Neural Networks 11: 1531-1549.
Wanderley, M., Schnell, N. and Rovan, J. 1998. "ESCHER - Modeling and Performing Composed Instruments in realtime." Proceedings of IEEE Int. Conference on Systems, Man and Cybernetics, San Diego, CA, USA, October 1998.
Xenakis, I. 1960. "Elements of stochastic music." Gravesaner Blätter 18: 84-105.