Evolutionary strategies for spontaneous man-machine interaction

Peter Beyls
peter@arti.vub.ac.be

Abstract

We provide a short introduction to a real-time performance system built from genetically evolved components. It aims to support sophisticated, rich forms of man-machine interaction in the musical domain. The evolutionary approach is motivated by the complexity barrier facing explicit design of complex dynamical systems. This paper reports on the architecture of a sensor-effector system controlled by an evolved neural structure.

Introduction

The present paper documents a long-term project aiming at the computer representation of a synthetic, expressive personality endowed with the expertise to support rich man-machine conversations in the musical domain. We briefly describe the developmental history of the Oscar (acronym for Oscillator Artist) application since its first incarnation in 1986 [Beyls, 1988]. The current object-oriented implementation incorporates ideas from knowledge-based systems as well as agent-inspired models. The basic idea is that Oscar tries to express its personal character while also pursuing integration into a larger social context. This results in a problem of conflict resolution, i.e. the continuous evaluation of expressive versus integrative forces. The program consists of some 40 modules, including a neural network for real-time tonal inference, a complex memory management scheme, and extensive feature detection and feature tracking algorithms for analysing incoming MIDI streams. Oscar uses various generative methods for producing output, including a physical model acting as a complex dynamical system and an ensemble of interacting musical agents (roughly based on Minsky's SOM theory), all having access to a large library of transformers. However, two unusual additional software components provide Oscar with exceptional flexibility.
First, an exploration module which automatically searches for interesting events in long-term memory (inspired by the pioneering work of Doug Lenat), and second, the possibility to evolve brains along the lines of current work in the discipline of artificial life. Oscar's brain is a sensor-activator network with sensors, neurons and activators configured in an arbitrary network.

Motivation

We perceive it to be impossible to design brains through explicit specification, both because their complexity is beyond human understanding and because we want to favour a computational climate aimed at the discovery of useful though unknown machine interpretations. We therefore opt for implicit evolution as an alternative to explicit design. In practice, families of brains are evolved, evaluated and attributed a certain fitness according to how well they perform. Brains (the connections and the weighting factors) are viewed as genetic material which is manipulated using cross-over and mutation operators. The objective is to evolve artificial brains which perform well when facing the unpredictable input of a human interactor. Our hope is to grow a system which emulates a personality-rich character and supports rich forms of spontaneous man-machine improvisational interaction based on shared initiative and common effort. Thus, our system does not feature a predefined fitness function but uses perceptual selection: in effect, a method to search for interesting brains without providing explicit criteria or without even
fully understanding the internal complexity of the evolved structures.

Previous and related genetic approaches

In contrast, some workers have suggested a socio-psychological model for synthetic actors in terms of personality traits, moods and attitudes [Rousseau & Hayes-Roth, 97]. While speaking to the imagination, it proved nearly impossible to interpret their findings in the realm of complex dynamical systems. Pioneering work on the interactive evolution of dynamical systems for computer graphics and animation is documented in [Sims, 91]. The Variations system suggests a method to produce a set of filters that identify acceptable material from a stochastic music generator [Jacob, 95]. The Flora system viewed the lookup table rules of one-dimensional cellular automata as genetic material. We devised methods to critically control degrees of complex chaotic behaviour and to interpolate between points in combinatorial space, as well as polyphonic mapping algorithms to MIDI. A more recent implementation views the rewrite rules of Lindenmayer systems, expressed as nested LISP functions of arbitrary complexity, as genotypes. Very complex yet coherent musical structures arise by interactively steering the system through genetic space [Beyls, 97].

Architecture

Oscar has a conversational attitude: it creates output (answers or comments, as it were) in response to the current musical context. Its inclination shifts between individual expression and social integration. Oscar's current incarnation blurs the distinction between symbolic and sub-symbolic approaches. Some of its components are rule-based, others adhere to the agent paradigm. Oscar's ear parses the input stream and feeds memory. The brain is thought of as a sensor-actuator network [Van de Panne & Fiume, 93]. The effectors control the functionality of eight agents producing MIDI output.
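As a rough illustration of the sensor-actuator idea, and of treating a brain's connections and weights as genetic material, a minimal Python sketch might look as follows. It is not the HMSL implementation: all names are hypothetical, the figures (36 sensors, 8 neurons, up to 5 sensor inputs per neuron, weights between -2 and 2 excluding 0) are those quoted elsewhere in this paper, neuron-to-neuron connections are left out for brevity, and cross-over at neuron boundaries with per-weight mutation is an assumed choice. No fitness is computed, since selection is perceptual.

```python
import random

N_SENSORS = 36   # sensor count quoted for the sample brain
N_NEURONS = 8    # neurons per brain
MAX_INPUTS = 5   # a neuron connects to up to 5 sensors


def random_weight(rng):
    """Draw a weight in [-2, 2], redrawing in the rare case 0 comes up."""
    w = 0.0
    while w == 0.0:
        w = rng.uniform(-2.0, 2.0)
    return w


def random_brain(rng):
    """A brain is a list of neurons; a neuron maps sensor index -> weight."""
    brain = []
    for _ in range(N_NEURONS):
        inputs = rng.sample(range(N_SENSORS), rng.randint(1, MAX_INPUTS))
        brain.append({i: random_weight(rng) for i in inputs})
    return brain


def activate(brain, features):
    """Weighted sum of the binary sensor outputs feeding each neuron."""
    return [sum(w * features[i] for i, w in neuron.items())
            for neuron in brain]


def crossover(a, b, rng):
    """One-point cross-over on whole neurons (assumed granularity)."""
    cut = rng.randint(1, N_NEURONS - 1)
    return [dict(neuron) for neuron in a[:cut] + b[cut:]]


def mutate(brain, rng, rate=0.1):
    """Redraw each connection weight with probability `rate`."""
    return [{i: (random_weight(rng) if rng.random() < rate else w)
             for i, w in neuron.items()}
            for neuron in brain]


# Usage: breed one child from two random parents and feed it a
# random binary feature vector.
rng = random.Random(0)
parents = [random_brain(rng), random_brain(rng)]
child = mutate(crossover(parents[0], parents[1], rng), rng)
features = [rng.randint(0, 1) for _ in range(N_SENSORS)]
print(activate(child, features))
```

In an interactive setting, the printed activations would be replaced by listening to the resulting behaviour and keeping the brains judged interesting.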
Oscar also contains a neural network for continuous tracking of the tonality of the MIDI input stream, similar to the one in Rowe's Cypher machine listening and composing system. MIDI output issues from an ensemble of virtual musicians, acting as a self-organising society of locally interacting agents.

Memory organisation

Inspired both by conventional psychology and by pragmatic computational arguments, Oscar uses three types of memory: short, intermediate and long term, each serving a specific use. One record holds four items: entry delay, duration of note, pitch and velocity. One basic idea is to find a working balance between exploration and exploitation. The latter means making full use of a given limited set of data, for instance to generate a musical answer from the analysis of these data. The exploratory mode, on the other hand, allows for large jumps in the interaction activity, for instance when aiming to redirect the interactional climate substantially. Exploration is a process of discovery and, in its most natural form, an unguided one, since no criteria are given as to what is considered to be musically interesting; a continuous classification process builds hierarchies to account for this. However, Oscar uses a set of explicit (albeit adaptive) rules while scanning long-term memory, thus isolating time points of major deviation. The information accumulated in long-term memory is explored occasionally, i.e. when Oscar gets bored with the quality of the current interaction or when intermediate memory happens to be empty. At that point, Oscar exhibits introspective behaviour. Intermediate memory is a circular buffer holding the 64 most recent input events; it is consulted heavily by most analysis tasks. It is also subject to deterioration; events fade
(and eventually disappear) at a rate proportional to their loudness; the fading process is parameter controlled. Finally, short-term memory is twofold. The input parser tries to isolate individual musical sentences in the input stream; the last two sequences are always stored in short-term memory. One obvious reactive application is to interpolate between them, creating a musical climate shifting back and forth in time. In addition, note values are collected in a pitch-class vector subject to activation-inhibition, and input values are collected in an interval transition matrix. Both methods provide additional sources for forming an opinion of the musical nature of the human interactor and are accessible by output generating methods.

Sensors

Sensors are object methods which, on a continuous basis, listen to the outside world. They fire automatically when certain features happen to be detected. Sensors are identified and designed explicitly; in other words, there is no automatic categorisation taking place. The current implementation is perceptive to about 40 features. Among them are the tendency and angularity of recent intervals, duration of notes, density and relationship of active and passive (rests) events in memory, chordal density, diversity and regularity of any parameter in intermediate memory, similarity of the last two identified input sequences, a sensor signalling when the human interactor remains silent for a long time, harmonic tension in the input stream, a beat tracker based on the detection of the regularity of velocity accents, ratio of last gap vs. last note, acceleration or deceleration, detection of the type of articulation of input material... and many more. The sensors also feed from a concurrent process performing statistical analysis on all parameters in a given memory: information on regularity, similarity, dynamics and contrast is available at all times.
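To make the relation between intermediate memory and a sensor concrete, here is a minimal Python sketch under stated assumptions. The 64-event circular buffer and the four-item record come from the text; everything else is hypothetical. That includes the class and function names, the reading of "a rate proportional to their loudness" as a per-tick velocity loss of `fade * velocity`, the `floor` below which an event disappears, and the example loudness sensor with its fixed threshold.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Event:
    delay: float      # entry delay
    duration: float   # duration of note
    pitch: int        # MIDI note number
    velocity: float   # loudness; reduced as the event fades


class IntermediateMemory:
    """Circular buffer of the most recent input events, with fading."""

    def __init__(self, capacity=64, fade=0.05, floor=1.0):
        self.events = deque(maxlen=capacity)  # oldest event drops out when full
        self.fade = fade    # hypothetical control parameter of the fading
        self.floor = floor  # hypothetical level below which an event vanishes

    def store(self, event):
        self.events.append(event)

    def tick(self):
        """Fade every event in proportion to its loudness, then prune."""
        for e in self.events:
            e.velocity -= self.fade * e.velocity
        survivors = [e for e in self.events if e.velocity >= self.floor]
        self.events = deque(survivors, maxlen=self.events.maxlen)


def loud_sensor(memory, threshold=80.0):
    """Example sensor: fires when mean loudness in memory exceeds a threshold."""
    if not memory.events:
        return False
    mean = sum(e.velocity for e in memory.events) / len(memory.events)
    return mean > threshold


# Usage: 70 events go in, only the 64 most recent remain.
memory = IntermediateMemory()
for pitch in range(70):
    memory.store(Event(0.25, 0.5, pitch, 100.0))
print(len(memory.events), loud_sensor(memory))
```

A sensor built this way has latency by construction, since its answer depends on what is still held in memory and not on the latest event alone.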
Some sensors exhibit built-in latency since they feed from intermediate or long-term memory. Other feature detectors are instantaneous, for instance the reflex function aimed at detecting very large swings in context. It fires when an extreme velocity gradient occurs simultaneously with a very large interval in the pitch domain. Note that sensors are Boolean functions which usually test whether data lie above or below a given (sometimes adaptive) critical threshold. Their output is collected in a binary feature vector which can potentially be consulted by any other module in the system. We keep copies of the current and previous vector so as to be able to track the short-term evolution of input activity. Sensors are provided for sensing change in tonality, loudness levels and pitch class, among other features.

Neurons

A neuron is a stimulus-response system connected to up to 5 different sensor objects; all connections are weighted (-2 to 2, excluding 0). Neurons continuously compute the weighted sum of the activity from their associated sensors. Neurons' instance variables, including the fitness value, are initially filled with random values. Families of neurons are generated, interactively evaluated and attributed a fitness (at this point only considering how well the effects of different stimuli combine), and interesting ones are saved to disk. Neurons show up in Oscar's main graphic interface as simulated LEDs.

Brains

A brain consists of a collection of 8 neurons. Neurons are configured in an arbitrary network with both one-way and two-way weighted connections, represented as a matrix specifying positive (excitation) and negative (inhibition) values. The complex interplay of stimuli guarantees non-trivial mapping. Groups of brains are computed and subject to
conventional genetic operators like cross-over and mutation. We try to attribute a fitness to a given brain by confronting it with various types of input material, a very subjective, experimental process indeed. First, the density and complexity of neuronal activity is studied and input-output relationships are observed. Second, the 8 neurons are connected to the 8 effectors.

Figure: a sample evolved brain. From bottom to top: 36 sensors, connected to 8 neurons configured in a network. Dots indicate points of neural excitation.

Effectors

Oscar's effectors are implemented as a collection of agents living in 2D space. All agents express a certain variable social affinity towards each other in terms of preferential distances; these can be thought of as the external physics imposed on this virtual world. The society as a whole tries to minimise the global stress by moving in 2D space. Agents are equipped with local rules (these are the actual effector functions) which they execute according to their sensitivity. An agent makes up its mind by evaluating three alternative input sources. First, its own individual character: agents hold a library of interval and other motivic material as well as private orchestral mapping algorithms to MIDI, plus a specific energy instance variable subject to fading and recovery. Second, any agent may consult the momentary collection of neighbouring agents and borrow musical material from one of them. Finally, memories of material provided from earlier interactions with a human interactor may be accessed. Each agent further acts as a self-regulating system: a performer object adjusting its sensitivity to input according to the amount of information in its working memory. Agents feed modified material from their own play buffers back to themselves while sensitivity is adjusted so as to avoid overflow. New material overwrites or mixes with existing buffer contents.
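The stress-minimising society of agents described under Effectors can be pictured with the following Python sketch. It is only one possible reading: the paper does not define the stress measure or the movement rule, so the squared difference between actual and preferred distances and the plain gradient-descent step used here are assumptions, as are all function and parameter names. Preferred distances are taken to be symmetric.

```python
import math
import random


def stress(positions, preferred):
    """Global stress: squared gap between actual and preferred distances."""
    total = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(positions[i], positions[j])
            total += (d - preferred[i][j]) ** 2
    return total


def step(positions, preferred, rate=0.01):
    """Move every agent a little along the negative gradient of the stress."""
    n = len(positions)
    moved = []
    for i in range(n):
        gx = gy = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            d = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            pull = 2.0 * (d - preferred[i][j]) / d
            gx += pull * dx
            gy += pull * dy
        moved.append((positions[i][0] - rate * gx,
                      positions[i][1] - rate * gy))
    return moved


# Usage: eight agents with random symmetric affinities settle over time.
rng = random.Random(1)
agents = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(8)]
affinity = [[0.0] * 8 for _ in range(8)]
for i in range(8):
    for j in range(i + 1, 8):
        affinity[i][j] = affinity[j][i] = rng.uniform(1.0, 5.0)

print("stress before:", stress(agents, affinity))
for _ in range(500):
    agents = step(agents, affinity)
print("stress after:", stress(agents, affinity))
```

Changing an entry of the affinity table while the loop runs would correspond to the variable social affinity mentioned in the text: the society is pushed out of equilibrium and starts moving again.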
A function which flushes the buffers occasionally is also present. This makes for an adaptive system which features an inherent history preservation mechanism. All agents have access to a very large library of parametric transformer methods whose use is, in effect, imposed by the output of the brain.

This version of Oscar was implemented in HMSL, a fine object-oriented software platform for musical experimentation developed at Mills College by Phil Burk, David Rosenboom & Larry Polansky.

Acknowledgements

I express my gratitude to Jeffrey Ventrella, Karl Sims, Joel Chadabe and Luc Steels for discussions and productive feedback.

References

1. Beyls, Peter. Aesthetic Navigation: musical complexity engineering using genetic algorithms. Proceedings of the JIME Conference, Lyon, France, 1997.
2. Sims, Karl. Interactive Evolution of Dynamical Systems. Proceedings of the First European Conference on Artificial Life, Paris, 1991.
3. Van de Panne, M. & Fiume, E. Sensor-Actuator Networks. Computer Graphics Annual Conference Series, 1993.
4. Rousseau, D. & Hayes-Roth, B. Interacting with Personality-rich Characters. Knowledge Systems Laboratory Report KSL97-07, Stanford Univ.
5. Jacob, Bruce. Composing with Genetic Algorithms. Proceedings of the ICMC, 1995.

ICMC Proceedings 1999, pp. 171-174