Learning to Play and Perform on Synthetic Instruments

John ffitch, Julian Padget
Department of Computer Science, University of Bath, UK

Abstract

Centuries of experience with acoustic instruments mean that there is a body of knowledge about, and understanding of, what can be done with a physical instrument, and this corpus of knowledge is refined and propagated through the interactions of teachers and pupils through the ages. That experience seems to have nothing to tell us about computer-created instruments and the one-off sounds for which such instruments are often designed, nor about the instrument's sonic character, which itself is rarely explored. The intuitive phrasing and shaping applied by a human player is certainly not even attempted. We advocate in this paper a new research direction, linking computer science and musical research, building on current activities in agent research and in performance knowledge, which could lead us to a new kind of electro-acoustic music, both in public performance and private listening.

1 Introduction

Man has been creating music since before written history; we do not know about the performance of the caveman's music, but it is clear that in many cultures the professional musician has been a feature for thousands of years. This has engendered the emergence of virtuoso performers on a wide range of instruments, where skills are passed from teacher to pupil, and are gradually adapted and improved to meet changing circumstances, such as instrument variation and mechanical inventions that modify the character of the instrument.

From the standpoint of a composer within the Western art tradition this means that, in writing for, say, a flute, one can assume that the performer will be able to reproduce the intent, except in the most extreme circumstances, without the composer having to explain everything.
We can write notes in the knowledge that the performer knows the traditional notation, and will play in tune with the appropriate interpretation of loudness, crescendos and all the common effects of the instrument. Even more strongly, we can assume that the instrumentalist can and will interpret the composer's intent in the light of a long tradition. When a composer writes for an unfamiliar instrument, the performer can say what is possible, identify effects that may be what is sought, and act more as a collaborator than a dumb servant. There is nothing new or exciting in this, as we would expect nothing less.

We wish to contrast this with the current state of electro-acoustic music, and in so doing show how recent technology can be used to overcome the problems that arise. We propose a line of research that can make the technical detail of computer composition easier and also identify a potential way in which the whole future of 'canned' music in the home, on CD or its replacement, can become in a significant way alive.

2 The Trials of the Electro-Acoustic Composer

2.1 Instrument Problems

While a composer may have a collection of favourite sounds, on most occasions the instruments (that is, the specifications of how the sound is created) are made anew for the composition, and the full character of such an artificial instrument may not be understood. Furthermore, in the composition only a small range of the potential is typically used. In the early days of the use of computers in composition, the long period between writing the program and listening to the resulting sound could serve as an excuse for, or an explanation of, the narrowness of the exploration. As our computers become more powerful, and at least some real-time synthesis is possible, it becomes more feasible for the composer to begin that exploration.
Boulanger has reported that, once the sounds he created for Trapped in Convert could be heard in real time, he discovered a much wider range of sounds than he had supposed, and so 14 years later he wrote At Last, Free, which uses the self-same instruments but in new sonic regions. But even so the time scale is insignificant compared with the years of experience of the main classical instruments, or even the more modern additions to the orchestra. In practice it is not possible for a professional composer to discover much about his created instrument in the time available. Often the result is imaginative music played by what might be conceived as an amateur or student band.

For a particular example of how hard this process can be one may look at the Chaotic Oscillator[4]. Its authors invented a particular non-linear process which they showed was capable of a range of interesting sounds, as well as wolf-whistles and cracked notes. With significant effort they managed to produce a few examples, but the exploration was hard work, and although the oscillator exists in Csound, it has never been released, as it is too experimental for composers to use. The related non-linear filter[5] is less prone to instabilities and so is available, but is still mainly unexplored.

Until some assistance is found for the exploration and understanding of instruments, the electro-acoustic composer will remain hamstrung by the complexities and the detail involved in working with synthesized sounds, which interfere with the larger-scale processes of working with the instruments that generate them and, beyond that, with the composition into which they fit. We propose below (section 4.1) a program of work to try to alleviate these problems.

2.2 Score Problems

In addition to the problems of learning to play the instrument, the composer has to consider elements which are obvious in the physical scenario. In particular, we identify the problems of phrasing and emphasis. Returning to our classical flute player, we could mark the score with a traditional phrase mark, and the performer would, almost without conscious effort, play the notes as a recognizable phrase; a skilled player may not even need the marks. Without the presence of an intelligent performer, our electro-acoustic composer has to describe the low-level elements of the phrase for each note. It is known that humans create phrases by a combination of speed changes, legato notes and amplitude envelopes[3].
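To make concrete what "describing the low-level elements of the phrase for each note" involves, the following is a minimal sketch in Python. It is not taken from any of the cited work: the `Note` record, the `shape_phrase` function, its parameter names and the sine-shaped arch are all assumptions chosen for illustration. It applies the three ingredients named above to a list of notes: a speed change (onsets are stretched towards the phrase boundaries), legato (each note is held slightly past the next onset) and an amplitude envelope (a swell towards the middle of the phrase).

```python
import math
from dataclasses import dataclass


@dataclass
class Note:
    start: float  # onset time in seconds
    dur: float    # sounding duration in seconds
    amp: float    # linear amplitude


def shape_phrase(notes, rubato=0.1, swell=0.2, legato=1.05):
    """Return new notes shaped as one phrase (hypothetical helper).

    rubato: extra stretch of the inter-onset gap at the phrase ends
    swell:  relative amplitude boost at the phrase midpoint
    legato: duration as a multiple of the gap to the next onset
    """
    n = len(notes)
    if n < 2:
        return list(notes)
    shaped = []
    clock = notes[0].start
    for i, note in enumerate(notes):
        pos = i / (n - 1)                    # 0 at phrase start, 1 at phrase end
        arch = math.sin(math.pi * pos)       # 0 at the ends, 1 in the middle
        stretch = 1.0 + rubato * (1.0 - arch)
        if i + 1 < n:
            gap = (notes[i + 1].start - note.start) * stretch
            dur = gap * legato               # overlap slightly into the next note
        else:
            gap = 0.0
            dur = note.dur * stretch         # last note: lengthen, nothing to overlap
        shaped.append(Note(clock, dur, note.amp * (1.0 + swell * arch)))
        clock += gap
    return shaped
```

Calling `shape_phrase` on five evenly spaced notes yields wider gaps at the two ends than in the middle, overlapping durations, and a louder central note. A real system would replace the fixed arch with rules derived from performance research.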
Research teams like that at KTH Stockholm[7] have investigated this phenomenon in great detail, and have even proposed a performance grammar (in the formal sense). What we need is a grammar, together with an implementation of its semantics as transformations of the amplitude and timing of notes, so that a performance on this new instrument can be expressive.

In writing for a small physical ensemble the composer needs to do very little to indicate in which part the tune lies, or which parts are adding textures rather than foreground events. We can call this the problem of balance between the components. Our overworked computer-composer has to adjust each note-event for balance between the parts, in addition to the existing problems of overall amplitude, as typified by ensuring that the audio material remains within the 16 bits of precision of CD/DAT technology.

Constructing performance details such as phrasing and balance can be a tedious exercise and acts to deter the use of the technology. While an established composer may merely find this tiresome, for the would-be composer it is likely to dampen any enthusiasm, and precipitate a change to other activities. But if balance and phrasing are not undertaken it is all too possible that some inadvertent masking of musical material will take place, or at worst, a form of homogenization through equal weighting that makes the piece less interesting.

3 Some Computer Science

For the past decade the topic of agents has been an emerging and extremely active area of computer science research in the overlap between artificial intelligence, distributed systems and software engineering[8]. The idea is to have a collection of autonomous and possibly distributed computer programs cooperating to solve a particular problem. Each agent may have particular areas of expertise, which it offers to the collection of agents whenever it sees a subproblem for which it is suited.
The decision on whether to accept or reject such help rests with the other agents, who are also attempting to solve the problem.

A particularly revealing application of agent technology can be seen in artificial societies, and the collaboration between them and humanity[10]. It is possible to think of the creation of language as a phenomenon between agents who wish to cooperate but are initially without shared experience on which to base a shared language. It has been shown by Steels[11] that language can emerge from the self-organization of such artificial life.

That may seem some distance from the problems posed in section 2, but this research into artificial life is exactly what is needed to fulfill our artistic needs. By creating artificial 'living' programs it may be possible to accelerate the learning of how to play the instruments, and with a little more mechanism attack the other problems.

4 The Synthetic Ensemble

The solution we are advocating to both the scoring and instrument design questions is to analyze the situation from the standpoint of computational agents[12]. Important characteristics of agents are quoted as autonomy, responsiveness, proactiveness and social ability: all of these have important parts to play in music making. We observe that music performance may be characterized as a stylized form of social interaction, where music replaces conversation (see footnote 1). Furthermore, the agents that take part in music performance make decisions about exactly how to play a note or a phrase, within the framework of a score (perhaps) and sometimes within the temporal framework defined by a conductor. Thus, in music-making we can observe instantiations of the abstract characteristics of agents listed above, and these sustain our belief in the value of exploring musical synthesis within an agent framework[2].

The autonomous agents in a musical performance are the individual instrumentalists, and possibly a conductor who is interpreting the intentions of the composer, or adding an individual stylistic variation.

4.1 The Synthetic Performer

The synthetic instruments should be played by synthetic players in the guise of agents. These performance agents should explore their virtual instruments, and offer examples and feedback to the composer. This conversation needs to be at a moderately high level of abstraction, labelling sonic regions as interesting, nasty, acceptable and so forth, and the numerical parameters which capture this should not be immediately communicated to the composer.

Similarly, the performer agent can learn the relationship between the amplitude parameter and the output amplitude achieved. This is preparation for being able to interpret instructions to play a scale at uniform loudness, or to adjust for balancing. In effect the agent would start the process of becoming a proficient player, or perhaps even a virtuoso.

4.2 The Conductor

The conductor agent exists to mediate in the performance between the composer's intentions and the vocabulary of the performers.
It is the conductor who should correct balance issues, instruct performers as to the placing of phrases, and act as a communication conduit to the composer. We view the interaction as being at a musical level; instructions like "play this section staccato" are greatly to be preferred to "for this section reduce the note duration by 5%, unless you are the melody, in which case that should be 4%", or, even worse, a list of detailed event-by-event numerical changes.

Footnote 1: To use anthropomorphic terms, but one could argue they are indistinguishable from an ontological perspective.

5 The Current State

This suggestion for a future vision is not just wishful thinking. We have made some progress towards this aim; here we give only a general view of this work.

So far we have concentrated on phrasing. As we know roughly how human players create phrases this seems a simple starting place, but one which can also demonstrate the validity of the approach. We have an experimental version of Csound[1] which, out of real time, reacts to the identification of phrases. It then rewrites the score to apply some simple time modulation and an amplitude envelope. Currently the command format allows the composer to select from a small number of particular phrasing methods. We have only used this system so far to demonstrate the concept on traditional music.

Similarly, live performers emphasize the main melodic line by playing a little louder, and by playing just before the beat. Our system can imitate this, albeit in a rigid way at present. This needs developing to be adaptive and reactive to the type of music.

We are also starting a more computational experiment to construct a computer agent architecture that is capable of learning the instruments written in Csound, and so of reacting in a high-level manner to instructions from a composer or conductor, such as "louder" or "more staccato". At the time of writing this work is only just starting and it is premature to say much about its success.
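As a rough illustration of what learning an instrument could mean for the amplitude parameter alone, consider the following Python sketch. It does not describe our agent architecture: `toy_instrument`, `PerformerAgent`, the saturating response curve and the 25% loudness step are all assumptions made for the example. The agent probes the instrument over a grid of parameter values, records the output level achieved, and then inverts that table by linear interpolation so that a word such as "louder" can be turned into a parameter value.

```python
import math
from bisect import bisect_left


def toy_instrument(amp_param):
    """Stand-in for a synthetic instrument (an assumption for this sketch):
    the output level saturates as the amplitude parameter grows."""
    return math.tanh(2.0 * amp_param)


class PerformerAgent:
    """Hypothetical agent that learns parameter -> level by exploration."""

    def __init__(self, instrument, steps=50, max_param=2.0):
        # Explore: probe the instrument on a grid and remember the response.
        self.params = [max_param * i / steps for i in range(steps + 1)]
        self.levels = [instrument(p) for p in self.params]
        self.target = 0.5  # current desired output level

    def param_for(self, level):
        """Invert the learned response by linear interpolation.
        Assumes the measured levels strictly increase with the parameter."""
        level = min(max(level, self.levels[0]), self.levels[-1])
        i = bisect_left(self.levels, level)
        if i == 0:
            return self.params[0]
        l0, l1 = self.levels[i - 1], self.levels[i]
        p0, p1 = self.params[i - 1], self.params[i]
        return p0 + (p1 - p0) * (level - l0) / (l1 - l0)

    def command(self, word):
        """React to a high-level instruction; return the new parameter value."""
        if word == "louder":
            self.target *= 1.25
        elif word == "softer":
            self.target /= 1.25
        else:
            raise ValueError(f"unknown instruction: {word}")
        self.target = min(self.target, self.levels[-1])
        return self.param_for(self.target)
```

The point of the design is that the composer or conductor only ever says "louder"; the numerical parameter needed to achieve it on this particular instrument stays inside the agent. A response that is not monotonic would need a richer model than a sorted lookup table.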
We are using a standard agent software system, and initially are not contemplating real-time operation.

6 The Vision

The process we have described so far has been aimed at the composer, a person for whom we have a special respect. But the underlying autonomous agent technology can be used for an even more exciting possible future. Imagine some scenes some years from now:

The composer is struggling with the new piece. Somehow the ending is not the right timbre. The player-agent suggests some variants of the sound that it can perform, and gives examples. Suddenly, that is the lost chord! All that remains is for the conductor to assist with the balancing and to limit the range of performance parameters, and then to deliver the commission and receive the money.

Our music-lover decides to listen to the new musical work. He picks up the control console and directs that the music begin. Somehow the performance seems a little too relaxed for his mood, so he uses the console to ask the conductor agent to change the music to be more strident; within moments the music has changed as the conductor agent acts on the instructions from the communication network. The individual players, who are dealing with the individual lines, each change their performance on command from the conductor. The listener considers a fine adjustment, but the conductor has captured his intent, which is not surprising as the conductor has been learning this individual's means of expression, and has been adjusting the unseen model accordingly.

Clearly this is currently well beyond our capabilities, but if the music is encoded not as simple channels, but as individual instruction streams for each instrument, rather like the premix in an orchestral recording, together with general control information, the performance agents can adjust within a range set by the composer or music producer for personal preference, room acoustics, or temporal circumstances.

There is a range of interesting questions in this scenario. The composers will need to give ranges of performance detail (possibly assisted by composer assistant agents), and there are human problems concerning whether the listener should be able to override the artist's intentions. But what this suggests is that what we currently have as 'canned' music could be a live experience, perhaps never repeating the same performance, but varying in a manner much more determined than the arbitrary application of random numbers[6].

The system could also make concerts involving tape music more compelling, as one would hear the sound projectionist's interpretation, without the additional problems of diffusion. Most of the technology already exists, and is waiting to be used to this end.
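The notion of performance agents adjusting "within a range set by the composer" can be pictured with a small Python sketch. Everything in it is hypothetical: the `Range` and `Part` records, the choice of `gain` and `articulation` as the adjustable quantities, and the `apply_request` function are assumptions for illustration, not a proposed encoding.

```python
from dataclasses import dataclass


@dataclass
class Range:
    low: float
    high: float

    def clamp(self, x):
        """Limit x to the interval the composer allows."""
        return max(self.low, min(self.high, x))


@dataclass
class Part:
    name: str
    gain: Range          # permitted loudness multipliers for this line
    articulation: Range  # permitted note-duration multipliers


def apply_request(parts, gain, articulation):
    """Map one listener request onto every part, limited per part.

    Returns {part name: (gain, articulation)} actually used.
    """
    return {
        p.name: (p.gain.clamp(gain), p.articulation.clamp(articulation))
        for p in parts
    }


# Hypothetical piece: the composer allows the melody less freedom than the texture.
parts = [
    Part("melody", gain=Range(0.8, 1.2), articulation=Range(0.9, 1.0)),
    Part("texture", gain=Range(0.5, 1.0), articulation=Range(0.7, 1.0)),
]
settings = apply_request(parts, gain=1.5, articulation=0.6)
```

Here the listener asks for more than the composer permits, and each line moves only as far as its own range allows. Whether such limits should be overridable is exactly the human question raised above.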
Already real-time modification of synthesis has been demonstrated[9]; we are suggesting that such systems can be controlled by agents that have learnt something of the musical world.

7 Conclusion

We have advocated a program of research which applies the significant advances made in software agents to music, and especially to electro-acoustic composition. A number of the problems which limit the development of composers can be alleviated by such an approach. We have made a small start in this work, concentrating on phrasing, but with will (and funding!) this could revolutionize not only electro-acoustic composition, but also the consumption of music within the domestic environment. This is our vision for a musical future.

References

[1] Richard Boulanger, editor. The Csound Book: Tutorials in Software Synthesis and Sound Design. MIT Press, 2000.
[2] Richard Boulanger and John ffitch. Teaching software synthesis: from hardware to software synthesis. In KlangArt 1999 - New Music Technology, 2001. To appear.
[3] E. Clarke and W. L. Windsor. Expressive timing and dynamics in real and artificial music performances: using an algorithm as an analytical tool. Music Perception, 15:127-152, 1997.
[4] R. W. Dobson and J. P. Fitch. Experiments with chaotic oscillators. In ICMC'95: Digital Playgrounds, Banff, Canada, pages 45-48. ICMA and Banff Centre for the Arts, September 1995.
[5] Richard Dobson and John ffitch. Experiments with non-linear filters; discovering excitable regions. In On the Edge, pages 405-408. ICMA and HKUST, August 1996.
[6] John ffitch. A look at random numbers, noise and chaos with Csound. In Richard Boulanger, editor, The Csound Book: Tutorials in Software Synthesis and Sound Design, chapter 16. MIT Press, February 2000.
[7] A. Friberg and J. Sundberg. Grammars for Music Performance. KTH, 1995.
[8] H. S. Nwana. Software agents: An overview. Knowledge Engineering Review, 11(3):205-244, September 1996.
[9] François Pachet, Olivier Delerue, and Peter Hanappe. Dynamic audio mixing. In Ioannis Zannos, editor, ICMC2000, pages 133-136. ICMA, August 2000.
[10] Julian Padget, editor. Collaboration between Human and Artificial Societies. Number 1624 in LNAI. Springer-Verlag, 1999.
[11] Luc Steels. Synthesising the origins of language and meaning using co-evolution, self-organisation and level formation. Technical report, Artificial Intelligence Laboratory, Vrije Universiteit Brussel, 1996.
[12] M. Wooldridge and N. Jennings. Agent Theories, Architectures, and Languages: A Survey, volume 890 of LNAI. Springer-Verlag, 1994.