Questions About Interactive Computer Performance as a Resource for Music Composition and Problems of Musicality

Dr. Noel Zahler, Director
Center for Arts and Technology
The Cummings Electronic and Digital Sound Studio (CEDS)
Box 5632, Connecticut College, New London, Connecticut 06320
E-mail: nbzah@conncoll.bitnet

Abstract

History has recorded the fascination people have had with mechanized performance for centuries (see Roads [10]), but it is only in our own century - more specifically, only in the last seven years - that we have been able to bridge the technological gap which separated us from our desire for an intelligent, interactive, real-time computer performer: a computer performer that can follow a score, adjust its tempo and dynamics to accommodate the interpretation of a live performer, and, in some instances, contribute to the "ensemble" in ways that simulate what a live human instrumentalist does when performing. To date, at least four individual approaches to building such a computer performer have come forward. They range in scale from a tool for conducting a computer orchestra to a full-scale "synthetic performer" capable of listening to, learning from, and anticipating the live performer with which it is playing (see Boulanger [6] and [7]). But have we really simulated the behavior of human performers during a live performance, and how do these "synthetic performers" fare when the performance experiments are taken out of the laboratory and placed in the arena which is live performance? More importantly, what are the demands of living composers on such new compositional resources? This article centers on my own work and work done in collaboration with my colleagues Bridget Baird and Donald Blevins in creating what we call the "Artificially Intelligent Computer Performer" (AICP). Previous publications (see Baird, Blevins, and Zahler [1], [2], [3] and [4]) have centered on the creation and workings of the program.
This essay describes the problems encountered when writing music for this new breed of computer instrument.

Introduction

Those who have worked with technological aids have always posited a view which argues that the use of such devices should make work easier and give individuals more "leisure time." The purpose remains the same in the realm of music. Ensemble music - music which requires two or more performers to realize a composition - is an activity which demands a larger set of contingencies than most other art forms. Individuals wishing to play a composition involving two or more instruments must consider when, and if, others with the necessary instruments to complete the ensemble are available. They must inquire as to the technical skill of the other instrumentalists and be sure that the literature to be performed is not beyond the expertise of those individuals. They must find a mutually agreeable time and place to rehearse and perform the composition and be confident that there are no overriding interpretive or philosophical differences of opinion that might derail the performance. These are just a few of the considerations encountered when putting together a music ensemble, and the problems grow geometrically as the size of the ensemble increases. It should come as no surprise that there is a real need for a nonhuman performer that is able, as a minimum requirement, to satisfy the criteria listed above. In addition, such a performer should be capable of executing any number of instrumental parts, regardless of their difficulty; it should be portable; and, above all, it should be responsive to the interpretive demands of the live instrumentalist. Earlier alternatives such as pre-recorded accompaniments were static; they did not accommodate a human performer's given interpretation. "Live computer music performance," wherein a human performer plays along with a sequenced accompaniment, is not an alternative.
Sequencers are no more than pre-recorded digital files that substitute for magnetic tape or phonograph recordings. The "live" portion of any such performance is distinguished from tradition only in terms of the type of instrument employed, and once the sequence is triggered, tempo, dynamics, etc. are immutable. The interactive real-time intelligent computer performer is the first real alternative to these earlier attempts to satisfy the ensemble needs of individual performers.

The AICP

The AICP is a software program, written in the C language, which holds in memory a complete musical score. That score is entered into the memory of the computer with an "off the shelf" music notation program1. The AICP listens, via MIDI, to a live performer performing in real time, and tracks that performer, trying to match the performer's output to the score in its memory. Once it has a "best guess" as to where the live performer is, it responds with its own part as designated by the score. The AICP is a Macintosh application which is presently configured to run on a Mac II or SE/30 with four megabytes of RAM. While the notion of such a program is not original in concept (see Vercoe [12], Vercoe and Puckette [13], Bloch and Dannenberg [5], Dannenberg [8], and Dannenberg and Mukaino [9]), the AICP offers a number of interesting alternatives. The AICP was developed to accompany a single human performer in a composition of that individual's choice. It has been designed to accommodate compositions from the traditional literature, as well as contemporary compositions wherein the composer has chosen to write for a human performer in combination with the sounds now made possible by technological advances in digital sound synthesis. It performs according to the instructions in the composer's score without stylistic bias. In essence, it is one solution to providing an ensemble to play traditional literature with a single live performer.
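The program's source is not reproduced in this essay, but the "best guess" tracking idea just described can be illustrated with a minimal sketch in C. Everything below (the function name, the window-voting strategy) is a hypothetical illustration of the general technique, not the AICP's actual score-matching algorithm: slide a window of recently heard MIDI pitch numbers along the stored score and report the position where the most pitches agree.

```c
#include <stddef.h>

/* Hypothetical sketch: slide a window of recently heard MIDI pitches
   along the stored score and return the index where the most pitches
   agree -- a "best guess" at the live performer's position.  A wrong
   note or two heard from the performer still yields the right place. */
size_t best_guess(const int *score, size_t score_len,
                  const int *heard, size_t heard_len)
{
    size_t best_pos = 0;
    int best_hits = -1;
    for (size_t pos = 0; pos + heard_len <= score_len; pos++) {
        int hits = 0;
        for (size_t i = 0; i < heard_len; i++)
            if (score[pos + i] == heard[i])
                hits++;
        if (hits > best_hits) {   /* earliest maximum wins */
            best_hits = hits;
            best_pos = pos;
        }
    }
    return best_pos;
}
```

A real tracker must of course also weigh note durations, cope with omitted and inserted notes, and bound its search near the last known position; this sketch shows only the voting idea.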
It is also a "meta-performer" capable of triggering multiple events, via MIDI, for the contemporary composer who is compelled to combine a traditional human instrumentalist with the new capabilities

1 We presently use Finale™ by Coda Music Software, but most "type 1" and "type 0" MIDI files are acceptable.

ICMC 336

of computer instruments, and to whom it is essential to retain the freedom of interpretation usually attributed to live human performance. In each case, the requirements remain the same for the computer performer; only the outboard requirements (synthesizer hardware and software interfaces) change. The integrity of human interpretive performance, eliminating the "music-minus-one syndrome," and the means by which human performers achieve that interpretation remain the single most important factors in our work. Modeling this knowledge base remains a massive problem. Human modes of cognition are so numerous that conceptualization of these issues requires simplification of the actions and a consequent reduction in the data to be analyzed. In the arts (in particular music performance) this scenario becomes even more complicated. The number of tasks performed by each player, physically as well as mentally, and the myriad of possible consequent actions and strategies, make the problem all the more enigmatic to conceptualize as specific performance circumstances change. Practical considerations have led us to break down specific areas of action into modules simulating what we believe are possible accounts of these mental and physical processes. At present these include a "pre-performance score consultation," a "pattern-matching" algorithm, a "score-matching" algorithm, a library of "possible types of score matches," an "adjust tempo" algorithm, an "amplitude-averaging" algorithm, and a "lost-live-performer" algorithm. This method of "farming out" what we believe are some of the most important and most frequent problems encountered during live performance is not a sufficient model of the human cognitive capacity during performance, but it seems to be an effective strategy for our present purpose (see Baird, Blevins, and Zahler [1] and [3]). These modules are called up prior to performance or during performance.
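The kind of work one of these modules does can be suggested by a small sketch of a tempo-adjustment step. The function name and the smoothing scheme below are assumptions of mine for illustration, not the AICP's actual code: compare the time the performer actually took over some span of the score with the time the current tempo predicts, and nudge the accompaniment tempo toward the tempo the performer implied.

```c
/* Hypothetical sketch of an "adjust tempo" step: given that the live
   performer just covered `score_beats` beats of the score in
   `elapsed_seconds` of real time, compute the tempo that interval
   implies and move the accompaniment part of the way toward it.
   `smoothing` in [0,1] guards against over-reacting to a single
   rushed or delayed note. */
double adjust_tempo(double current_bpm, double score_beats,
                    double elapsed_seconds, double smoothing)
{
    double implied_bpm = 60.0 * score_beats / elapsed_seconds;
    return current_bpm + smoothing * (implied_bpm - current_bpm);
}
```

With smoothing of 1.0 the accompaniment locks instantly to the performer; small values give the "inertia" a human accompanist shows before committing to a new tempo.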
The "pre-performance score consultation" is called when a score is loaded into the memory of the AICP. It enables the AICP to read a musical score and respond to the notation in that score in a manner not unlike its human counterpart. All other modules are called in "real time." The library of "possible performance behaviors" references four possible performance postures in reaction to a "best guess" match: "verbatim," "amalgamated," "held through," and "rest match" (see Baird, Blevins, Zahler [2]). The "score-matching" algorithm is responsible for recognizing where the live performer is in the composer's score; the "adjust tempo" algorithm is responsible for adjusting the rate of speed at which the accompaniment is performed; the "amplitude averaging" algorithm scales amplitude values according to the dynamics played by the human performer; and the "lost-live-performer" routine calls a sequence of actions to be performed by the AICP when it has received sufficient evidence that the live performer has lost his/her place in the score. Our task, then, has been to analyze the processes by which performers make choices when they are actively engaged in a live performance of chamber music and to generalize those processes in a meaningful and useful adaptation for the computer.

Composing for the AICP and the Question of Musicality

When creating any new machine-oriented performing tool we need to consider how and why it should be used. The need for an interactive computer performer has been demonstrated above, but once we have such a performer it is incumbent on us to justify its usefulness through the creation of a literature which at once interests and challenges the composer, the human performer, and the listener. Following in the footsteps of many others, we chose the flute for our own composition in combination with the AICP. The combination seemed a logical one.
Since the AICP will only accept monophonic input from the live performer, the flute (discounting the use of multiphonics) seemed a natural instrument with which to test the capabilities of the program. The relative simplicity of the pitched signal emitted by the flute makes for easy tracking of pitch through a "pitch-to-MIDI converter," and the availability of a number of highly skilled and respected flutists supported the decision to write for this combination. Reasonable theoretical assumptions and the imaginative capacities of composers sometimes make "strange bedfellows." The first two measures of the composition ultimately realized (example No. 1) pointed the way to a composition which, while it would certainly challenge the capacities of the program, possibly asked more questions about its use as a compositional resource than it answered. The interpretation of this score segment, especially in terms of MIDI data, brings forth a number of interesting problems regarding the program's capabilities and how composers think about the performers they employ.

Example No. 1: [score excerpt: the opening two measures of the flute part, not legibly reproduced here]

For a skilled human performer, decoding this passage presents little challenge. In fact, the only note of clarification needed would be an explanation of the "X" note heads as percussive attacks using key slaps (an idiosyncrasy of the composer). But what of the AICP? What knowledge does it come to a rehearsal or performance with to sort out even as tame a departure from convention as this? As mentioned above, we have tried to equip the AICP with the necessary knowledge to gain as much from a printed score as its human counterpart. Upon loading a score into the program, a "pre-performance score consultation" is carried out by the AICP.
This consultation allows the AICP to learn the performance specificities of the composition: tempo; dynamics; what the common-practice use of "key" (or lack of it) is; where tonic cadences are found (if they exist); the value of the smallest rest and shortest note as dictated by the composition in conjunction with the tempo; the range of gross tempo fluctuation to be allowed; the location of possible "jump points"; etc. The score, as the composer's blueprint for performance, is the source for any possible performance of, and the guide to any deviation from, the composition under consideration. In order to monitor the feedback loop which is obligatory for any ensemble performance situation, the artificially intelligent performer must have the capability to keep track of, and make decisions about, how the performance is progressing. Consequently, we include modules such as the library of "possible performance behaviors," which, through a number of different scenarios (during performance), justifies the actions of the human performer when that player deviates from the notated score in ways that do not significantly undermine the progress of the composition. The "amplitude-averaging algorithm," called during the slack time between ticks of the computer's clock, balances the amplitude of the signals controlled by the AICP when the human performer deviates from the dynamics indicated in the score.
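The amplitude-averaging idea can be sketched as follows. This is a hypothetical illustration under my own assumptions (a running average of the performer's recent MIDI velocities compared with the velocity the score led us to expect), not the AICP's actual routine:

```c
/* Hypothetical sketch of "amplitude averaging": scale the velocity the
   accompaniment was going to play by the ratio of the performer's
   recent average MIDI velocity to the velocity the score expected of
   the performer, clamping to the legal note-on velocity range 1..127. */
int scaled_velocity(int accomp_score_vel, double performer_avg,
                    double performer_expected)
{
    double v = accomp_score_vel * (performer_avg / performer_expected);
    if (v < 1.0)   v = 1.0;    /* velocity 0 would mean note-off */
    if (v > 127.0) v = 127.0;
    return (int)(v + 0.5);     /* round to nearest integer */
}
```

If the flutist plays a passage half again as loudly as marked, the accompaniment's velocities are raised in the same proportion, which is roughly what a sensitive human accompanist does by ear.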

Unfortunately, these safeguards still allow a great many features of performance to go unattended. They do not acknowledge the importance or consequence of other subtleties which contribute to interpretive performance. A communication gap exists between the composer's instructions for performance, the score, and the actual physical requirements of bringing forth sound from acoustic instruments. Looking again at example 1, in addition to the usual concerns for pitch, duration, time, and amplitude, the AICP must ask: what does the "X" note head imply; how is the fermata with the superscript breve to be interpreted; how are the smaller notes ("grace notes") to be included in the performance; what do the slashes on the notes denote; what does the dot above the note mean; and how is the fermata without the breve superscript different in performance from the earlier encountered fermata? To be sure, other questions arise as well, but we must recognize the fact that our accepted music notation makes reference to a large number of performance techniques that are implicit in the definitions refined in the creation of the composer's score. These techniques are inseparable from the composer's instructions and are understood as such by virtue of the call for a particular kind of sound. We must ask ourselves what consequence these implicit techniques have in performance and how we may bridge the gap between the physicality of performance and the composer's score. If the composer has notated the score in such a way as to define his expectations for performance in MIDI, many of these questions are already answered for the AICP. Functions most easily stipulated in the time and pitch fields - where they incorporate the damping of note values, the prolongation of individual note values, and changes of pitch and amplitude over time - supply some of the information necessary for an acceptable performance.
When indications for mode of attack lie outside the time and pitch domain, or become terribly complex exaggerations of these values, things become fuzzier. We can easily program many of these note attributes to synthesize desired sounds, but the relationship is not reciprocal. The operation of our AICP depends on tracking a signal represented by a MIDI stream through parameters of pitch and time; any distortion of that signal leads to significant problems regarding the capture of event-based data and the correlation of live performance to score. In addition, while encoding MIDI values for as many of these techniques as possible is a start, it is not a solution; it only avoids our asking ourselves more difficult questions. Looking further on in the composition we find other problems posed by the demands composers make on traditional instruments. Our notation reflects modifications of the flute's timbral components through the introduction of noise, "breathiness," and harmonics, an alteration of the natural harmonic series of the instrument (example 2).

Example 2: [score excerpt: timbral modifications of the flute part, not legibly reproduced here]

These examples carry with them performance expectations which lie beyond simple pitch/time-domain coordinates. An analysis of these timbral alterations reveals the intrusion of non-linear components into a system that demands the flow of a continuous stream of well-defined event-based data, the kind of data for which the MIDI protocol was essentially created. In the world of live performance, composers take for granted the physical consequences these techniques have on performers and their instruments. They are, in many cases, manifested by time and pitch transients too subtle to be communicated effectively through the MIDI bandwidth, rendering MIDI an inadequate medium of communication for such subtleties.
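The limitation can be made concrete by writing down what a MIDI note-on/note-off pair actually communicates. The struct below is my own sketch for illustration (it is neither a Standard MIDI File structure nor an AICP data structure): channel, pitch, velocity, and timing exhaust the event, so a key slap, breath noise, or gradual timbral shift simply has no field to occupy.

```c
/* Sketch of the event-based data a MIDI note message pair carries.
   Everything a tracker built on such events can ask is event-shaped;
   continuous timbral change falls outside the representation. */
typedef struct {
    unsigned char channel;    /* 0..15                               */
    unsigned char pitch;      /* 0..127, equal-tempered note number  */
    unsigned char velocity;   /* 0..127, attack strength only        */
    double        onset_s;    /* when the note began, in seconds     */
    double        duration_s; /* onset-to-release time               */
} NoteEvent;

/* A score follower can compare pitches, onsets, and velocities... */
int same_pitch(const NoteEvent *a, const NoteEvent *b)
{
    return a->pitch == b->pitch;
}
/* ...but no comparable predicate exists for "breathier than notated". */
```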
Instrumental performers have traditionally compensated for the difficulties encountered in producing such timbral transformations, but these intuitive practices are learned as idiomatic expressions specific to the instrument under consideration. Any serious inquiry as to how to generalize such a phenomenon begs a broader question: does the encoding of the pitch, time, timbre (in extremely elementary ways), and velocity components of a work result in sufficient information for an interactive performance of a composition faithful to the composer's conception of the work? It is especially evident that real-time timbral variations in acoustic instruments are not being dealt with today in ways that allow a specification such as MIDI to communicate these variations effectively in any useful way. As we begin to produce more and more impressive machine performances, the knowledge base on which we draw for them begins to look inadequate. Similar problems have arisen in the past. Our colleagues engaged in digital signal processing and sound synthesis struggled for years to find appropriate algorithms to simulate the complexity of timbre found in string instruments like the violin. The motion of the bow, pulled and pushed over the strings at a constantly changing velocity and pressure while the left hand simultaneously varies the frequency of the vibrato, made it impossible to find a "synthetic" solution to the problem of generating such a sound through the synthesis methods available at the time. Cheap real-time "sampling" was a breakthrough that changed the way we thought about the problem, and it provided new data with which to model new synthesis methods.
That data produced models which, when taken together with basic psychoacoustical knowledge defining perceptual properties of spectra and mode-of-attack transients, allowed us to produce enormously realistic copies of acoustic instruments while "cheating" on the data originally thought necessary for such sounds (see Strawn [11]). The idea of "cheating" on the data necessary for producing a particular sound or type of event may be a necessary step in our quest to model the natural world and human behaviors, but again, it is not a solution. We can get the AICP to react to a live performer in what we deem an acceptable mode of performance, but are we modeling the behavior necessary to simulate performance by a live instrumentalist - human behavior? Yes and no. It should be clear that artificially supplying our AICP with information which allows it to filter certain behaviors of the ambient sound transmission of our human performer gets us an acceptable performance, a performance which might even pass as a successful example of a Turing test, but it is the same sort of "cheating" that has taken place elsewhere in our discipline, and it does not answer the essential questions about the processes of the mind during human music performance. If we accept programs like the AICP as members of the instrumental ensemble, replete with the limitations that other, more traditional, instruments have, "confusing the means with the end" might not seem a terrible price for what we have. If we are truly trying to model the human knowledge base for this behavior, we have quite another story.

The interaction between human performer and "synthetic performer," to date, is an elementary approximation of the actual tracking capacities used by human performers. The following excerpt from our composition (example 3) shows the flute part accompanied by the AICP (for convenience, the AICP part has been gathered into a "grand staff" without regard for the various timbres it is using).

Example 3: [score excerpt: flute part with the AICP accompaniment on a grand staff, not legibly reproduced here]

When faced with the coordination of two parts such as these, we must ask ourselves what obligation each of the performers has to the other, and how that information can be effectively transmitted to the AICP. These inquiries, once again, beg a broader question: what is interpretive performance, and how may we reproduce it under the constantly changing dynamics of live performance? To understand interpretive performance we need to begin to re-examine "event-based" data structures and their effectiveness for performance in the area of computer music. The time and pitch transients which define "gestural data structures" must be incorporated into a "performance model" that can produce effective, lifelike performances. Research, then, must continue to concentrate on parameters beyond the detection of what results as individual musical lines or parts, and focus on the coordination of those parts as they mesh with the composer's score. When we begin to speak of the "musicality of a performance," conversation usually turns away from the technical aspects of a performance (the mechanics of playing the instrument). We talk about "getting beyond the notes," but what does this really mean? If we are speaking of "interpretive performance" as simply the grouping of phrases, periods, sections, etc., in coordination with changes in tempo, then a simple segmentation of a composition based on tempo warping will be sufficient to model this activity.
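Such a segmentation-based tempo warp is easy to state in code. The sketch below is my own illustration under stated assumptions (phrase boundaries given in beats, one constant tempo per segment), not a model the AICP implements: the performance time of any score beat is the sum of the durations of the segments preceding it.

```c
/* Hypothetical sketch of tempo warping by segmentation.  seg_start[]
   holds nsegs+1 segment boundaries in score beats, ascending, ending
   with the total length; seg_bpm[] holds one tempo per segment.  The
   function maps a score position (in beats) to performance time. */
double beat_to_seconds(double beat, const double *seg_start,
                       const double *seg_bpm, int nsegs)
{
    double t = 0.0;
    for (int i = 0; i < nsegs; i++) {
        double lo = seg_start[i];
        double seg_end = seg_start[i + 1];
        double hi = beat < seg_end ? beat : seg_end;
        if (hi > lo)
            t += (hi - lo) * 60.0 / seg_bpm[i];  /* beats -> seconds */
        if (beat <= seg_end)
            break;
    }
    return t;
}
```

Grouping by phrase and warping each segment's tempo is exactly the "simple" model of interpretation described above; the limits of that model are the subject of what follows.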
Our belief is that we are talking about something much more complicated, and that as we multiply the number of people involved in a composition the problem becomes exacerbated. Accounting for the consequences of these actions on the performance actually given, and finding means by which this information can be related to the AICP, is a terribly difficult and tedious task. But these are issues which, for the most part, are overlooked by researchers, including ourselves, because of the effectiveness with which most tracking algorithms can be made to work. Returning to example 3, we note that each player pushes the other just a little bit further, trying to break through to the next climactic passage. These types of performance patterns are abundant throughout the music literature and depend on each note of the passage being interpreted with an individual nuance of its own. While we expect the flutist, as part of his/her recreative duty, to take on this responsibility, the part entrusted to the AICP can only be encoded with one set of parameters for playing this passage. These parameters will seem to be interpreted differently each time as the live flutist's reading of the passage changes, but the AICP's part will never really initiate an original interpretation of its own. It will never set a new tempo, rephrase a passage, initiate a ritard, etc.

Conclusion and Speculations

In the final analysis, tools for interactive performance such as the AICP are a major step forward in the collaboration between people and machines, but they lack many of the essential ingredients that go into a truly "musical performance." In order to get to the bottom of this problem we need to model the actual mental processes in question. When we can truly ask the correct questions about this process of interpretive interactive performance, we will get to the heart of what "musical knowledge" really is.
This use of musical knowledge is what allows composers to use the human/machine collaboration in a truly imaginative and creative way. For now we will be content to use the tools we have available to us and continue to be as resourceful as possible in programming them to make performances which are "acceptable." We continue to probe the questions within our minds for alternate ways of solving these problems, and in so doing are humbled by how much there is yet to know.

Bibliography

[1] Baird, B., Blevins, D. & Zahler, N. (1989). "The Artificially Intelligent Computer Performer." Proceedings: The Arts and Technology II (pp. 16-23). New London, Connecticut: Connecticut College Press.
[2] Baird, B., Blevins, D. & Zahler, N. (1990). "The Artificially Intelligent Computer Performer." Interface: International Journal of Interdisciplinary Studies, Vol. 19, No. 1, Amsterdam, The Netherlands.
[3] Baird, B., Blevins, D. & Zahler, N. (1990). "Artificial Intelligence and Music: Implementing an Interactive Performer." (unpublished manuscript) 1991.
[4] Baird, B., Blevins, D. & Zahler, N. (1989). "The Artificially Intelligent Computer Performer on the Macintosh II and a Pattern Matching Algorithm for Real-time Interactive Performance." Proceedings, 1989 International Computer Music Conference, 13-16.
[5] Bloch, J. J. & Dannenberg, R. B. (1985). "Real-time Computer Accompaniment of Keyboard Performances." Proceedings, 1985 International Computer Music Conference, 279-289.
[6] Boulanger, Richard. "Conducting the MIDI Orchestra, Part 2: Interviews with Noel Zahler, Roger Dannenberg, Gary Lee Nelson, and Todd Machover." Computer Music Journal, Fall 1991, M.I.T. Press, Cambridge, Massachusetts.
[7] Boulanger, Richard. "Conducting the MIDI Orchestra, Part 2: Interviews with Noel Zahler, Roger Dannenberg, Gary Lee Nelson, and Todd Machover." Computer Music Journal, Fall 1991, M.I.T. Press, Cambridge, Massachusetts.
[8] Dannenberg, R. B. (1984).
"An On-line Algorithm for Real-time Accompaniment." Proceedings, 1984 International Computer Music Conference, 193-198.
[9] Dannenberg, R. & Mukaino, H. (1988). "New Techniques for Enhanced Quality of Computer Accompaniment." Proceedings, 1988 International Computer Music Conference, 243-249.
[10] Roads, Curtis B. "Artificial Intelligence and Music." Computer Music Journal, Vol. 4, No. 2, Summer 1980, pp. 13-25.
[11] Strawn, J. "Approximation and Syntactic Analysis of Amplitude and Frequency Functions for Digital Sound Synthesis." Computer Music Journal, Vol. 4, No. 3, 1980, pp. 3-24.
[12] Vercoe, B. (1984). "The Synthetic Performer in the Context of Live Performance." Proceedings, 1984 International Computer Music Conference, 199-200.
[13] Vercoe, B. & Puckette, M. (1985). "The Synthetic Rehearsal: Training the Synthetic Performer." Proceedings, 1985 International Computer Music Conference, 275-278.