CREATING VON FOERSTER'S NON-TRIVIAL MACHINE IN DIGITAL AUDIO

Dr. Arun Chandra
The Evergreen State College
Performing Arts

ABSTRACT

This paper describes an implementation of von Foerster's non-trivial machine in software, for the generation of simultaneous waveforms that modify their own structure based on the current structure of their neighbors. This research creates a contemporary version of counterpoint, but one that takes advantage of the computer's ability to avoid notes. Thus, instead of "note-against-note" behavior (as was the case with 16th and 17th century counterpoint), this research explores "wave-against-wave" behavior. The construction of each waveform allows for the linking of square, triangular and curved elements in any combination. Thus, a variety of timbres is the structural output of the waveform. Each element in each waveform has from six to twelve parameters, and on every iteration each parameter is linearly changing. So, the frequency, amplitude and timbre of the waveform are gradually shifting. Sudden change can occur if neighboring waveforms reach defined thresholds.

Keywords: Sound synthesis, Aesthetics, Algorithmic Composition, Software Systems.

1. AESTHETICS AND PREMISES

Lejaren Hiller's work in the 1950s and 1960s opened the doors for composers interested in using the computer as an assistant in the task of musical composition [6]. As a composer, he used a wide variety of styles, techniques, and genres in his work: tonality, aleatoric methods, serialism, quotation and montage are all present in his considerable output. When he used the computer, he used it to compose in ways that would not have been possible without it: he did not use his knowledge of the aesthetics of his past as criteria to determine the compositions of his future. To this extent, composers are still looking at the possibilities of using a computer for the creation of those compositions that could not have been made in any other manner.
As Martin Supper has written, "Computer music is music that cannot be created without the use of computers." [10] Likewise for the aesthetic challenge: can a work of art be created whose aesthetics are as radical in their outcome as the technology attempts to be? It is understandable, and thus for me regrettable, that a pre-technological aesthetics permeates a post-technological artistic world. Using the most contemporary technology, composers reproduce compositional aesthetics of the 19th and early 20th century.

Herbert Brün, in his book Über Musik und zum Computer [3], quoted a letter from Mozart to his father Leopold, on the occasion of Leopold's birthday, in which Mozart wishes him to live as long as "... there is no more to be made from music that is New." Brün says that this shows how much Mozart understood and expected that what is "New" is undiminishable, and that musicians will constantly be deriving from "music" that which is new.

That is what this project aims at: Can the computer be used to compose sets of relationships between waveforms, and introduce the idea of counter-wave rather than counter-point? For, as is well known, the point-against-point rule structure of 16th and 17th century counterpoint (on which Hiller's Illiac string quartet was based) was premised on the idea of the note. Since we don't have to have "notes" with a computer, what would that mean for coetaneous relationships between voices? How should they be structured?

1.1. Structural Ethics.

The need for "artificial intelligence" and its hope lies in the creation of an intelligence that does not only do what humans can do without it. Can we posit a thoughtful strategy that is desirable and not-yet-human? Heinz von Foerster suggested in his paper From Stimulus to Symbol: The Economy of Biological Computation that

Subjects S1 and S2 are coupled to their common environment E....
[E]ach of these subjects is confronted with the additional complication of seeing his environment populated with at least one other subject that generates events in the environment E. Hence S2 sees, in addition to the events generated by E, those generated by S1, and since these take place in E, they shall be labeled E1; conversely, subject S1 sees in addition to events generated by E those generated by S2 which will be called E2. Thus, in spite of the fact that both S1 and S2 are immersed in the same environment E, each of these subjects sees a different environment, namely, S1 has to cope with (E, E2) and S2 with (E, E1). In other words, this situation is asymmetrical regarding the two subjects, with E being the only symmetrical part. [12]

Figure 1. Structure of subjects S1 and S2 sharing environment E.

Thus there is a closed but asymmetrical set of interactions, in which S1 and S2 are not directly changed by each other, but change themselves in the presence of a common but not shared universe. S1 sees the environment E1, and S2 sees E2, and based on what each sees, each chooses its path. What would happen if sound were an isomorph of the above structure?

1.2. The Aesthetic Imperative.

Information Theory [9] has shown the relationship between the channel of communication and the message being communicated. The teleology of the artistic process results in the aesthetic imperative, and is the link between the artwork and its social function. Since the structure of any language determines the possible messages that can be sent with it, the social imperative is then not to abandon language (which would be an "asocial" imperative) but rather to create such messages as resist and overflow the limitations of their communicative channels, thus linking the social imperative with an aesthetic one.

By way of example from the 18th century, in composing multi-voiced compositions for single-voice instruments such as the violin and the cello, J.S. Bach mapped a structure of musical thought onto an instrumental structure that resisted it. In a sense, he did not write for the instrument but against it: his musical idea transformed the possibilities of the instrument. Likewise, his multi-voiced works for plucked-keyboard instruments posed a challenging compositional project: can a successful multi-voiced texture be created using an instrument tightly constrained with regard to its range of timbre, volume, and articulation? All distinctions between voices then have to be made using only tessitura, rhythm, and duration.
Art is a fertile ground for the exploration of new languages, and such art needs an audience that is also looking for something beyond the limits of its languages: an audience that seeks messages that cannot be sent using the currently existing languages, and that recognizes the need for sending and receiving messages that have not been tamed by language itself. Art is a defense against the desperate need to make sense.

1.3. Towards Counterwave.

Counterwave is a program that allows the composer to explore the composition of relationships between waveforms. It is intended as a workshop tool, a virtual machine whose output is not necessarily a composition. In this, I take up an idea of Curtis Roads, who wrote that a digital computer is "... a system for creating virtual worlds, an ideal testbed for compositional experimentation and invention." [8] Counterwave allows a composer to specify the structure of a waveform, the speed at which it changes, and how it can modify its structure based on the coetaneous states of its neighboring waveforms.

1.4. A non-trivial machine.

Heinz von Foerster invented the notion of a "non-trivial machine", referring to the "well-defined properties of an abstract entity", and not necessarily something with "wheels and cogs". The salient aspect of a non-trivial machine is that its "input-output relationship is not invariant, but is determined by the machine's previous output. In other words, its previous steps determine its present reactions." [12] When non-trivial machines are run simultaneously, they can be so constructed as to determine their future state in relation to the current state of their neighboring machines. Counterwave is an attempt to create a machine that changes its internal properties based on its observation of the current properties of its neighboring machines.

1.5. Structural and musical coordination.

The musical context for the idea of the "non-trivial machine" comes from the exploration of ideas of simultaneity and parallel behaviors.
Since, with the computer, a composer is not required to use the structural constraint of the note, the old idea of counterpoint is no longer sufficient for describing systems of coordinations. (As an aside, György Ligeti's idea of micro-polyphony is a wonderful technique for its emergent behaviors, which are not those of clearly delineated multiple voices.) Instead, the coordination of behaviors of separate voices is the result of "polling" (or "observing"), and of the machine changing its structure on the basis of what was "observed". Thus, parallel voices have their independence from one another, but only in a system of tight coordinations. This system is a model of one in which its members do not change each other, but rather, based on observation and self-observation, change themselves based on their own, mutually independent rules.
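The distinction drawn in Sections 1.4 and 1.5 can be made concrete with a small sketch in C. It is only an illustration under stated assumptions, not Counterwave's code: the `Machine` layout, the `gain` rule, and the threshold of 10 are all hypothetical. It contrasts a trivial machine, whose mapping never changes, with a non-trivial one whose previous output rewrites its rule, and it shows polling as a read-only observation of a neighbor followed by self-modification.

```c
#include <assert.h>
#include <stdlib.h>

/* A trivial machine: the same input always yields the same output. */
int trivial_step(int input)
{
    return 2 * input;
}

/* A non-trivial machine (hypothetical layout): its input-output rule
 * is rewritten by its own previous output. */
typedef struct {
    int gain;         /* current rule: output = gain * input */
    int last_output;  /* previous output, readable by neighbors */
} Machine;

/* Produce an output, then let that output choose the next rule. */
int machine_step(Machine *m, int input)
{
    int output;
    assert(m != NULL);
    output = m->gain * input;
    m->gain = (abs(output) > 10) ? 1 : 3;  /* arbitrary illustrative rule */
    m->last_output = output;
    return output;
}

/* Polling: the observer reads a neighbor's current state and changes
 * only itself. The neighbor is const: it is observed, never modified. */
void machine_observe(Machine *self, const Machine *neighbor)
{
    assert(self != NULL && neighbor != NULL);
    if (abs(neighbor->last_output) > abs(self->last_output))
        self->gain += 1;
}
```

Feeding the same input twice to `machine_step` yields two different outputs, whereas `trivial_step` is invariant. The `const` qualifier on the neighbor expresses in the type system that members of the system do not change each other.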

2. COUNTERWAVE

Counterwave allows for the creation of systems wherein up to eight "voices" can interact with each other. There is nothing inherent in the program that would disallow a greater number of voices; however, with eight voices, there are a total of 28 pairs, 56 trios, 70 quartets, 56 quintets, 28 sextets, 8 septets, and 1 octet, which seems to be a sufficient amount of variety to begin with. These values were determined by the number of combinations that are possible without repetition, meaning:

$$\frac{n!}{r! \times (n - r)!} \qquad (1)$$

where n = 8, and r = {2, 3, 4, 5, 6, 7, 8}.

2.1. Construction of a "voice".

A "voice" consists of a sequence of elements, and an element can be one of the following shapes: square, triangular, or curved. A voice can consist of all the same type of elements, or of different elements. This sequence of elements is iterated, and each iteration of a sequence is considered to be a state of the sequence. At each iteration, each element changes in different ways, depending on the element.

2.2. Constructing an Element.

For each element, the minimum and maximum limits for its duration and amplitude are specified. "Duration" (D) is specified in samples, and "amplitude" (A) is specified as a 16-bit integer in the range ±32767. Each element changes its D by dx and its A by ax on each iteration:

$$\text{square} \begin{bmatrix} D(dx, d_{max}, d_{min}) \\ A(ax, a_{max}, a_{min}) \end{bmatrix} \qquad (2)$$

If the element is triangular or circular, there are two more variables: the amplitude P and the location L of the "peak", their minimum and maximum limits, and the amount by which they change on each iteration:

$$\text{triangular or circular} \begin{bmatrix} D(dx, d_{max}, d_{min}) \\ A(ax, a_{max}, a_{min}) \\ L(lx, l_{max}, l_{min}) \\ P(px, p_{max}, p_{min}) \end{bmatrix} \qquad (3)$$

Each variable D, A (and L, P if appropriate) changes by its x on each iteration. Upon reaching its limit, x changes to -x. Each variable thus oscillates between its minimum and maximum value, creating a cycle length of:

$$N_{iterations} = \frac{2 \times (max - min)}{x} \qquad (4)$$

Thus, each element can have up to four variables, and each variable oscillates with an independent cycle length.

2.3. "Voice": A Sequence of Elements.

A "voice" is defined by specifying a set of elements, as above. These elements are placed in a sequence. On every iteration of the sequence, each element's variables change by their respective x values. Figure 2 is an example of a snapshot of a "voice" taken at states 100 and 600. As you can see, small changes in each element's variables cause significant transformations of the "voice" over time. Figure 3 shows plots of FFTs taken of the states shown in Figure 2. They show that not only does the shape of the waveform change over time, but its frequency content changes as well.

Figure 2. States 100 and 600 of a "voice". (Horizontal axis: duration in samples; sampling rate: 44100 samples per second. State 600: length 260 samples, frequency 169.615 Hz, maximum displacement 88.98 dB.)

3. EXAMPLES OF "VOICES"

3.1. Voices using "Square" elements.

Figure 2 shows a voice consisting of eight "square" elements in its 100th and 600th iterations, and Figure 3 shows the FFTs of those iterations. It should be apparent that the frequency spectrum has changed a great deal, along with the overall amplitude and fundamental frequency.
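The oscillation of an element's variables described in Section 2.2, and the dependence of a state's fundamental frequency on its length, can be sketched in C as follows. This is a minimal sketch under assumptions, not the program's source: the type and function names are hypothetical, values are clamped at their limits, and the fundamental is assumed to be the sampling rate divided by the state length (which agrees with a 260-sample state at 44100 samples per second giving about 169.6 Hz).

```c
#include <assert.h>
#include <stddef.h>

/* One oscillating variable of an element (D, A, L or P).
 * All names here are hypothetical. */
typedef struct {
    int value;  /* current value */
    int step;   /* signed change per iteration (the paper's x) */
    int min;    /* lower limit */
    int max;    /* upper limit */
} Variable;

/* Advance one iteration; on reaching a limit, x changes to -x.
 * Assumption: the value is clamped to the limit it reaches. */
void variable_iterate(Variable *v)
{
    v->value += v->step;
    if (v->value >= v->max) {
        v->value = v->max;
        v->step = -v->step;
    } else if (v->value <= v->min) {
        v->value = v->min;
        v->step = -v->step;
    }
}

/* Cycle length from equation (4): 2 * (max - min) / x.
 * Exact when x divides (max - min); integer division otherwise. */
int variable_cycle_length(const Variable *v)
{
    int x = (v->step < 0) ? -v->step : v->step;
    assert(x > 0 && v->max > v->min);
    return 2 * (v->max - v->min) / x;
}

/* A "square" element has two variables: duration and amplitude. */
typedef struct {
    Variable duration;   /* D, in samples */
    Variable amplitude;  /* A, 16-bit range */
} SquareElement;

/* Length of one state: the sum of its elements' current durations. */
int voice_state_length(const SquareElement *elements, size_t count)
{
    int total = 0;
    size_t i;
    for (i = 0; i < count; i++)
        total += elements[i].duration.value;
    return total;
}

/* Assumed relation: fundamental = sampling rate / state length. */
double state_frequency(int length_in_samples, double sampling_rate)
{
    assert(length_in_samples > 0);
    return sampling_rate / (double)length_in_samples;
}
```

For a variable starting at 0 with limits 0 and 10 and a step of 2, `variable_cycle_length` gives 10, and ten calls to `variable_iterate` return the variable to its starting value and direction. Because each variable carries its own step and limits, the cycle lengths within one element are independent of one another.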

Figure 3. FFTs of the "voice" shown in Figure 2. (Horizontal axis: frequency in Hertz; FFT window size: 256.)

3.2. Voices using "Triangular" elements.

Figure 4 shows examples of "triangular" waveforms, at their 300th and 900th iterations, followed by their FFTs. Note the increasing density of higher frequencies in the spectra as compared to the "square" waveforms above.

Figure 4. Two states of a "triangular" voice, with FFTs. (States 300 and 900; horizontal axes: samples, and frequencies in Hertz; FFT size: 8192; frequency resolution: 5 Hz; peak frequency of state 900: 350 Hz.)

3.3. Voices using "Circular" elements.

"Circular" waveforms are similar to the "triangular" ones in terms of their construction, except that the "sides" of the triangle are "curved". The curves are calculated by specifying two points (x1, y1), (x2, y2) and a point between them (x3, y3). The points (x1, y1) and (x3, y3) are connected by a curve, and (x3, y3) and (x2, y2) are connected by a second curve. Figure 5 shows "circular" waveforms in their 200th and 600th iterations. Predictably, their FFTs show that these "circular" elements have far fewer high frequency components in their spectra as compared to the "triangular" or "square" elements.

4. AUTONOMY AND SELF-MODIFICATION

A voice can modify any or all of its parameters, depending on what its neighboring voices are doing at a given time. Each "voice" records aspects of its current state in a "window" variable, accessible by all other voices. This "window" contains:

* its current minimum and maximum amplitudes
* its current amplitude range
* its current duration (the sum of its elements' durations)
* its current number of "sounding" elements
* whether or not it is "silent".
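The "window" listed above maps naturally onto a C structure. The following is a sketch only: the field and function names are hypothetical, and it assumes that an element counts as "sounding" when its current amplitude is non-zero and that a voice is "silent" when none of its elements sound. Neither definition is given in the text.

```c
#include <assert.h>
#include <stddef.h>

/* Snapshot of one element's current values (hypothetical layout). */
typedef struct {
    int duration;   /* in samples */
    int amplitude;  /* 16-bit range, may be negative */
} ElementState;

/* The "window" a voice publishes for the other voices to read. */
typedef struct {
    int min_amplitude;
    int max_amplitude;
    int amplitude_range;    /* max_amplitude - min_amplitude */
    int duration;           /* sum of the elements' durations */
    int sounding_elements;  /* assumption: amplitude != 0 */
    int silent;             /* 1 if no element is sounding */
} Window;

/* Build a voice's window from the current state of its elements. */
Window window_publish(const ElementState *elements, size_t count)
{
    Window w = {0, 0, 0, 0, 0, 1};
    size_t i;
    assert(elements != NULL || count == 0);
    for (i = 0; i < count; i++) {
        int a = elements[i].amplitude;
        if (i == 0 || a < w.min_amplitude) w.min_amplitude = a;
        if (i == 0 || a > w.max_amplitude) w.max_amplitude = a;
        w.duration += elements[i].duration;
        if (a != 0) w.sounding_elements++;
    }
    w.amplitude_range = w.max_amplitude - w.min_amplitude;
    w.silent = (w.sounding_elements == 0);
    return w;
}

/* Polling: count the other voices that are currently sounding.
 * The windows are only read, never written, by the observer. */
int count_sounding_neighbors(const Window *windows, size_t count,
                             size_t self)
{
    int n = 0;
    size_t i;
    for (i = 0; i < count; i++)
        if (i != self && !windows[i].silent)
            n++;
    return n;
}
```

A voice would call `window_publish` after updating its elements, and read its neighbors' windows (for instance through `count_sounding_neighbors`) before deciding whether to alter itself. Keeping the windows read-only for observers preserves the property that voices change themselves rather than each other.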

Figure 5. "Circular" waveforms and their FFTs. (States 200 and 600; state 600: length 181 samples, 8 segments; horizontal axes: samples, and frequencies in Hertz; FFT size: 8192; frequency resolution: 5 Hz; peak frequencies: 695 Hz for state 200 and 5 Hz for state 600.)

Upon beginning each iteration, each voice checks the windows of all other voices. After reading this information, a voice can either ignore it, or act on it. If ignored, the voice continues its configuration of changes undisturbed: it retains its autonomy. If not ignored, the voice can:

* go from sounding to silent, or
* change the "type" for some or all of its elements (i.e., change from square to triangular, or from triangular to circular, etc.), or
* change the duration variables for some or all of its elements, or
* change the amplitude variables for some or all of its elements, or
* ignore its neighbors for a specified amount of time, and continue with its own set of transformations.

Every change made to any of its variables may, instantaneously, change the cycle length for that variable. The criteria used to decide whether to change a voice's parameters can be relatively simple, such as:

if x other voices are present, change some parameters

or variations on it. The criteria can also be more affectionate:

* jealousy: if another voice's dynamic range is greater, match its dynamic range.
* revulsion: if another voice's limits are within x, change all limits by a factor of y.
* me-too: if another voice's rates of change are faster, increase rates by a factor of x.
* stubborn: if any other voices are sounding, repeat the current state.
* shy: if any other voices are present, remain silent.
* loud-mouth: if any other voices are present, make the dynamic range greater than the largest of the others.

5. OBSERVATIONS ON RESULTS

1. Relatively prime cycle lengths generate richer harmonic fields than cycle lengths that are multiples of each other.

2. When the length of a state generates a fundamental that is below 20 Hz (sub-audio), the resulting sounds resemble those of heavy machinery.

3. A state can generate a temporary steady pitch if the rates of change of its elements' durations add up to zero, or close to it. We hear the repetition of the state's length (or near repetition) as a steady pitch.

4. When adjacent elements have amplitude relationships that are less than 60 dB apart, their movement relative to each other seems to have no consequence on the resulting spectrum, although the lengths of the segments have a consequence on the fundamental frequency of the state. There ought to be an intelligent way to address this relationship.

5. The flexibility in sequence specification for elements, and the possibility of element repetition within a state, were introduced on the hypothesis that the parallel movement of separated elements could generate a second "fundamental" frequency, i.e., an overtone whose amplitude was as strong as the fundamental. This did not happen. Occasions have occurred of radical shifts in timbral presence (as I hope are hinted at by the above FFT plots), but I cannot yet successfully predict them.

6. Currently, there is no reasonable way to organize the data controlling the relationships between the voices. Changes in the relationships require modification of the source code. A useful organization would allow for greater flexibility of experimentation.

6. NEXT STEPS

An aspect of von Foerster's "non-trivial machine" that has not yet been addressed is the ability of a voice to store its sequence of changes, and use that sequence as a further determinant of future behavior. This would be a variant of the old "Markoff-chain" algorithm. For example, there could be a counter (perhaps called "fed-up") that might determine the number of passes during which a voice ignores other behaviors before initiating a change of its own. Or, perhaps, there could be some type of history "filter" that could change the significance of the perception of other voices. In short, von Foerster suggests that the non-trivial machine should change itself as a result of auto-observation: currently it does so as a result of allo-observation.

7. ACKNOWLEDGMENTS

I'm grateful to Herbert Brün for his insights into music composition and the potential of technology.
The fundamental ideas for structuring waveforms come from his waveform-synthesis program Sawdust. Dr. Jerry Keiper and Robert Naiman (both formerly of Wolfram Research) generously contributed their skills in mathematics and numerical analysis, and patiently answered the many questions I had on implementation. All plots in this article were generated with gnuplot. gnuplot ran as a child process under Counterwave, plotted the data, and converted the output to PostScript. The FFT algorithms used for generating the plots were taken from Numerical Recipes in C. [7] Counterwave is written in standard C, using PortAudio for sound playback, and runs under OS X, Windows XP, and Linux systems. It is available from the author at

8. REFERENCES

[1] Ashby, W. Ross. (1956). An Introduction to Cybernetics. London: Methuen and Company, Ltd.
[2] Brün, Herbert. (2004). When Music Resists Meaning. Middletown, Conn.: Wesleyan University Press.
[3] Brün, Herbert. (1971). Über Musik und zum Computer. Karlsruhe, Germany: G. Braun Verlag.
[4] Chandra, Arun. (1994). "Linear change of waveform segments causing non-linear changes of timbral presence." Contemporary Music Review, Vol. 10, Part 2. Harwood Academic Publishers: September, 1994.
[5] Chandra, Arun. (2002). "Sequential Waveform Composition". In Miranda, Eduardo. Computer Sound Design: Synthesis Techniques and Programming. Oxford (UK), Boston (USA): Focal Press.
[6] Hiller, Lejaren A., and Isaacson, Leonard. (1979). Experimental Music: Composition with an Electronic Computer. Westport, Connecticut: Greenwood Press.
[7] Press, William H., et al. (1988). Numerical Recipes in C: The Art of Scientific Computing. New York: Cambridge University Press.
[8] Roads, Curtis. (2001). Microsound. Cambridge, Mass.: MIT Press.
[9] Shannon, Claude and Weaver, Warren. (1963). The Mathematical Theory of Communication. Urbana, IL: University of Illinois Press.
[10] Supper, Martin. (2001). "A Few Remarks on Algorithmic Composition" in Computer Music Journal 25.1, pp. 48-53.
[11] von Foerster, Heinz. (1984). Observing Systems. Seaside, California: Intersystems Publications.
[12] von Foerster, Heinz. (2003). Understanding Understanding: Essays on Cybernetics and Cognition. New York: Springer.