Circularity in neural computation and its application to musical composition

Georg Hajdu
Center of New Music and Audio Technologies (CNMAT)
University of California
1750 Arch Street, Berkeley 94709

ICMC Proceedings 1995

Abstract

Backpropagation neural networks and their simulations can be useful for composers and theorists. Yet their usefulness and general applicability depend mainly on the nature of the training data presented to the network. Certain musical parameters are easier to formalize than others. For parameters that exhibit a circular characteristic, such as pitch or key space, which have been represented as a winding helix (Shepard) or as a torus (Krumhansl & Kessler), trigonometric functions can provide the means to prepare the training sets.

Introduction

In this paper, I shall describe two environments in which the MAX multi-layer perceptron object, developed by Michael Lee at CNMAT, was used to control circular musical parameters.

Firstly, an interactive model of key space in which a particular location on the torus is transformed into tonal hierarchies. These hierarchies, in turn, serve as the basis for a stochastic process generating melodies in a given key. Movement on the surface of the torus leads to a change of tonal hierarchies, which is perceived as modulation.

Secondly, a compositional environment I used in my opera in progress, "Der Sprung." Motivated by the text, a set of networks was trained on different melodies and used for the subsequent melodic and metric interpolation process.

The perceptron training file format requires the configuration of the net to be specified, as well as the number of training patterns. A typical pattern includes data for both the input and the desired output associated with it.

Key Space

The most convenient way to establish a graphical model of the toroidal key space is to project it onto the plane of the computer screen:

[Figure: projection of the toroidal key space onto the plane, showing the major and minor keys and the dimensions D1, D2, d1 and d2.]

A unique position on the surface of the torus, usually represented in terms of two angles $\varphi_1$ and $\varphi_2$ with respect to a reference point (the lower left-hand corner), can be expressed as:

$$\varphi_1 = \frac{d_1}{D_1} \cdot 2\pi, \qquad \varphi_2 = \frac{d_2}{D_2} \cdot 2\pi$$
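The conversion from a screen position to angles can be sketched in a few lines of Python. This is only an illustration, not the MAX patch used for the model: the function names and the 400 x 300 map size are hypothetical. The second function anticipates the sine and cosine scaling into the 0 to 1 range that the text describes next, so that the circular behaviour becomes visible.

```python
import math

def position_to_angles(d1, d2, D1, D2):
    """Convert distances from the reference corner into the two torus angles (radians)."""
    phi1 = d1 / D1 * 2 * math.pi
    phi2 = d2 / D2 * 2 * math.pi
    return phi1, phi2

def angles_to_inputs(phi1, phi2):
    """Take sine and cosine of each angle and rescale from [-1, 1] to [0, 1]."""
    raw = (math.sin(phi1), math.cos(phi1), math.sin(phi2), math.cos(phi2))
    return tuple(v / 2 + 0.5 for v in raw)

# Hypothetical map of 400 x 300 screen units (an assumption for illustration).
left_edge = angles_to_inputs(*position_to_angles(0, 150, 400, 300))
right_edge = angles_to_inputs(*position_to_angles(400, 150, 400, 300))

# A point on the left edge and the matching point on the right edge
# yield the same four values (up to rounding), so the map wraps around.
print(left_edge)
print(right_edge)
```

Because each angle is represented by a sine and cosine pair rather than by the raw distance, opposite edges of the projected map produce identical network inputs, which is what makes the plane behave like a torus.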

Translating $\varphi_1$ and $\varphi_2$ into training data can be achieved by taking their sine and cosine,

$$x_1 = \sin\varphi_1, \quad x_2 = \cos\varphi_1, \quad y_1 = \sin\varphi_2, \quad y_2 = \cos\varphi_2,$$

and then dividing the variables by 2 and adding 0.5 to them to match the input range of the MAX perceptron object (0 to 1):

$$x'_1 = x_1/2 + 0.5, \quad x'_2 = x_2/2 + 0.5, \quad y'_1 = y_1/2 + 0.5, \quad y'_2 = y_2/2 + 0.5$$

According to Krumhansl & Kessler, a key area is characterized by a unique location on the torus and a particular pitch profile representing tonal hierarchies. For the key space model programmed in MAX, the perceptron (4 input units, 12 hidden units in 1 hidden layer, 12 output units) was trained on 24 patterns for the 24 major and minor key areas. Each pattern consists of 4 input values for the location of a particular key area and 12 output values for its respective profile (pitches in chromatic order). For example, the following is the pattern for C major:

Input:
x'1     x'2     y'1     y'2
0.2500  0.0669  0.2500  0.0669

Output:
c       c#      d       d#      e       f       f#      g       g#      a       a#      b
1.0000  0.0000  0.3033  0.0242  0.5218  0.4514  0.0703  0.7184  0.0387  0.3471  0.0144  0.1576

After the training period, the network performs the task of interpolating between the profiles (perceived as modulation).

Melodic interpolation

Melodic interpolation is far from being a trivial problem (Polansky). The intervallic space in which interpolation is performed is characterized by at least two distinctive properties: pitch distance (magnitude) and harmonic distance. While pitch distance is usually a function of the frequency ratio of the two fundamentals, harmonic distance is more complicated to formalize and depends a great deal on intuition and experience. I found that the circle of fifths represents a viable compromise between the two principles, in which the circularity of the chroma circle is preserved and harmonic relationships are allowed to come into play.
Enharmonic equivalence

The following method for melodic interpolation is pulse-oriented and suggests the use of two separate networks for the melodic and rhythmic domains. Theoretically, the number of possible "source melodies" can be as large as desired. Nevertheless, for the MAX perceptron object it should be kept between 2 and 12. Each training pattern for the "melodic" network consists of m input values serving as indices for the m melodies involved, e.g. 0. 1. 0. for the second of three melodies. The output values for the n pulses indicate the position of a particular pitch on the circle of fifths. For a closed circle of fifths with enharmonic equivalence, two values per pulse are necessary, as opposed to an open circle, where one value is sufficient. In the first case, one value is taken from the circle based on C, the other from the circle a tritone below (the distances are determined in clockwise direction):

$$a_{1,n} = \frac{D_C(T_1(n))}{12}, \qquad b_{1,n} = \frac{D_{G\flat}(T_1(n))}{12}$$

$$a_{2,n} = \frac{D_C(T_2(n))}{12}, \qquad b_{2,n} = \frac{D_{G\flat}(T_2(n))}{12}$$

Here $D_k(l)$ is the distance of pitch class $l$ on the clockwise circle of fifths with respect to pitch class $k$, e.g. $D_C(G) = 1$, and $T_m(n)$ is the pitch class of melody $m$ for pulse $n$. An additional value $c_{m,n}$ would be required if octave register were also taken into consideration.

After interpolation, the resulting values $(a_{1,2,n}, b_{1,2,n})$ are subject to further interpretation. The smaller of the two values is sampled and an offset is added to it ($\mathrm{offset}_a = 0$, $\mathrm{offset}_b = 0.5$):

$$v = \begin{cases} a_{1,2,n} & \text{if } a_{1,2,n} < b_{1,2,n} \\ b_{1,2,n} + 0.5 & \text{if } a_{1,2,n} \geq b_{1,2,n} \end{cases}$$

The pitch class is determined by

$$T_{1,2}(n) = I_C\big(\mathrm{Int}(12v + 0.5) \bmod 12\big),$$

where $I_k$ is the inverse function of $D_k(l)$, e.g. $I_C(1) = G$.
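The encoding and decoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's MAX implementation: pitch classes are numbered chromatically with C = 0, the function names are hypothetical, and a plain linear blend stands in for the output of the trained network.

```python
def fifths_distance(pitch_class, reference):
    """D_k(l): clockwise steps on the circle of fifths from reference k to pitch class l.

    Assumes chromatic numbering (C = 0, C# = 1, ...). A fifth is 7 semitones,
    and 7 is its own inverse modulo 12, so multiplying by 7 maps semitones to fifths.
    """
    return (7 * (pitch_class - reference)) % 12

def encode(pitch_class):
    """Return (a, b): position on the circle based on C and on the circle based on Gb."""
    a = fifths_distance(pitch_class, 0) / 12
    b = fifths_distance(pitch_class, 6) / 12
    return a, b

def decode(a, b):
    """Sample the smaller value, add its offset, round, and invert D_C."""
    v = a if a < b else b + 0.5
    steps = int(v * 12 + 0.5) % 12
    return (7 * steps) % 12  # I_C: steps on the circle of fifths back to a pitch class

def blend(p, q, weight):
    """Linear blend of two (a, b) pairs; a stand-in for the network's interpolation."""
    return tuple((1 - weight) * x + weight * y for x, y in zip(p, q))

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Halfway between C and D on the circle of fifths lies G.
midpoint = decode(*blend(encode(0), encode(2), 0.5))
print(NAMES[midpoint])
```

Encoding followed by decoding returns the original pitch class for all twelve pitch classes, because the b value with its offset of 0.5 describes the same position as the a value, measured from the opposite side of the circle.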

Der Sprung

In my opera Der Sprung, enharmonic equivalence was not a desired feature. Therefore, the $b_{m,n}$ values could be dispensed with. Instead, I used an open progression of fifths spanning from eb to g#. The following measure shows an example of how a short motif was translated into training data (scene 3, measure 117, saxophone):

[Musical example: saxophone motif, scene 3, measure 117.]

The coding for pitch on a 16th-note quantization level (the network consisted of 3 input units, 3 hidden units in 1 hidden layer, 24 output units) is as follows:

1. 0. 0.
0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.42 0.42 0.42 0.75 0.50 0.42 0.17 0.17 0.17 0.17 0.17 0.17 0.75 0.75

Rhythm was dealt with in an analogous way:

1. 0. 0.
0.00 0.00 0.00 0.00 1.00 0.50 0.50 0.50 0.50 0.50 1.00 0.50 1.00 1.00 1.00 1.00 1.00 0.50 0.50 0.50 0.50 0.50 1.00 0.50

Here, the values starting after the third number indicate the sound status $s_{m,n}$ of a particular pulse: $s_{m,n} = 1$ if the pulse is attacked, $0.5$ if the pulse is sustained, and $0$ if the pulse is silent. After interpolation, an intermediate value (0.33 to 0.67) can still lead to an attack if it is preceded by a value smaller than 0.33.

The actual interpolation involved three melodies (H, S and R) located at equal distance in the corners of a triangular melodic space and was performed over a total period of three minutes. The following graph shows the interpolation path at a resolution of 1000 sample points. For the graphical plot, another network was trained to translate the input data into spatial coordinates.

[Graph: interpolation path through the triangular melodic space with corners H, S and R.]

References

Hajdu, G. (1994). Der Sprung - First Act. Ph.D. Thesis, UC Berkeley.

Krumhansl, C.L. & Kessler, E.J. (1982). Tracing the dynamic changes in perceived tonal organization in a spatial representation of musical key. Psychological Review, 89, 334-68.

Krumhansl, C. (1990). Cognitive foundations of musical pitch. Oxford University Press.

Lee, M., Freed, A. & Wessel, D. (1991). Real-time neural network processing of gestural and acoustic signals. Proceedings of the ICMC (Montreal).

Polansky, L. (1992). More on morphological mutation functions: recent techniques and developments. Proceedings of the ICMC (San Jose), 5-60.