Page  214 ï~~AUDIFICATION: The Use of Sound to Display Multivariate Data Gregory Kramer and Stephen Ellison Clarity/Santa Fe Institute Nelson Lane, Garrison NY 10524 914-424-4071 [email protected] Abstract: Audification, the technique of using sound to display multidimensional data, is described and a brief history is given. Sonic parameters of auditory data display are described. Parameter nesting, the use of basic sound variables on different time scales, is introduced. Our audification programming environment and its use in comprehending a 9-dimensional dynamical system is described. Related aesthetic, perceptual and practical issues are raised. Audification is the use of sound to display multi-dimensional data1. This technique was first introduced nearly 40 years ago but has seen little development2. 3-D graphics displays are frequently used in the attempt to comprehend multiple data dimensions. Such displays may represent as many as 8 dimensions (variables) simultaneously in one image3. Representation of more dimensions via audification would be a desirable outcome of future research. In addition, higher dimensional systems may have dynamic relationships between the variables that are too complex to understand via visual displays alone. Audification may bring comprehensibility to such higher dimensional dynamic systems. This technique enables the data analyst to employ the substantial pattern recognition capabilities of the auditory channel4. Background In the early 1950's Pollack and Ficks5 published a paper on the use of sound to display data which used a simple binary display technique. They took eight variables and had the test subjects determine whether each variable was in one of two states, eg. loud or soft, long or short, etc. They concluded that this was an effective technique for conveying data but that "extreme subdivisions of each stimulus dimension does not appear warranted." Later works by E. Yeung6 and S. Bly7 have since explored different techniques, including continuous variation of audible parameters. For an overview of work done to date, the reader is referred to S. Frysinger's excellent paper8. Basic Audification Variables A simple example of a direct audification translation might include the use of amplitude, frequency, attack time, and so on to represent system variables. Additionally, multiple timbre variables9 and spatial location may be used to encode data10. To represent higher dimensions, more complexity is required of the sound. This complexity may be obtained by.various means, for example by creating parallel audio streams (polyphony) and by generating a single sound stream with many levels and types of parameter variability. We have been using increasingly complex variables and the manipulation of variables on different time scales to achieve high dimensional modeling tools. Parameter Nesting We developed the concept of parameter nesting1 to achieve, in an orderly manner, these higher dimensions. Parameter nesting is the use of basic sonic variables on different time scales simultaneously. It enables us to wring 4, 5 or more dimensions out of the most basic variables, such as amplitude and frequency. ICMC 214

Page  215 ï~~Amplitude Nesting: There are five levels of amplitude nesting described. They may be summarized as speed, duration, envelope (instantaneous amplitude), cluster speed and master amplitude. Pulse Speed: If the signal is broken into discreet 'packages' by periodically reducing the amplitude to zero, a pulsing sound is the result. The speed of this pulsing has been found to be a useful variable. Duration: The length of time the sound package is sustained at full amplitude will be referred to as duration. In the case of minimum attack and decay (below) this may be understood as the duty cycle of the controlling waveform. Envelope (instantaneous amplitude): The attack and decay times of the pulsed signal are referred to as the envelope. By manipulating these rates one or two more dimensions can be represented. (Note 1: Instantaneous amplitude may also be employed for a tremolo effect, where a low frequency waveform is used to control the amplitude of the sound.) (Note 2: Instantaneous amplitude may technically refer to the amplitude of the waveform at the sample level. We refer here to instantaneous R.MS amplitude.) Cluster Speed: If a second amplitude waveform is imposed upon the first, a series of fade-ins and fade-outs of the pulsed sound will be heard. The period from the onset of one 'packet' to the onset of the next will be referred to as master speed. The useful range for cluster speed is approximately 1 Hz to 1 cycle every 4 seconds (.25 Hz). (Note: The attack and decay shape of these packets have also been employed as audification variables and found to be of some use.) Master Amplitude: The peak RMS amplitude of the master speed packets will be referred to as master amplitude. Pulse Duration Master vaials il ie smai de f h i oud.Exmle icld:laserp tch eirat Pule "::J%" A Cluter Sp d- ' [ Cluster lDuration -- Figure 1. Amplitude Nesting Parameters Frequency Nesting, where frequency is controlled within several ranges and time scales, may be understood in a like manner. Musical descriptions and implementation techniques of these variables will give a summary' idea of their sound. Examples include: Master pitch, vibrato depth, vibrato speed, vibrato waveshape, degree of quantization of pitch changes, etc. These same types of control variables can be applied to timbre dimensions such as brightness. For example, a control signal may be applied to brightness and the depth, speed and waveshape of the brightness control signal may then be manipulated. Additionally, the types of parameters accessible in commercially available digital signal processors have proven to be of some use. These might include reverberation time, 'flange' depth, speed and waveshape, resonance, and so on. Usefulness of the Paramneters - Parameter Overlap We have found that while cort~ain parameters may be audible, from a practical standpoint they either directly intefere with the perception of other variables or they simply distract fr-om this ICMC 215

Page  216 ï~~perception. The case of reverberation provides one example of a parameter that can mask the accuracy of many others. When reverb time is high, attack and decay times employed in amplitude nesting are smeared, as is brightness and, An examp with our 1 our other tions caus the subjec scale varh ceptualte mine their Also, sinc, different t -. lappe)d th4c effected di comes too Conversel definition._--" -.._- \.-- between p comes slo' A carefull Figure 2. 9-D Lorenz Data advantage model. F model in which independent variables describe the amount of vegetation and oxygen in the atnosphere. In such a system, when the amount of vegetation goes to 0, the amount of oxygen will also go to 0. In this case, oxygen can be mapped to cluster speed and vegetation can be mapped to pulse speed. When the amount of vegetation is small, pulse speed will decrease, as will cluster speed. Similarly, when there is a large amount of vegetation, fast, dense clusters of pulses will result. A Sample Audification: 9-Dimensional Lorenz Equations We have been using audification to contribute to our ability to comprehend complex dynamical systems. Working with scientists from the Santa Fe Institute, a non-profit think tank whose work is focused on the "Sciences of Complexity", audification has been applied to a set of nine equations described by Edward Lorenz". The thrust of this e cases, many other parameters are obscur)le of distraction was found unexpectedly use of amplitude nesting clusters. While variables could be perceived, the interrup-ed by the cluster envelopes interfered with:t's concentration on the other, shorter time ables. It will be interesting to perform persts on these parameters in order to deterr usefulness and interactions. e we are using the same audible variable on time scales, when these time scales are overere is a loss of clarity of one or both of the imensions. For example, if pulse speed beslow, then cluster speed looses definition. y, as cluster speed approaches pulse speed, is reduced. Such interferences also occur parameters. For example, if pulse speed bewer than vibrato speed, perception of the viiipaired. y designed audification system may take of parameter overlap to add meaning to a or instance, consider an environmental O-DIm Loronz: El 1N1no Figure 3. Lorenz Data- Mapped Pairs ICMC 216

Page  217 ï~~work has been twofold: (1) To aid in the investigation of chaotic phenomena, such as weather systems, which are frequently represented by such equations and; (2) To explore the practicality of our audification techniques with this multivariate data. The data was provided to us by Dr. Gottfried Mayer-Kress of SFI and the Center for NonLinear Studies at Los Alamos. The Lorenz equations were mapped to the following audification parameters: speed; attack time; duration; detune (of two identical sounds played simultaneously); brightness; flange (the 'send' level to a commercial DSP); pan; pitch; and amplitude. The 9 dimensions are displayed in graphic form in figures 2 and 3. It is clear from these displays that any tools are welcome which help us perceive meaning in this data. Description of an Audification System Introduction The Audification Programming Environment ("APE") is a flexible toolkit with which we develop and test our Audification techniques. APE was designed and developed for the Apple Macintosh with Max12. Sound is generated by the Digidesign SampleCell card, and processed with a Lexicon PCM70. We have created a palette of audification tools which are configured into systems. The data which drives the system is either generated within Max, read in from a text file, or received in real-time via serial communications or concurrent applications such as HyperCard. The size and complexity of these systems are restricted by memory and CPU type and speed of the Macintosh. Simple systems have been run on the Macintosh SE, however, most useful systems require a Macintosh II series computer. Audification Systems Design Since we are interested in developing a large number of audification techniques and applying these to many different types of data sources, we needed to design a system with sufficient flexibility and expansion potential. Every audification system consists of between 4 and 11 tools configured into systems. These tools are Max patcher objects, and consist of data sources, maps, destinations, and the central control tool audControls. audControls is used to channel messages between other tools, and serves as a central point from which the system can be initialized, started, stopped, re-configured via presets, and controlled from a remote application or computer. In general, sources provide the data which drives the sound, maps scale the information, and destinations make sound. By using internally consistent naming conventions and inter-tool protocol, we have been able to achieve a high degree of flexibility in configuring new audification systems. The selection and placement of tools give specific Audification Systems unique characteristics. New systems are created by connecting tools together in a new Max patcher in a systematic manner. Connection of audControls to and from each of the tools provide for flexible command and data flow. Src outputs are connected to map inputs, and map outputs are connected to dst inputs. A src output can be connected to any number of map inputs, but each map input and output must be connected to only one src output and dst input, respectively. SOURCES - The following demonstrate several ways of driving audification systems with data: Three parameters are generated according to a Lorenz dynamical system via srcLorenz3, (these equations are solved in Max using the 4th order Runge-Kutta method); real-time data is received via srcSysEx and uses the set of MIDI system exclusive commands described below to ICMC 217

Page  218 ï~~receive 10 parameters; five sine waves are generated with srcWaves, with control over period, resolution and phase shift; data is imported from text files with srcColl via the Max coil object. MAPS - Maps scale data from sources to rFile Edit New Mai Font Windows Options..:12 " ranges appropriate for [map,,o,2othe destinations, and X__ ___[___[_ __"_ssLorenzCo 4.... allow for flexible as--7I'l6 signment of inputs to -- I!] - outputs. This allows Â~Â~ '" Â~'Â~.2" the user to essentially 0-d >, --E D.cstcluster-dstSspleCel-dst/bend dstPCM7O dstStnpChwt5 disregard the dimen- 126 1 csionality of the data --uts_ _seDu___: e,,o flowing through the ] 0 0 0 dstStrlpChert5] 1000 >O100 1000 1000 t1000! 1 system, and map En1Z El.E 10[ttlswp El " er witch o " tuIfM"te s W% 1"S inputs to outputs in 0 4,' 774 627-387E0--- * * alternative ways. For Istrpl 151i42 Itr Ip3 Strip, sot I ] example, maplOxlO scales and switches 10 r [sccoi".-.inputs to 10 outputs, o-,-,es: O=$AatwOh,,..q4 while maplOx20 af- e, ] p,,, 0 -0In 1562 fords 20 outpus. 7 (9 )50333. DESTINATIONS - Destinations directly control sound in some Figure 4. sysLorenzColl.4, showing portion of map and strip way. For example, dstClusters generates chart data. a stream of pulses via MIDI note-on and note-off messages. If clus%on is less than 100, clusters of pulses will be generated, with control of the cluster envelope shape. This provides up to 9 parameters which can be parametrically changed. MIDI continuois controllers are used by the tools dstSampleCell and dstPCM70, which are optimized for particular commercially available hardware. In addition, there are two other types of tools: src filters and map filters. These tools modify, extrapolate from, or change in some way the flow of data out of a source or map. For example, srcFiltMinMax continually monitors the range of incoming data, and sends out min and max messages if the range changes. This is useful if the data range from the src is indeterminate. Recording and playback of source data is provided by srcFiltRecord, and mapFiltCwave generates a control wave with parametric control over wave shape, period, amplitude, and resolution. This may be used for parameter nesting, eg. frequency. Internal APE Protocol The internal APE protocol is the means by which tools communicate their ranges and labels to maps, transmit scaled and unscaled values, and send macro commands such as start, stop, and select preset. We have developed several utilities to wrap the audification system's internal protocol around arbitraiy Max objects or patchers. Associated with each tool is one protoStatus module and one protoData module per inlet or outiet. The use of a consistent set of protocol-handling objects ensures rapid development of new tools. A useful strip chart tool was developed using CNMAT's Multi-Slider object in under an hour. This tool integrated ICMC 218

Page  219 ï~~seamlessly with existing systems we had developed. Conceivably, any computer music piece written in Max could be converted to a dst tool by wrapping our protocol objects around the Max patcher. APE External Protocol We envision our Audification Programming Environment being used within a heterogenous computing environment. To facilitate real-time data communication, we developed a simple data exchange protocol. Although the protocol uses MIDI system exclusive commands, APE will respond to data either through the Apple MIDI Manager or via lower speed serial communication. It is thus possible for us to use audification systems with data generated on any computer connected via a computer modem or null-modem cable. Remote applications can communicate their ranges, labels, and unscaled values as well as macro commands such as boot, start, stop, select preset, and short text messages. Data values are signed, 16-bit integer quantities. The tool srcSysEx serves as a conduit for the remote data. We have used EarLevel Engineering's HyperMIDI XCMDs to successfully integrate audification techniques with StellaStack simulations running under HyperCard. In addition, we have developed a number of general purpose functions in C to facilitate real-time transfer of data into our audification systems. The APE external protocol has the following format, with designated bytes constant: (startSysExByte = 240) (ManufacturerlDByte= 31) (DestinationByte) (CommandByte) (QualifierByte) (4 or 8 dataBytes) (endSysExByte= 247) Commands are directed to individual tools within an audification system via its destination number. audControls is destination 0, and srcSysEx are destinations 1 and 2. Maps are often destinations 3 and 4, with dst tools destinations 5 through 10. Some systems may audify 10 variables via one srcSysEx, while others may audify 20 via 2 of them. The following table details the APE sysEx protocol. DESTINATION COMMAND QUALIFIER DATA audControls,0 requestAck(100) --none-- --none-- boot(101) --none-- --none-- any, 0-10 start(102) --none-- --none-- stop(103) --none-- --none-- preset(104) (1-32) --none-- srcSysEx,1-2 title(105) 0 8 ascii bytes label(106) 1-10 8 ascii bytes min(107) 1-10 4 bytes max(108) 1-10 4 bytes value(109) 1-10 4 bytes default(110) 1-10 4 bytes textMessage(113) 0 or 1 8 characters Data values are encoded as signed, 16 bit quantities, transmitted in 4 4-bit nibbles, most significant nibble first. Text strings are limited to 8 bytes, no spaces. In general, the qualifier denotes the variable number within the srcSysEx tool. The textMessage command allows for operators to send short messages to each other. This is particularly useful when the computers are in remote locations. If the qualifier is 0, the message window is cleared first, while if JCMC 219

Page  220 ï~~the qualifier is 1 it is appended to the current message. Importing Data Files A data file is a simple, efficient way to listen to data generated by a numerical model on another computer or application. The tool srcColl reads a file into memory, and transmits the data at various rates. In addition, a portion of the data file may be looped for closer examination. Creating a file for srcColl is a two-step process. First, an input file is created with the following format:Line 1 - title, up to 8 characters; Line 2 - number of dimensions, up to 10; Line 3 - labels for each dimension, up to 8 characters each; Line 4 through end - data points, floating point format. We have worked with up to 3000 10-D points. Second, the file is converted into a coil file via the program collCVTstdio. This program is written in C and should be able to run on any computer, once recompiled. Future Plans We are developing DSP code for the Digidesign AudioMedia card, with which we hope to gain more sonic parameters via software synthesis. In addition, a variety of graphic options to display data concurrently in the visual and aural domains are in development. Audification tools are being prepared for a project directed by Dr. Mayer-Kress entitled EarthStation, an interactive visual and auditory eco-system display to be premiered at the Ars Electronica Festival in Linz, Austria. Additionally, we are preparing to use the Moog/Kramer multidimensional touch sensitive keyboard'3 to navigate large data fields. A more portable, non-Max, C + + audification system is also under development. Additional Issues That music can convey deep and subtle emotion via abstract sound is self-evident. With audification we are not trying to convey that emotion. We would like to take the sensory and sub-cognitive capacities that make the depth of musical understanding possible and employ them in data exploration. Thus audification may provide very deep and subtle qualities for use in attaining scientific understanding. In a related development, a number of composers are using mathematically generated complexity to create compositional forms and/or synthesize sounds. The works of Truax14, Chareyron15, Pressing'6 and many others can be cited as examples. The similarities and differences between audification and composition as regards imbedding information in an audio stream for subsequent extraction are fascinating. What is the relationship between cognitive and subcognitive information extraction in audification and musical composition? What role does aesthetics play in scientific display and interpretation? Is emotion an important distinguishing element? Another interesting issue raised by audification is its use to encourage mental synthesis. We have been working with Apple Computer to apply audification to education, enabling students to create rich mental models. This will be the topic of a future paper. The capacity to hear certain sounds within one's mind and to then mentally transform those sounds, audiation1, is also very relevant to audification. Many of the inspirations as to how to manipulate the sound and some of the decisions as to what sound manipulation techniques to actually implement will be dependent upon the ability to 'preview' the sounds or the parameter manipulations in one's mind. ICMC 220

Page  221 ï~~Even more important than the capacity to hear and transform sounds in the mind will be the extent to which a scientist can "re-audiate" (mentally hear) data (sound) transformations they have already experienced when running their models. This capacity, dynamic sound memory (DSM)', may be developed as the data analyst gains more experience with audification. DSM will support the data analysts's ability to reflectively consider the results of his/her work. Directly related to DSM is the ability to hear specific model states and by transforming the data/sound in their minds do 'what ifs' on those states. We are not aware of any research done in this area. Another major topic is the applications.of audification. Briefly, these might include understanding complex computer models such as ecosystems, immune systems and economic or disarmament scenarios. There are also applications to work in adaptive computation (artificial life), virtual reality, and the monitoring of complex laboratory, medical or industrial processes. An audification workshop is being planned for people working actively in the field, at the Santa Fe Institute in Fall of 1992. Acknowledgements: The author's gratefully acknowledge the support of Apple Computer's ACOT group for their support, the Santa Fe Institute, R. Jonathan Kramer and Dr. Gottfried Mayer-Kress. REFERENCES 1. G. Kramer, Audification: Using Sound to Understand Complex Systems and Navigate Large Data Sets, Proceedings of the Santa Fe Institute Science Board, Santa Fe Institute, 1991. 2. I. Pollack and L. Ficks, Information of Elementary Multidimensional Auditory Displays, Journal of the Acoustical Society of America, Vol. 26, Number 2, pp. 155-158, March 1954. 3. There is much material published on multi-dimensional visualization. Readers may refer to the SPIE proceedings cited in (8) below for some recent work. 4. For an excellent consideration of auditory pattern recognition and auditory scene analysis, se A. Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound, MIT Press, Cambridge, MA. 5. See note 2. 6. E. S. Yeung, Pattern Recognition by Audio Representation of/Multivariate Analytical Data, Analytical Chemistry, vol. 52, pp. 1120-1123, 1980. 7. S. Bly, Sound and Computer Information Presentation, Unpublished dissertation, U. of California, Davis, 1982. 8. S. Frysinger, Applied Research in Auditory Data Representation, in Proceedings of the SPIE, E.J. Farrell, Ed; Vol. 1259, pp. 130-139, Bellingham, WA, 1990. 9. The timbre space concepts discussed by J.C. Risset and D.L. Wessel in Exploration of Timbre by Analysis and Synthesis, in The Psychology of Music, D. Deutsch, Ed., Academic Press, New York, pp. 47-49, 1982. may provide a basis from which to consider these additional dimensions. The work by John Grey, Multidimensional Perceptual Scaling of Musical Timbres, J. Acoustical Society of America, Vol. 61, No. 5, pp. 1270-1277, 1977. and more recent works published in the Computer Music Journal and proceedings of the ICMC may also provide valuable insights into the use of the many dimensions of timbre. 10. The product development work of Beth Wenczel (NASA-Ames), Scott Fisher (Convolvotron) and Bo Gehring in this area may be of interest. 11. E. Lorenz, Attractor Sets and Quasi-Geostrophic Equilibrium, J. Atmospheric Science, #37, pp. 1685-1699, equations 33-35, 1980. 12. Max is an iconographic object-oriented programming system developed at IRCAM and enhanced and distributed by Opcode, Inc., Menlo Park, CA. 13. G. Kramer, R. Moog, and A. Peevers, The Hybrid: A Music Performance System, Proceedings of the ICMC, Columbus, OH, 1989. 14. B. Truax: Chaotic Nonlinear Systems and Digital Sound Synthesis: An Exploratory Study, Proceedings of the ICMC, Glasgow, 1990. 15. J. Chareyron, Digital Synthesis of Self-modifying Waveforms by Means of Linear Automata, Computer Music Journal, S. Pope, Ed., vol. 14, #4, MIT Press, 1990. 16. J. Pressing, Nonlinear Maps as Generators of Musical Design, Computer Music Journal, C. Roads, Ed., vol. 12, #2, MIT Press, 1988. ICMC 221