The Sonic Scanner and the Graphonic Interface

Dan Overholt
STEIM / CREATE
Studio for Electro-Instrumental Music, Amsterdam, The Netherlands
Center for Research in Electronic Art Technology, U.C. Santa Barbara

Abstract

This paper describes the concepts, design, implementation, and evaluation of two new interfaces for music performance and composition. Both interfaces were motivated by the idea of creating music through drawing, but they approach the activity in very different ways. While the Graphonic Interface allows you to make music as you are drawing, the Sonic Scanner requires pre-composed graphic material in order to make music. However, both devices are real-time controllers that produce sound in an interactive manner, allowing them to be used as performance instruments.

1 Introduction

The transformation of drawings into sound is the inspiration behind this work; although this is not a new idea, the implementation of these devices provides musical interfaces that have never before been available. What follows is an introduction to the instruments themselves, the hardware and software systems developed during their creation, and the mapping of musical parameters onto their performance interfaces.

The Sonic Scanner (see Figure 1) is an instrument that turns pre-existing drawings, pictures, and other graphic material into sound. It uses an antiquated handheld scanner as the interface, with the electronics modified to turn the visual scan-line into an audio waveform. With it you can effectively hear whatever you decide to scan transformed into sound: the resulting audio depends simply on where you put the scanner. Besides translating graphics into audio waveforms, the Sonic Scanner has several other modes that allow you to manipulate audio in different ways; these are described in section 4.1.
In addition to the scan-line, there are extra control sources on top of the device (pressure sensors) that can be mapped to various parameters in the audio synthesis algorithms.

Figure 1. The Sonic Scanner is a new instrument for music performance that turns drawings, pictures, etc. into sound.

Proceedings ICMC 2004

Figure 2. The Graphonic Interface is an interactive surface for music performance and composition.

The Graphonic Interface (see Figure 2) is an instrument that captures the gestures of drawing and translates them into sound as you sketch. It uses a commercial system called a whiteboard tracker, which converts pen movement into real-time digital data; this data is then analyzed by the computer to produce music corresponding to your gestures. Instead of using a whiteboard as the drawing surface, the

Graphonic Interface employs a large plate of plexiglass as the performance interface. This plexiglass plate allows the performer(s) to stand behind the instrument and face the audience while playing, and also functions as a flat-panel loudspeaker membrane. Thus, the music generated from your gestures is amplified and played back through the instrument itself, providing tactile feedback to the performer and effectively turning the Graphonic Interface into a pseudo-acoustic musical instrument.

2 Background and Motivations

As I previously mentioned, the idea of graphical manipulation of sound is not new; there are several examples of prior instruments, of which I will highlight two here. The first is the UPIC system (Unité Polyagogique Informatique du CEMAMu), invented by Iannis Xenakis at the Centre d'Études de Mathématique et Automatique Musicales in Paris in 1977 (see Figure 3, left).

Figure 3. The UPIC system by Iannis Xenakis, and Hugh LeCaine's Spectrogram machine.

The original UPIC used a commercial digitizer table (essentially a large graphics tablet, 75x60cm) as an input device on which the user designed their music. The system was focused more on composition than performance, as the lines and shapes you drew were interpreted as controls for music to be rendered later. There were four steps: 1. draw waves, 2. draw envelopes, 3. compose a page, 4. mixage. Because you had to press "calculate page" in order to hear your sound, it was not intended as a performance interface. Nevertheless, the UPIC system did in fact transform graphics into sound through the computer, and Xenakis used it to compose Mykenae Alpha in 1978, as well as introducing it to many groups of dancers, children, computer-minded people, non-musicians, and other composers over the following years (Roads 1996). While the Graphonic Interface is not as flexible as the UPIC system for composing, its emphasis is not on being a compositional tool but on being a live performance instrument. As such, it produces sound only when you are directly interacting with it.

There were two primary motivations behind the development of the Graphonic Interface, and the UPIC system served as a departure point and inspiration for some of this work. The first motivation was to create a musical instrument for real-time performance in a concert setting, while the second was to design an installation in which the Graphonic Interface would be set up as an exhibit allowing visitors to create music themselves. The Graphonic Interface can of course do both of these things; to some degree it simply depends on the space in which it is set up.

The Canadian inventor and composer Hugh LeCaine took a different approach to translating graphics into sound. In the late 1950s he created an instrument called the Spectrogram (see Figure 3, right), consisting of a bank of 108 analog sine-wave oscillators controlled by curves drawn on long rolls of paper. The paper was fed into the instrument, which sensed the curves and used them to determine the frequency and volume of each oscillator. It was essentially an analog additive synthesis instrument that used pre-drawn graphics to drive the sound. The Sonic Scanner can do more or less the same thing, but the additive synthesis algorithm is only one of its modes.

Because of its other modes, the Sonic Scanner is more versatile than the Spectrogram, and it is also more flexible in that it does not use a sheet-feeder for the paper but lets the performer move around a space and scan any image or object. This capability was one of the motivations behind the Sonic Scanner. It was envisioned as a modified hand-held scanner, and although there were some initial experiments with a flatbed scanner, the original concept worked best. The already good ergonomics of the handheld scanner were a benefit, and the addition of a battery and some extra circuitry made the whole system wireless, making the Sonic Scanner completely mobile.

3 The Graphonic Interface

Most electronic instruments today make use of loudspeakers in order to produce sound waves, thereby separating the performer's interface from the sound source. Traditional acoustic instruments, on the other hand, act as both interface and speaker combined in one unit; as the sound comes from the instrument itself, the player is in direct physical contact with the vibrations being produced. In order to capture this feeling in the Graphonic Interface, I decided to attach a tactile sound transducer to the plexiglass, which provides touch feedback to the performer and turns the plexiglass plate into a flat-panel speaker (you draw on the speaker). In this way, the system is a haptic device that reacts to a performer by responding through the same surface with which they interact. The first version of the Graphonic Interface uses a clear piece of plexiglass, but the use of a semi-transparent (frosted) plexiglass plate on a future prototype would permit video projection onto the performance surface (enabling more interactive mappings), and/or allow the silhouette of the performer to be shadow-cast onto the interface with back-lighting from a theatrical stage light.

3.1 Design of the Graphonic Interface

The Graphonic Interface utilizes a combination of existing products for the realization of the hardware components of the system. The custom aspects of the interface are the construction of the drawing surface (along with the mounting of sensors and actuators) and the software that translates performance gestures into sound. The system uses two computers: one to capture user input and analyze gestures, and one to generate sound.

Figure 4. Block diagram of the system setup for the Graphonic Interface.

Hardware Implementation. The Graphonic Interface uses a large plexiglass plate (acrylic, 183x92cm) as its interactive surface (see Figure 4). The material was chosen partly for economy and partly for ease of transport, since it is less fragile than glass. Although true glass would have made a better transducer of acoustic energy as a flat-panel speaker, the resulting auditory response from the plexiglass is reasonably loud and adds an interesting color to the sound. If necessary (e.g., in a large concert hall), the system can be amplified either with microphone(s) or by taking a direct signal from the audio synthesis computer. The plate is hung from an aluminum pole that is semi-rigidly connected at several attachment points, adding to the rigidity of the surface. The pole is also used to route some of the sensor wires, keeping them out of the way while drawing. The whiteboard tracker's ultrasound sensors (made by e-Beam) are bolted into the two upper corners, as is the tactile sound transducer in the lower middle of the interface. The main body of the tactile sound transducer (made by Clark Synthesis) is mounted on the audience side of the surface, so that the performer's only impediment to drawing is a single bolt head.

Software Implementation and Mappings. The Graphonic Interface system makes use of one PC and one Macintosh computer.
Input data from the whiteboard tracker is displayed, analyzed, and converted to the OSC protocol on the PC in order to send it to the Macintosh. The whiteboard tracker communicates with the PC via a serial cable, and a custom application written in C++ analyzes the data and sends the resulting OSC packets across the network to a Macintosh running the sound synthesis programming language SuperCollider. The raw data coming from the whiteboard tracker includes the status of each pen (on or off, i.e., drawing or not drawing), which pen is in use (there are four pens plus an eraser), and the x,y coordinates of the tip of the pen. This data is sent to SuperCollider after being converted to OSC; the gesture analysis features will be enhanced in future versions of the PC application. There are also many interesting research issues beyond gesture analysis, such as automated performance environments, animated musical scores, and networked performance, that I plan to explore and document as the project continues. The synthesis algorithms currently implemented in SuperCollider are concepts that can be taken further with additional exploration, but they are already effective for simple performance setups. One of the first algorithms I experimented with was an FM synthesis mapping in which the x and y coordinates of the pen control the frequencies of two oscillators. Another mapping uses the x-axis to control the pitch of a wavetable synthesis algorithm, while the y-axis controls both the cutoff frequency of a low-pass filter and the amount of a ring-modulation-type effect. An interesting use of the Graphonic Interface is demonstrated by an effects algorithm in SuperCollider that processes live audio from a microphone, letting you modulate the parameters of a delay, filter, and reverb as you hear and feel your own voice through the pen interaction.
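To make the pen-data pipeline concrete, the following is a minimal sketch of how a pen event (status, pen id, x, y coordinates) could be encoded as a binary OSC message like the ones the C++ application sends to SuperCollider. The address `/pen` and the exact argument layout are illustrative assumptions, not the paper's actual protocol; only the OSC framing rules (null-terminated strings padded to 4-byte boundaries, big-endian arguments) are standard.

```python
import struct

def osc_pad(s: bytes) -> bytes:
    """Null-terminate a string and pad it to a 4-byte boundary, as OSC requires."""
    return s + b"\x00" * (4 - len(s) % 4)

def encode_pen_event(status: int, pen_id: int, x: float, y: float) -> bytes:
    """Build one binary OSC message for a pen event.

    Layout: padded address, padded type-tag string, then the
    arguments as big-endian int32 (i) and float32 (f) values.
    The address and argument choice are hypothetical.
    """
    address = osc_pad(b"/pen")    # hypothetical OSC address
    typetags = osc_pad(b",iiff")  # int, int, float, float
    args = struct.pack(">iiff", status, pen_id, x, y)
    return address + typetags + args

# Pen 2 touching the surface at normalized position (0.25, 0.75).
msg = encode_pen_event(1, 2, 0.25, 0.75)
```

A receiver such as SuperCollider parses the padded address and type tags, then unpacks the argument block according to the tags.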
There are also several algorithms developed for the instrument that employ physical modeling and granular synthesis methods. For a performance, it is possible to choose up to five different algorithms/mappings (one for each of the pens plus the eraser) and switch between them simply by changing pens.

3.2 Evaluation of the Graphonic Interface

The whiteboard tracking system has some inherent limitations that hinder the expressiveness of the interaction, but I feel that the Graphonic Interface is a very effective performance device nonetheless. The two main drawbacks of the whiteboard tracker are its inability to track more than one pen at the same time and its lack of pressure sensitivity for the pens (something I have considered adding with some extra electronics). Even so, the instrument can be quite expressive with the right mappings, and I feel that my choice to focus on a real-time gestural interaction style matches the capabilities of the system well. I chose not to interpret anything but the actions of drawing with the Graphonic Interface, because I did not feel it would be compelling in a performance to let someone draw a whole sketch and then hit a button that started a non-interactive process of interpreting it. However, I was still curious about interpreting pre-drawn material; it was this motivation that led to the development of the Sonic Scanner.
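The one-mapping-per-pen scheme described above amounts to a dispatch table keyed on the pen identifier. The sketch below illustrates the idea; the pen names, parameter ranges, and mapping functions are placeholders standing in for the actual SuperCollider algorithms, which the paper does not specify in detail.

```python
# Each pen (four pens plus the eraser) selects a synthesis mapping.
# The mapping functions below are illustrative stand-ins; each takes
# normalized pen coordinates in the range 0..1.

def fm_mapping(x: float, y: float) -> dict:
    """x and y control the frequencies of two FM oscillators."""
    return {"carrier_hz": 100 + 1900 * x, "modulator_hz": 50 + 950 * y}

def wavetable_mapping(x: float, y: float) -> dict:
    """x controls pitch; y controls filter cutoff and ring-mod amount."""
    return {"pitch_hz": 55 * 2 ** (4 * x),   # four octaves above A1 (assumed range)
            "cutoff_hz": 200 + 7800 * y,
            "ringmod_amount": y}

MAPPINGS = {
    "pen1": fm_mapping,
    "pen2": wavetable_mapping,
    # ... pens 3 and 4 and the eraser would select further algorithms
}

def handle_pen_event(pen: str, x: float, y: float) -> dict:
    """Switching pens switches mappings, as on the Graphonic Interface."""
    return MAPPINGS[pen](x, y)

params = handle_pen_event("pen2", 0.5, 0.5)
```

Changing pens then changes the sound world instantly, with no menu interaction on stage.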

4 The Sonic Scanner

The Sonic Scanner takes optical input and changes it into audible output. The following sections describe how this process works.

4.1 Design of the Sonic Scanner

The Sonic Scanner utilizes the enclosure from a handheld page scanner and electronics from a wireless game controller, as well as other components (see Figure 5). The basis of interaction is the visual scan-line, which is digitized using internal optics (mirrors/lenses), a wireless video camera (spy-cam type), and a USB video interface.

Hardware Implementation. By removing some of the original electronics from the scanner, I was able to fit the electronics from a wireless game controller inside and connect them to some of the original components, such as the thumb button, mode switch, and rotary dial. Also connected to the game controller circuit are four pressure sensors: three for the fingers and one for the palm (with an offset adjustable via the rotary dial). Finally, a rechargeable battery makes the device completely mobile.

Each mapping makes use of the pressure sensors in a different way, appropriate to the synthesis technique. For example, in the Waveform mode, the palm pressure and rotary dial control the scan rate (pitch), and the three finger sensors control the low, mid, and high frequencies of an FIR filter while the thumb button is held. In the Spectrum mode, palm pressure controls the volume, the first and third finger sensors control panning, and the middle finger sensor manipulates the cutoff frequency of a resonant low-pass filter. The Rhythm mode uses the rotary dial to set the tempo, and the three finger sensors bring out the rhythms of the component colors (RGB). The thumb button records a new sound to be manipulated in the Sampler mode, while the rotary dial sets the playback speed and the sample pointer is controlled by the scan-line, allowing a performer to scratch through a sound using the Sonic Scanner.
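The core of the Waveform mode mentioned above can be sketched as reading the scan-line's brightness values as one period of a wavetable, with the scan rate (controlled on the instrument by palm pressure and the rotary dial) setting the pitch. The following is a minimal stdlib-only sketch of that idea; the brightness scaling and the fixed rate stand in for whatever the actual Max/MSP patch does.

```python
import math

def scanline_to_wavetable(brightness):
    """Center 0..255 brightness values around zero so the scan-line
    forms one period of an audio waveform in the range -1..1."""
    return [(b - 127.5) / 127.5 for b in brightness]

def render(wavetable, rate_hz, sample_rate=44100, n_samples=8):
    """Read repeatedly through the wavetable at rate_hz; this rate is
    the pitch of the tone the scan-line produces in Waveform mode
    (on the instrument it would follow the palm pressure sensor)."""
    out = []
    phase = 0.0
    step = rate_hz * len(wavetable) / sample_rate
    for _ in range(n_samples):
        out.append(wavetable[int(phase) % len(wavetable)])
        phase += step
    return out

# A synthetic scan-line: one cycle of a bright/dark gradient.
scan = [int(127.5 + 127.5 * math.sin(2 * math.pi * i / 64)) for i in range(64)]
table = scanline_to_wavetable(scan)
samples = render(table, rate_hz=440.0)
```

Scanning a different image simply fills the wavetable with a different shape, which is why every scanned object has its own timbre.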
Figure 5. The Sonic Scanner and its controls.

Software Implementation and Mappings. The software that generates audio from the Sonic Scanner is written entirely in Max/MSP, by means of the relatively new Jitter video processing extension. There are four modes for transforming pictures into sound, selected with the mode switch. The first mode, Waveform, turns the optical scan-line directly into sound by reading through its brightness levels at an audio rate. The second mode, Spectrum, translates the optical spectrum into an audio spectrum by mapping the scan-line onto the frequency domain using an FFT (Fast Fourier Transform). The third mode, Rhythm, is similar to the first in that it maps the scan-line directly to audio, but at a much slower rate, in order to draw out the rhythmic content of the scanned material. The last mode, Sampler, lets you record a sound and then manipulate its playback with the Sonic Scanner.

4.2 Evaluation of the Sonic Scanner

The musical expression afforded by the Sonic Scanner can be extremely intriguing because of both the boundless variety of things to scan and the four modes, which produce entirely different results from the same scan-line. The final version adds foam pads on top of the pressure sensors to enhance ergonomics and playability. It is the author's opinion that the Sonic Scanner is a very effective musical interface that turns any image into a wide variety of sounds.

5 Conclusion

The Sonic Scanner and the Graphonic Interface are new musical instruments that transform drawing and its associated gestures into sound. They provide novel interfaces for capturing gestures and controlling parameters of electronic music. The author plans to continue investigating possible improvements to the hardware and software of these instruments, as well as engaging in musical activities and performances with them. Audiovisual examples of the Sonic Scanner and the Graphonic Interface can be found at the web sites listed with the references.

6 Acknowledgments

Many thanks to Dr. Curtis Roads, Stephen Pope, Gilroy Menezes, Michel Waisvisz, Daniel Schorno, Frank Balde, Jorgen Brinkman, Rene Wassenburg, and Anne-Marie Skriver for their help and support during these projects.

References

Clark Synthesis Tactile Sound Transducers: http://www.clarksynthesis.com/
E-Beam Whiteboard Tracking System: http://www.e-beam.com/
Overholt, D. Sonic Scanner and Graphonic Interface webpages:
Roads, C. (1996). The Computer Music Tutorial. Cambridge, Massachusetts: MIT Press.