INTEGRATION OF SOUND AND IMAGE IN TWO WORKS FOR PIANO AND COMPUTER

Dr. David Kim-Boyle
University of Maryland, Baltimore County
Department of Music
1000 Hilltop Circle, Baltimore, MD 21250, U.S.A.

ABSTRACT

The author describes aesthetic and practical considerations involved in the integration of sound and image in two recent compositions for piano and computer. In the first of these works, Shimmer (2004), video imagery is derived from the work's sonic materials, while in the second, Canon (2005), visual canons are generated from the real-time movement of the pianist's hands. The author describes each of these works and outlines how these considerations have been addressed.

1. INTRODUCTION

The structural integration of sound and image for artistic expression has been of interest to artists for many years [1]. With the fairly recent development of real-time audio and video processing software, however, artists have the technical facility, perhaps more than ever before, to more fully explore the potential of this area. Jitter, one such video application, has played a significant role in two of the author's recent compositions for piano and computer. Jitter's seamless integration within the real-time MaxMSP audio programming language [2] gives it an immediate appeal on a conceptual, if not a technical and practical, level.

While the strategies employed in the application of Jitter in each of the aforementioned works are different, many of the practical considerations and aesthetic questions faced in their development were common to both. These included questions of coherence, unity and representation on the aesthetic side, and performance concerns on the practical side. The author will describe each of these works in turn and outline how these various issues have been addressed.

2. SHIMMER

Shimmer, premiered at the 2004 International Computer Music Conference, is a recent work for piano, resonant glasses and MaxMSP/Jitter.
During a performance, the sounds from the piano, amplified through resonant glasses placed within the instrument, and the computer-generated sounds are sent to an offstage loudspeaker sheathed in thin plastic. These sounds generate wave-like ripples and shimmers through a small quantity of milk poured into the speaker cone. Milk is used as the propagational medium simply because, as a white liquid, it reflects the full color spectrum and therefore has more potential for visual processing. The overall process somewhat recalls Ben Manley's realization of Lucier's "The Queen of the South" [3] and, to a lesser extent, some of the work of Carsten Nicolai [4].

The resonant tones emphasized by the glasses and the various audio processing techniques employed create visually interesting interference patterns in the milk. The process is a delicate one, however, as the milk can percolate and bubble unpredictably if the loudspeaker is driven too hard. At certain loud points in the piece this is often unavoidable. Examples of both of these states are illustrated in Figure 1.

Figure 1. Interference patterns (left), bubbling (right).

A small digital camcorder is suspended directly above the milk, with its zoom adjusted such that the milk just touches the wide edges of the aperture. The resultant video signal is processed in real-time in the Jitter environment, in ways analogous to many of the audio processing techniques utilized in the piece, before being sent to a pair of video monitors on stage. Much of the video imagery displayed in Shimmer is derived from the areas where the edges of the milk meet the loudspeaker. Various rotational, zooming, blurring and color transforms are employed which parallel, conceptually, much of the spectral processing taking place in the audio domain. Some excerpts from this imagery are illustrated in Figure 2, with the implicit shimmering qualities of each left to the imagination of the reader.
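The conceptual pairing of audio and video transforms described above can be sketched in a few lines of code. The following NumPy fragment is purely illustrative and is not drawn from the Jitter patch itself; the function names are invented for this sketch. A box blur stands in for the temporal smearing of reverberation, and a cyclic rotation of the RGB planes stands in for a crude spectral shift.

```python
import numpy as np

def box_blur(frame, k=3):
    """Visual analogue of reverberation: smear each pixel of a
    grayscale frame over a k x k neighbourhood (a simple box blur)."""
    h, w = frame.shape
    pad = np.pad(frame.astype(float), k // 2, mode="edge")
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + h, dx:dx + w]
    return out / (k * k)

def rotate_channels(rgb, shift=1):
    """Crude visual analogue of spectral shifting: cyclically
    rotate the R, G, B planes of an (h, w, 3) frame."""
    return np.roll(rgb, shift, axis=-1)
```

The point of the sketch is only that both operations, like their audio counterparts, are linear transforms applied uniformly to every frame, which is what makes the conceptual parallel workable in real time.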

Figure 2. Video excerpts from the Shimmer Jitter patch.

While the Jitter processing employed in Shimmer is relatively straightforward, the demands it places on the CPU are high. Despite the use of the fastest available processors, a dedicated CPU, and various powerful video codecs, this high usage can adversely affect the frame rate of the displayed images. During particularly intensive sections of the piece, the frame rate can drop to around 7 fps on a G5/1.5GHz processor. On a G4 processor the rate is even more problematic, dropping to 3 or 4 fps. For now, short of adopting a potentially more efficient video processing application such as Rokeby's VNS system [5], there seems to be little that can be done to address this problem.

3. CANON

Canon (2005), premiered at the 2005 Electronic Music Festival in Basel, is the author's most recent work to employ real-time image generation. This work adopts a different strategy than Shimmer, however, in that the movements of the pianist's hands provide the source materials for the visual processing. Like Shimmer, this visual material is transformed in various ways that parallel the musical and audio processing techniques applied in the work.

As the title suggests, canonic structures play a musically important role in Canon. In the audio domain, the canonic materials performed by the pianist are recorded in real-time by the computer and played back such that they create further layers of canonic voices. As the density of these voices increases, their timbral transformation becomes more pronounced until they are eventually reduced to their resonant tone components. The musical structure of the work is reinforced by various parallel visual canons. These employ Electrotap's tap.jit.delay video delay object [6] and several other split-screen techniques using standard Jitter objects. The tap.jit.delay object, a rectangular resampler, and other color transformations parallel the timbral transformations that occur in the audio domain, moving the imagery of the pianist's hands beyond the boundaries of recognition. Three visual excerpts from the result of these processes are shown in order of occurrence in Figure 3.

Figure 3. Video excerpts from Canon.

Like Shimmer, the CPU usage incurred from running real-time video processing is intensive, even with separate computers dedicated to the audio and video processing. Again, in the short term, there seems little that can be done to address this problem. Other more mundane practical considerations have involved the optimal placement of the camcorder, which needs to be positioned just behind the pianist's right shoulder without inhibiting the natural movement of the pianist's body, and the satisfactory lighting of the keyboard such that reflections and shadowing are minimized.

4. AESTHETIC CONSIDERATIONS

Other than the resolution of the practical issues already noted, the successful structural integration of sound and image in Shimmer and Canon involved the precompositional consideration of many fundamental aesthetic issues. Key amongst these were questions of unity: should the sonic and visual techniques explore the same artistic concerns, or should they reside in independent aesthetic domains? Is one an extension of the other, and to what extent are the processes employed in one medium translatable, or able to be mapped, to the other? [7]

In Shimmer, the sonic and visual processes are aesthetically unified at a very primitive level in that both are concerned with transformation. Visually, the shimmering milk becomes the primitive object which is transformed, while musically, pre-existing works take on this role. The materials performed by the pianist have been drawn from sonorities in Morton Feldman's 1987 piano work Triadic Memories [8]. Both sonically and visually, the materials themselves become less significant than the resonances one is able to draw from them.
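Returning briefly to the video-delay technique behind Canon's visual canons: the core idea can be reduced to a ring buffer of past frames blended with the live image. The Python/NumPy sketch below is a hypothetical illustration of that idea only, not the Jitter/tap.jit.delay implementation used in the piece; the class name and blend weights are invented here.

```python
import numpy as np
from collections import deque

class FrameDelay:
    """A fixed-length ring buffer of video frames. Blending the
    oldest buffered frame with the live one yields a delayed
    "canonic voice" of the image, the visual analogue of a
    musical canon at a fixed time interval."""

    def __init__(self, delay_frames):
        self.buffer = deque(maxlen=delay_frames)

    def process(self, frame):
        # The oldest frame is the delayed voice; until the buffer
        # fills, the delayed voice is silence (a black frame).
        if len(self.buffer) == self.buffer.maxlen:
            delayed = self.buffer[0]
        else:
            delayed = np.zeros_like(frame)
        self.buffer.append(frame)
        # Simple 50/50 blend of live and delayed image.
        return 0.5 * frame + 0.5 * delayed
```

Stacking several such delays with different lengths would layer further voices, much as the recorded piano material layers canonic voices in the audio domain.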
The process of their transformation becomes, in essence, the focus of the work's aesthetic interest. In Shimmer, the sonic and visual materials are also unified in that the source materials for the visual processing are fundamentally dependent on the work's sonic materials. For the author, such a relationship between the sonic and visual source materials was of fundamental importance [1].

In Canon, the process of transformation also defines the form of the work, with the unification of the musical and visual materials formalized through various canonic structures. This work goes beyond Shimmer, however, in making the performer's body an explicit part of the representations. Rather than using a pre-existing work as the basis for the musical transformations that take place, in Canon a simple descending C major scale, heard in various permutations, becomes the musical object to be transformed.

Transformational processes employed in the sonic domain often have visual analogies, at least at a broad metaphorical or conceptual level [9][10][11]. Spectral processing is akin to color transformation, reverberation finds its counterpart in visual blurring, and visual magnification is like filtering for resonant tones. Unfortunately, it is also often the case that an interesting transformational technique in one medium is not altogether satisfying when applied to the other, although this is sometimes a result of the mapping techniques applied. For example, the visual realization of self-similar or chaotic algorithms can be strikingly beautiful, but when the algorithms are mapped to musical parameters such as pitch the results can be musically crude. Similarly, the mapping of audio spectra to visual images is not particularly sophisticated. Clearly, the psychological processes involved in sonic and visual perception are quite different, and it is not always the case that musically interesting processes have visual counterparts, and vice versa. The temporal unfolding of sonic and visual transformations is another area of particular concern.
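One simple, admittedly crude, instance of the spectrum-to-image mappings mentioned above is to reduce each audio frame to its spectral centroid (a standard measure of brightness) and map that scalar onto a hue angle. The Python/NumPy sketch below is illustrative only and is not drawn from either piece; both function names are invented here.

```python
import numpy as np

def spectral_centroid(signal, sr):
    """Amplitude-weighted mean frequency of a signal frame,
    a common correlate of perceived brightness."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sr)
    total = spectrum.sum()
    if total == 0:
        return 0.0
    return float((freqs * spectrum).sum() / total)

def centroid_to_hue(centroid, sr):
    """Map the centroid (0 .. Nyquist) linearly onto a hue
    angle (0 .. 360 degrees)."""
    return 360.0 * centroid / (sr / 2.0)
```

The crudeness the text notes is visible even here: the mapping is a one-dimensional projection of a rich spectrum onto a single color parameter, discarding almost everything that makes the sound musically interesting.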
[12] When conceptually similar transformational techniques are applied simultaneously in two mediums, their intrinsic aesthetic interest can be weakened; the whole is not always greater than the sum of its parts. For example, it was found that the temporal dislocation of the visual and musical canons in Canon was of greater aesthetic interest than a process in which visual transformations were always synchronized to their musical counterparts.

Ultimately, the question of the aesthetically successful translation or mapping of transformations across mediums becomes one of order. Higher-order global processes seem more likely to translate successfully than lower-level local processes. This was the approach adopted in both Canon and Shimmer, where translatable techniques have included those listed earlier: blurring/reverberation, spectral processing/color transformation, magnification/filtering. While many of these techniques are employed simultaneously, at a local level their evolution is more temporally independent. Aesthetically, this creates unity at higher levels while granting the material freedom to develop and explore its own unique potentials at lower, local levels of order.

Finally, it is also interesting to speculate on how the practical limitations faced condition the transformational processes that can be employed. In the case of Canon and Shimmer, CPU limitations placed a very definite limit on the extent to which ideas could be explored in real time in the visual domain.

5. FUTURE WORK

The generation of engaging visual effects from sonic processes continues to be of artistic interest, and the author is exploring it in ongoing work. Of particular interest are the relationships between spectral processing and color transformations, and between reverberation and blurring. These are being explored in a work-in-progress for flute and MaxMSP/Jitter.

6. REFERENCES

[1] Ciufo, T.
"Real-Time Sound/Image Manipulation and Mapping in a Performance Setting," in Proceedings of the Fourth International Digital Arts and Culture Conference, Providence, RI, 2001. Also available at <media/sound_image.pdf>.

[2] Zicarelli, D. "An Extensible Real-Time Signal Processing Environment for Max," in Proceedings of the 1998 International Computer Music Conference, Ann Arbor, MI: International Computer Music Association, pp. 463-466, 1998.

[3] Manley, B. <>. July 1998.

[4] Nicolai, C. <>. May 2005.

[5] Rokeby, D. < 1>.

[6] Electrotap. <>. November 2004.

[7] Gerhard, D., D. H. Hepting and M. McKague. "Exploration of the Correspondence between Visual and Acoustic Parameter Spaces," in Proceedings of the 2004 Conference on New Interfaces for Musical Expression (NIME04), Hamamatsu, Japan, pp. 96-99, 2004.

[8] Feldman, M. Triadic Memories. London: Universal Edition, 1987.

[9] Sedes, A., B. Courribet and J.-B. Thiebaut. "From the Visualization of Sound to Real-Time Sonification: Different Prototypes in the Max/MSP/Jitter Environment," in Proceedings of the 2004 International Computer Music Conference, Miami: International Computer Music Association, 2004.

[10] Courribet, B. "Réflexions sur les relations musique/vidéo et stratégies de mapping pour Max/MSP/Jitter," in Proceedings of the 12e Journées d'Informatique Musicale, Université de Paris VIII and MSH Paris Nord, pp. 165-169, 2005.

[11] Levin, G. Painterly Interfaces for Audiovisual Performance. Master's thesis, MIT, 2000.

[12] Kapuscinski, J. "Compositions with Sounds and Images." Available at <>. Fall 2001.