APPLICATION OF IMAGE SONIFICATION METHODS TO MUSIC

Yeo, Woon Seung; Berger, Jonathan

Woon Seung Yeo and Jonathan Berger
Stanford University, Center for Computer Research in Music and Acoustics
woony@ccrma.stanford.edu

ABSTRACT

To use visual information for musical purposes, the inherently time-based nature of sound must be understood and taken into account. Time is the principal dimension within which all other auditory parameters are placed, and this poses a particular challenge to the effective sonification of time-independent images and to their application to music. In this paper we present two concepts of time mapping, scanning and probing, to provide a framework for conceptualizing mappings of static data to the time domain. We then consider the geometric characteristics of images to define meaningful references in time. Finally, we discuss the combination of scanning and probing methods in relation to a model of human image perception, and suggest musical applications and an implementation with SonART.

1. INTRODUCTION

Because sound is inevitably time dependent, data representation in the auditory domain must address the problem of organization over time. In auditory display, time is not just a parameter but the principal dimension within which all other auditory parameters are placed. Although this is hardly an issue for time-ordered information, it plays a crucial role in the sonification of static data such as still images, which are neither organized in time nor contain any time-relevant information. Sonifying such data therefore requires a mapping to time that is not arbitrarily oriented towards a left-to-right scan.

So far, however, the role of time as the principal dimension of auditory display has not received attention commensurate with its significance for designing and analyzing methods of image sonification. Even the most widely used and effective methods, such as inverse spectrogram mapping, have not been categorized in terms of time mapping.

Another important factor in image sonification is the geometry of the image. A still image is defined on a two-dimensional space, and each of its pixels can have three (RGB) or four (RGBA or CMYK) color values. This multi-layered, two-dimensional property makes it possible to define different types of reference pointers that locate the current dataset for sonification.

In this paper we present two major concepts of time mapping for image sonification, scanning and probing, to construct a meaningful mapping of static data to the time domain. Together with this concept of time, we also propose pointers as essential components of time reference based on the geometric framework of images, and discuss the problem of defining their paths. Examples of mappings are also presented.

These concepts of time and geometry provide a solid theoretical background for understanding the mappings of image sonification. Furthermore, they suggest a musical perspective on image sonification: scanning and probing methods can be understood by analogy with two different styles of musical performance, strictly following a score versus freely improvising. We discuss this in 4.2 together with the issue of human perception of images, and present SonART [1], a sonification framework that offers a number of powerful features for processing multi-layered image data.

2. CLASSIFICATION OF IMAGE SONIFICATION MAPPINGS

For the categorization of image sonification mappings, both time and the geometric reference to time play crucial roles.

2.1. Construction of time: scanning vs. probing

Methods for organizing time-independent data for auditory display can generally be classified into two major categories, according to whether the order is pre-scheduled and fixed, or arbitrary and freely adjustable.

2.1.1. Scanning

The term scanning refers to the case in which data is scheduled to be sonified in a fixed, non-modifiable order.
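
The definition of scanning above can be illustrated with a minimal sketch, which is not taken from the paper or from SonART. It assumes that a scan is fully described by a predetermined list of pixel coordinates (the path of a pointer) and a constant time step. The names `raster_path`, `scan_schedule`, and `seconds_per_step` are hypothetical, and row-by-row raster order is only one possible choice of path.

```python
from typing import Iterator, List, Tuple

Coord = Tuple[int, int]        # (row, col) position of the pointer
Pixel = Tuple[int, int, int]   # (R, G, B) values of one pixel


def raster_path(height: int, width: int) -> List[Coord]:
    """One possible fixed path: row by row, left to right (an assumption)."""
    return [(row, col) for row in range(height) for col in range(width)]


def scan_schedule(path: List[Coord],
                  seconds_per_step: float) -> Iterator[Tuple[float, Coord]]:
    """Yield (onset_time, coordinate) pairs in the fixed order of the path.

    The order and the speed are decided before sonification starts and
    are never changed while iterating, which is what makes this a scan.
    """
    for step, coord in enumerate(path):
        yield step * seconds_per_step, coord


# A tiny 2 x 2 RGB image used only for illustration.
image: List[List[Pixel]] = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

# Each event pairs an onset time with the pixel data found at the pointer.
events = [(onset, image[row][col])
          for onset, (row, col) in scan_schedule(raster_path(2, 2), 0.25)]
```

Each entry of `events` could then be handed to any synthesis mapping. The point of the sketch is only that time is derived from the position along a fixed path, not from the image data itself.
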
Figure 1 illustrates the inverse spectrogram mapping, which is by far the most popular method for sonifying images. We call this mapping the inverse spectrogram because the sonification is analogous to reconstructing a sound from its spectrogram. The scanning speed is usually fixed and cannot be changed arbitrarily during sonification. A more detailed description of inverse spectrogram mapping is given in 4.1.

Scanning, however, need not be performed along a continuous path. Furthermore, it does not have to cover the whole image area. The issue of scanning paths is discussed in 2.2.2.
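
The inverse spectrogram mapping mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation described in 4.1 or used in SonART. It assumes a grayscale image with brightness values in [0, 1], treats each column as one time frame scanned left to right, and maps each row to a sinusoid whose amplitude follows the pixel brightness. The function name `inverse_spectrogram`, the linear frequency spacing, and all default parameter values are hypothetical.

```python
import math
from typing import List, Sequence


def inverse_spectrogram(image: Sequence[Sequence[float]],
                        sample_rate: int = 8000,
                        column_duration: float = 0.05,
                        f_min: float = 200.0,
                        f_max: float = 2000.0) -> List[float]:
    """Render a grayscale image as audio samples by additive synthesis.

    Columns are scanned left to right at a fixed speed (the time axis);
    rows are mapped to frequencies, with the top row as the highest.
    """
    height = len(image)
    width = len(image[0])
    samples_per_column = round(sample_rate * column_duration)

    # Linearly spaced frequencies, one per row (an assumed spacing).
    if height > 1:
        freqs = [f_max - (f_max - f_min) * row / (height - 1)
                 for row in range(height)]
    else:
        freqs = [f_min]

    samples: List[float] = []
    for col in range(width):
        for n in range(samples_per_column):
            # Global time keeps each sinusoid phase-continuous across columns.
            t = (col * samples_per_column + n) / sample_rate
            value = sum(image[row][col] * math.sin(2.0 * math.pi * freqs[row] * t)
                        for row in range(height))
            # Divide by the number of rows so the output stays within [-1, 1].
            samples.append(value / height)
    return samples


# Illustrative input: a single bright pixel in the top row, middle column.
demo_image = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
demo_samples = inverse_spectrogram(demo_image)
```

With this input, only the middle third of `demo_samples` carries sound, because the other two columns are black. The fixed `column_duration` corresponds to the fixed scanning speed noted in the text.
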