APPLICATION OF IMAGE SONIFICATION METHODS TO MUSIC
Woon Seung Yeo and Jonathan Berger
Stanford University
Center for Computer Research in Music and Acoustics
woony@ccrma.stanford.edu
ABSTRACT
To utilize visual information for musical purposes, the inevitably
time-based nature of sound must be understood and considered. Time is the principal dimension within which all
other auditory parameters are placed, and this poses a particular challenge to the effective sonification of time-independent images and to their application to music.
In this paper we present two concepts of time mapping,
scanning and probing, to provide a framework for conceptualizing mappings of static data to the time domain.
We then consider the geometric characteristics of images
to define meaningful references in time. Finally, we discuss the combination of scanning and probing methods in relation to a model of human image perception, and proceed to
suggest its musical applications and its implementation with
SonART.
1. INTRODUCTION
Because sound is inevitably time-dependent, data representation
in the auditory domain must address the problem of organization over time. In auditory display, time
is not just a parameter, but the principal dimension within
which all other auditory parameters are placed. Although
this is hardly an issue for time-ordered information, it plays a crucial role in the sonification
of static data such as still images, which are neither organized in time nor contain any time-relevant information. Such data therefore requires a mapping to time that
is not arbitrarily oriented towards a left-right scan. So far,
however, the role of time as the principal dimension of auditory
display has not received attention commensurate with its
significance for designing and analyzing methods of image sonification. Even the most widely used and effective
methods, such as inverse spectrogram mapping, have not
been categorized in terms of time mapping.
Another important factor in image sonification is the
geometric character of the image. A still image is defined on a
two-dimensional space, and each of its pixels can have
three (RGB) or four (RGBA or CMYK) different color
values. This multi-layered two-dimensional property makes
it possible to define different types of reference pointers
by which the current dataset for sonification is located.
In this paper we present two major concepts of time
mapping for image sonification, scanning and probing, to
construct a meaningful mapping of static data to the time domain. Together with this concept of time, we also suggest
the idea of pointers as an essential component of time
reference based on the geometric framework of images,
and discuss the problem of defining their paths. Examples
of mappings are also presented.
These concepts of time and geometry provide us with a solid
theoretical background for understanding the mappings of
image sonification. Furthermore, they suggest a musical
perspective on image sonification: scanning and probing
methods can be understood by analogy with two different
styles of musical performance, namely strictly following a score vs. freely improvising. We discuss this in 4.2
together with the issue of human perception of images,
and present SonART [1], a sonification framework which
offers a number of powerful features for processing multi-layered image data.
2. CLASSIFICATION OF IMAGE SONIFICATION MAPPINGS
For categorization of image sonification mappings, both
time and the geometric reference to time play crucial roles.
2.1. Construction of time: scanning vs. probing
Methods for organizing time-independent data for auditory display can generally be classified into two major categories, according to whether the order of presentation is pre-scheduled and fixed, or arbitrary and freely adjustable.
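The distinction between the two categories can be illustrated with a minimal sketch. It assumes a grayscale image stored as a nested list indexed as image[y][x]; the function names scan and probe, and the column-by-column order used for scanning, are hypothetical choices made only for illustration and are not taken from any particular implementation.

```python
from typing import Iterable, Iterator, List, Tuple

Image = List[List[float]]          # image[y][x], one brightness value per pixel
Point = Tuple[int, int]            # (x, y)


def scan(image: Image) -> Iterator[Tuple[Point, float]]:
    """Pre-scheduled order: the visiting sequence is fixed in advance
    (here, column by column, top to bottom within each column)."""
    for x in range(len(image[0])):
        for y in range(len(image)):
            yield (x, y), image[y][x]


def probe(image: Image,
          pointer_positions: Iterable[Point]) -> Iterator[Tuple[Point, float]]:
    """Freely adjustable order: pixels are read wherever an externally
    controlled pointer happens to go, including repeated visits."""
    for x, y in pointer_positions:
        yield (x, y), image[y][x]


image = [[0.0, 0.5],
         [1.0, 0.25]]
scanned = list(scan(image))
probed = list(probe(image, [(1, 1), (0, 0), (1, 1)]))
```

In this sketch the only difference between the two functions is where the visiting order comes from: scan derives it from the image dimensions alone, whereas probe receives it from outside, for example from a performer moving a pointer.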
2.1.1. Scanning
The term scanning refers to the case in which data is scheduled to be sonified in a fixed, non-modifiable order. Figure
1 illustrates the mapping of the inverse spectrogram method,
which is by far the most popular method for sonifying images. We refer to this method as the inverse spectrogram because the
sonification is analogous to the reconstruction of a sound
from its spectrogram.
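As a rough illustration of the analogy, the sketch below treats each image column as one time frame, each row as one sinusoidal partial, and pixel brightness as that partial's amplitude. These mapping choices, the function name inverse_spectrogram_scan, and all parameter values (sample rate, frame length, frequency range) are assumptions made for this example; the mapping described in 4.1 may differ in its details.

```python
import math
from typing import List


def inverse_spectrogram_scan(image: List[List[float]],
                             sample_rate: int = 8000,
                             samples_per_column: int = 400,
                             f_min: float = 200.0,
                             f_max: float = 2000.0) -> List[float]:
    """Sonify a grayscale image (image[row][col], brightness in [0, 1])
    by scanning its columns from left to right at a fixed speed."""
    n_rows, n_cols = len(image), len(image[0])
    # Assumption: top row maps to the highest frequency, bottom row to the lowest.
    freqs = [f_max - (f_max - f_min) * r / max(n_rows - 1, 1)
             for r in range(n_rows)]
    samples = []
    for col in range(n_cols):                      # fixed left-to-right order
        for n in range(samples_per_column):
            t = (col * samples_per_column + n) / sample_rate
            s = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
            samples.append(s / n_rows)             # keep output within [-1, 1]
    return samples
```

The result is a plain list of samples in the range [-1, 1]; writing it to an audio file or sound device is left out of the sketch.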
The speed of scanning is usually fixed, and cannot
be changed arbitrarily during the process of sonification. A more detailed description of inverse spectrogram
mapping is given in 4.1. Scanning, however, need not
be performed along a continuous path. Furthermore, it
does not have to cover the whole image area. This issue
of the scanning path is discussed in 2.2.2.
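To make these properties concrete, the following sketch schedules a scan along a pre-defined path that is neither continuous nor exhaustive, while keeping the speed fixed. The function name scan_schedule, the step duration, and the example path are hypothetical and serve only as an illustration; the question of how such paths should be chosen is taken up in 2.2.2.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y)


def scan_schedule(image: List[List[float]], path: List[Point],
                  step_dur: float = 0.25) -> List[Tuple[float, float]]:
    """Return (onset time in seconds, pixel value) pairs for a fixed path.

    Every step lasts step_dur seconds, so the scanning speed is constant,
    but the path itself may jump around and skip parts of the image."""
    return [(i * step_dur, image[y][x]) for i, (x, y) in enumerate(path)]


image = [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9]]
# A path that jumps between the four corners and ignores all other pixels.
corner_path = [(0, 0), (2, 2), (2, 0), (0, 2)]
schedule = scan_schedule(image, corner_path)
```

Because the path and the step duration are both fixed before sonification begins, the schedule is fully determined in advance, which is what distinguishes this from a freely adjustable ordering.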