Contour Hierarchies, Tied Parameters, Sound Models and Music
Lonce Wyse
Institute for Infocomm Research, Singapore
lonce@zwhome.org, www.zwhome.org/~lonce
Abstract
Our goal is to construct sound generating model
structures that capture relationships among sound
components that are perceived by human listeners as
musical when the model parameter space is
traversed. Designing constructs that are in certain
ways homologous to perceptual organization results
in sound model structures that can not only be
exploited for "expressive" performance, but can also
play a direct role in the compositional listening
strategies of an audience. A case study of a model of
a Canyon Wren song is used to illustrate the
modeling principles.
1 Sound model design
A generative sound model has three components:
a) a range of sounds, b) a set of exposed parameters,
c) behavior that specifies how the space of sounds is
traversed under parameter manipulation. The process
of sound model design frequently starts with a
specification of the range of sounds and some
constraints on the behavior. Sometimes a composer
may have actual samples lying within the target range
of sounds, but needs a model in order to generate a
broader class of sounds that includes those in the
specification as a special case, and/or because of a
need for effective "handles" for moving around the
space of sounds.
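The three components above can be made concrete in code. The following is a minimal, hypothetical sketch (not from the paper) of a toy sine-tone model: its sound range is (a), its clamped parameter dictionary is (b), and the way parameter settings map smoothly onto synthesis state is (c). The class name and parameter names are illustrative assumptions.

```python
import math

class SineGrainModel:
    """Toy generative sound model illustrating the three components:
    (a) a range of sounds (sine tones), (b) exposed parameters with
    legal ranges, (c) behavior mapping parameter changes to synthesis."""

    def __init__(self, sample_rate=44100):
        self.sample_rate = sample_rate
        # (b) exposed parameters and their legal ranges
        self.params = {"freq_hz": 440.0, "amp": 0.5}
        self.ranges = {"freq_hz": (20.0, 8000.0), "amp": (0.0, 1.0)}
        self._phase = 0.0

    def set_param(self, name, value):
        # (c) behavior: clamp so every setting stays inside the
        # model's space of sounds
        lo, hi = self.ranges[name]
        self.params[name] = min(max(value, lo), hi)

    def render(self, n_samples):
        # (a) the range of sounds the model can emit
        out = []
        step = 2 * math.pi * self.params["freq_hz"] / self.sample_rate
        for _ in range(n_samples):
            out.append(self.params["amp"] * math.sin(self._phase))
            self._phase += step
        return out
```

The clamping in set_param is one possible "handle" design: every reachable parameter setting produces a sound within the specified range.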
The hierarchical structures, contours, parameter
mappings and tyings, and relationships among sound
components embodied in an algorithm provide the
definition and character of a virtual sound source that
the listener can use in their musical listening
strategies. For example, knowing the range of sounds
and behaviors of a model sets the conditions for
expectations and "surprise" that have been so much
the topic of literature on musical listening (Meyer,
1956). Once a listener is familiar with the sound
models that are being used in a composition, they can
be used in the cognitive organization of unfolding
sonic material. This is particularly important in
electroacoustic music, where a shared body of
knowledge about harmony and melody is not
available to help in organizing the listening
experience.
In physical sound modeling (Smith 2002, Cook
2002a), the structural constraints are taken from the
properties of materials such as strings, tubes and
plates. Physical models generally expose parameters
that are intuitive, easy to learn how to control, and
whose effects on the sound are easy to perceive given
the shared knowledge that we all have about the
physical world.
It is not only the physical nature of the constraints
that makes these models work. Indeed, it is commonly
noted that with physical models we can do things that
would be impossible in the real world, such as putting
a vibrato on material thickness. Thus it is not the
physical plausibility per se of these models that makes
them so intuitive and valuable in a musical context. It
is the very fact that there are constraints and structure
in the model that gives us the impression of a well
defined sound source, even if a real physical source
cannot be identified as the sound generator.
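A standard example of this kind of physically constrained model (a classic algorithm, not one discussed in the paper) is the Karplus-Strong plucked string: its delay-line length corresponds to string length and hence pitch, and its damping factor behaves like string loss. Nothing stops a composer from setting these in physically impossible ways, yet the parameters remain intuitive. The function and argument names below are illustrative.

```python
import random

def pluck(freq_hz, duration_s, sample_rate=44100, damping=0.996):
    """Karplus-Strong plucked string. The delay-line length sets the
    pitch (like string length); `damping` acts like string loss, an
    intuitive, physically grounded exposed parameter."""
    n = int(sample_rate / freq_hz)  # delay line length ~ string length
    # initial excitation: a burst of noise, like a pluck
    line = [random.uniform(-1.0, 1.0) for _ in range(n)]
    out = []
    for i in range(int(duration_s * sample_rate)):
        j = i % n
        out.append(line[j])
        # averaging adjacent samples low-pass filters the loop,
        # modeling energy loss in the string
        line[j] = damping * 0.5 * (line[j] + line[(j + 1) % n])
    return out
```

Because `damping` is just a number in the loop, it can be modulated over time in ways no real string allows, which is exactly the freedom the text describes.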
With acoustic modeling, we don't have, or don't
use, pre-made structure, but use only the sound as a
guide to model structure. There are theoretically an
infinite number of model architectures capable of
generating a given set of sounds (though in practice,
it may be difficult to find even one). The challenge is
to find structure - relationships between component
features that give clear character and definition to a
perceived sound source, even without having to hear
the whole range of sounds it can make. If models are
"strong" in this sense, then it should be easy to tell,
for example, whether a given sound comes from a
certain model.
There are several ways that we can build structure
"behind" the surface representation of a sound. One
way is to analyze a sound into a set of dynamic
acoustic (e.g. spectral) features, and then attempt to
reduce the redundancy in our representation using
some variant of Principal or Independent
Components Analysis. One of the objectives of this
approach is to come up with a small number of
parameters that represent a sound example and a class
of sounds in a "neighborhood" of the target sound.
Another goal is to discover a low dimensional set of
parameters that are "perceptually significant".
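A minimal sketch of this redundancy-reduction idea, using PCA via the singular value decomposition (the choice of NumPy and the toy data are assumptions, not from the paper): a matrix of spectral feature frames is projected onto its strongest few components, so each frame is described by a small number of parameters.

```python
import numpy as np

def reduce_spectral_frames(frames, n_components=3):
    """PCA-style redundancy reduction of spectral feature frames
    (rows = time frames, columns = e.g. spectral bins). Returns the
    low-dimensional trajectory, the component basis, and the mean."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    # SVD row space gives the principal components; keep the top few
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]          # (k, n_bins)
    coords = centered @ basis.T        # (n_frames, k) compact parameters
    return coords, basis, mean

# toy data that genuinely lies on a 2-D subspace of a 16-bin space
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 16))
frames = latent @ mixing
coords, basis, mean = reduce_spectral_frames(frames, n_components=2)
recon = coords @ basis + mean          # reconstruct frames from 2 params
```

Here the two recovered parameters summarize each 16-dimensional frame, but, as the next paragraph notes, such automatically discovered axes need not correspond to structure a listener actually hears.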
We cannot expect such automatic redundancy
discovery to always find the structure that we so
easily hear in a sound. The following example
Proceedings ICMC 2004