FOCUS-PLUS-CONTEXT DISPLAYS FOR AUDIO INTERACTION
David Gerhard
University of Regina
Dept. of Computer Science
Dept. of Music
Regina, SK Canada
gerhard@cs.uregina.ca

Jarrod Ellis
University of Regina
Dept. of Computer Science
Regina, SK Canada
ellisjja@cs.uregina.ca
ABSTRACT
We present an audio browsing and editing paradigm that
incorporates the "focus plus context" visual interaction
concept. A traditional waveform is displayed in full, and
an area of focus is dynamically re-calculated to provide
maximum detail in-focus and minimum detail in-context.
The interaction metaphor also simultaneously re-scales a
frequency-domain display, with increased detail available
in both time and frequency domains by means of subsampling and window overlap. Various methods for selecting focus, identifying focus, and transitioning between
the focus and context display areas are presented, and advantages for typical audio interaction applications are discussed.
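The rescaling described above can be illustrated as a piecewise-linear mapping from sample index to screen position: samples inside the focus region receive most of the display width, while the surrounding context is compressed into the remainder. The following is a minimal sketch, not the implementation used in this work; the function and parameter names are our own, and the focus region is assumed not to touch either end of the file.

```python
def sample_to_pixel(n, total_samples, width,
                    focus_start, focus_end, focus_frac=0.6):
    """Map sample index n to a horizontal pixel position, giving
    the focus region [focus_start, focus_end) focus_frac of the
    display width and compressing the context into the rest."""
    focus_px = focus_frac * width            # pixels given to the focus
    context_px = width - focus_px            # pixels left for context
    left = focus_start                       # samples left of the focus
    right = total_samples - focus_end        # samples right of the focus
    # split context pixels proportionally between the two sides
    left_px = context_px * left / (left + right)
    if n < focus_start:                      # compressed left context
        return left_px * n / left
    if n < focus_end:                        # expanded focus region
        return left_px + focus_px * (n - focus_start) / (focus_end - focus_start)
    right_px = context_px - left_px          # compressed right context
    return left_px + focus_px + right_px * (n - focus_end) / right
```

A smooth transition between focus and context, as discussed later in the paper, would replace the linear context segments with a continuous distortion function, but the piecewise-linear form shows the essential detail-in-focus, overview-in-context trade-off.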
1. INTRODUCTION
The standard method of interaction for editing digital audio presents a waveform that can be resized to any scale,
from a sample-level representation (one or more pixels per
sample) to a display of the full waveform. Users interacting with
such an interface may find that, depending on the work
being performed on the waveform, a number of different scales are appropriate. For example, when correcting
localized recording errors such as clicks and pops from
a vinyl recording, the user may need to zoom in to the
sample level. When mixing multiple parts, duplicating,
or re-recording sections, however, a larger scale may be
required. Regardless of the working scale, for anything
longer than a single note or acoustic event, the user loses
the context of the work being done when zooming in to a
reasonable workable resolution. This is closely related to
the problem of interactively navigating large information
spaces in a limited context.
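At the coarse end of this zoom range, many samples map to each pixel column, and the standard rendering keeps the minimum and maximum sample in each column's bin so that transients survive the reduction. A minimal sketch of this reduction follows (our own illustration, not code from any particular editor):

```python
def waveform_peaks(samples, width):
    """Reduce a sequence of samples to `width` (min, max) pairs,
    one per pixel column, for drawing a waveform overview."""
    n = len(samples)
    if n <= width:                       # sample-level zoom: no reduction
        return [(s, s) for s in samples]
    peaks = []
    for col in range(width):
        lo = col * n // width            # first sample in this column's bin
        hi = max(lo + 1, (col + 1) * n // width)
        chunk = samples[lo:hi]
        peaks.append((min(chunk), max(chunk)))
    return peaks
```

Zooming in or out amounts to recomputing these bins over a narrower or wider slice of the file, which is exactly the operation a focus-plus-context display must perform at two scales simultaneously.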
Most audio interaction software separates the global
view of the raw waveform from its local view. This involves multiple separate windows or "panes" to represent
a single track of audio data, one for the local work site and
one for the context or overview. This multiple-window
environment is used in many other applications, and has
been critiqued [2, 5]. Perhaps more problematic in the
audio interaction realm is the loss of context when working with multiple tracks of audio simultaneously.

Figure 1. Audio interaction window in Amadeus. A context pane is available, but it is outside of the user's locus of
attention, and presented at a different scale with no scale
markings.

Most current audio interface programs require the view to be
focused at a consistent point across all tracks, effectively
locking all tracks together and forcing a user to zoom out
to a wider context to jump from one point to another in the
project. Several improvements have been made to facilitate this process, including bookmarks and labels, hot-key
zooming, and complex navigation controls; some programs even allow a user to be localized at a different point
in each of several tracks. These adaptations, however, are primarily attempts to mitigate the difficulty of working at multiple
focus levels in the same document. The user must mentally assimilate these time-based domains, creating and
maintaining a large mental model of the entire project at
high cognitive expense. This can be particularly difficult
when a project contains several portions that are acoustically similar, as is the case when mastering music with
repeating verse-plus-chorus structure. A user may think
she is working on chorus 1 when she is in fact working on
chorus 3, since the visualizations of both choruses look
identical. There is no indication in the user's locus of attention [6] of the overall location of the work-point in the
wider piece.
Figure 1 shows an audio interface window from the