FOCUS-PLUS-CONTEXT DISPLAYS FOR AUDIO INTERACTION
David Gerhard
University of Regina
Dept. of Computer Science
Dept. of Music
Regina, SK Canada
gerhard@cs.uregina.ca

Jarrod Ellis
University of Regina
Dept. of Computer Science
Regina, SK Canada
ellisjja@cs.uregina.ca
ABSTRACT
We present an audio browsing and editing paradigm that
incorporates the "focus plus context" visual interaction
concept. A traditional waveform is displayed in full, and
an area of focus is dynamically re-calculated to provide
maximum detail in-focus and minimum detail in-context.
The interaction metaphor also simultaneously re-scales a
frequency-domain display, with increased detail available
in both time and frequency domains by means of subsampling and window overlap. Various methods for selecting focus, identifying focus, and transitioning between
the focus and context display areas are presented, and advantages for typical audio interaction applications are discussed.
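The rescaling described above can be illustrated as a piecewise-linear mapping from sample index to screen position: samples inside the focus region receive most of the display width, while the surrounding context is compressed into the remainder. The following is a minimal sketch, not the implementation used in this work; the function and parameter names are our own, and the focus region is assumed not to touch either end of the file.

```python
def sample_to_pixel(n, total_samples, width,
                    focus_start, focus_end, focus_frac=0.6):
    """Map sample index n to a horizontal pixel position, giving
    the focus region [focus_start, focus_end) focus_frac of the
    display width and compressing the context into the rest."""
    focus_px = focus_frac * width            # pixels given to the focus
    context_px = width - focus_px            # pixels left for context
    left = focus_start                       # samples left of the focus
    right = total_samples - focus_end        # samples right of the focus
    # split context pixels proportionally between the two sides
    left_px = context_px * left / (left + right)
    if n < focus_start:                      # compressed left context
        return left_px * n / left
    if n < focus_end:                        # expanded focus region
        return left_px + focus_px * (n - focus_start) / (focus_end - focus_start)
    right_px = context_px - left_px          # compressed right context
    return left_px + focus_px + right_px * (n - focus_end) / right
```

A smooth transition between focus and context, as discussed later in the paper, would replace the linear context segments with a continuous distortion function, but the piecewise-linear form shows the essential detail-in-focus, overview-in-context trade-off.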
1. INTRODUCTION
The standard method of interaction for editing digital audio presents a waveform that can be resized to any scale,
from a sample-level representation (one or more pixels per
sample) to a display of the full waveform. Users interacting with
such an interface may find that, depending on the work
being performed on the waveform, a number of different scales are appropriate. For example, when correcting
localized recording errors such as clicks and pops from
a vinyl recording, the user may need to zoom in to the
sample level. When mixing multiple parts, duplicating,
or re-recording sections, however, a larger scale may be
required. Regardless of the working scale, for anything
longer than a single note or acoustic event, the user loses
the context of the work being done when zooming in to a
reasonable workable resolution. This is closely related to
the problem of interactively navigating large information
spaces in a limited context.
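At the coarse end of this zoom range, many samples map to each pixel column, and the standard rendering keeps the minimum and maximum sample in each column's bin so that transients survive the reduction. A minimal sketch of this reduction follows (our own illustration, not code from any particular editor):

```python
def waveform_peaks(samples, width):
    """Reduce a sequence of samples to `width` (min, max) pairs,
    one per pixel column, for drawing a waveform overview."""
    n = len(samples)
    if n <= width:                       # sample-level zoom: no reduction
        return [(s, s) for s in samples]
    peaks = []
    for col in range(width):
        lo = col * n // width            # first sample in this column's bin
        hi = max(lo + 1, (col + 1) * n // width)
        chunk = samples[lo:hi]
        peaks.append((min(chunk), max(chunk)))
    return peaks
```

Zooming in or out amounts to recomputing these bins over a narrower or wider slice of the file, which is exactly the operation a focus-plus-context display must perform at two scales simultaneously.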
Most audio interaction software separates the global
view of the raw waveform from its local view. This involves multiple separate windows or "panes" to represent
a single track of audio data, one for the local work site and
one for the context or overview. This multiple-window
environment is used in many other applications, and has
been critiqued [2, 5]. Perhaps more problematic in the
audio interaction realm is the loss of context when working with multiple tracks of audio simultaneously.

Figure 1. Audio interaction window in Amadeus. A context pane is available, but it is outside of the user's locus of
attention, and presented at a different scale with no scale
markings.

Most current audio interface programs require the view to be
focused at a consistent point across all tracks, effectively
locking all tracks together and forcing a user to zoom out
to a wider context to jump from one point to another in the
project. Several improvements have been made to facilitate this process, including bookmarks and labels, hot-key
zooming, and complex navigation controls; some programs even allow a user to be localized at a different point
in each of several tracks. These adaptations, however, are primarily attempts to mitigate the difficulty of working at multiple
focus levels in the same document. The user must mentally assimilate these time-based domains, creating and
maintaining a large mental model of the entire project at
high cognitive expense. This can be particularly difficult
when a project contains several portions that are acoustically similar, as is the case when mastering music with
repeating verse-plus-chorus structure. A user may think
she is working on chorus 1 when she is in fact working on
chorus 3, since the visualizations of both choruses look
identical. There is no indication in the user's locus of attention [6] of the overall location of the work-point in the
wider piece.
Figure 1 shows an audio interface window from the