Proceedings ICMC|SMC|2014, 14-20 September 2014, Athens, Greece
The SpatDIF library - Concepts and Practical Applications in Audio Software
Jan C. Schacher
Zurich University of the Arts
Institute for Computer Music
and Sound Technology ICST
jan.schacher@zhdk.ch
Chikashi Miyama
University of Music, Cologne
Studio for Electronic Music
me@chikashi.net
Trond Lossius
Bergen Center for Electronic Arts BEK
trond.lossius@bek.no
ABSTRACT
The development of SpatDIF, the Spatial Sound Description Interchange Format, continues with the implementation of concrete software tools. In order to make SpatDIF usable in audio workflows, two types of code implementations are being developed. The first is the C/C++ software library 'libspatdif', whose purpose is to provide a reference implementation of SpatDIF. The class structure of this library and its main components embody the principles derived from the concepts and specification of SpatDIF. The second type of tool consists of specific implementations in audio programming environments, which demonstrate the methods and best-use practices for working with SpatDIF. Two practical scenarios demonstrate the use of an external in MaxMSP and Pure Data, as well as the implementation of the same example in a C++ environment. A short-term goal is the complete implementation of the existing specification within the library. A long-term perspective is to develop additional extensions that will further increase the utility of the SpatDIF format.
1 Introduction
The Spatial Sound Description Interchange Format (SpatDIF) presents a structured syntax for describing spatial audio information, addressing the different tasks involved in
creating and performing spatial sound scenes. The goal of
this approach is to simplify and enhance the methods of
creating spatial sound content and to enable the exchange
of scene descriptions between otherwise incompatible software. SpatDIF proposes a simple and extensible format
as well as best-practice examples for storing and transmitting information about spatial sound scenes. It encourages portability and the exchange of compositions between
venues with different surround audio infrastructures. SpatDIF also fosters collaboration between artists such as composers, musicians, sound installation artists as well as researchers in the fields of acoustics, musicology, and sound
engineering. SpatDIF was developed in a collaborative effort and has evolved over a number of years.
The completion of a first usable version of the specification [9], defining the core descriptors and a few indispensable additional descriptors, was achieved in 2012 and is published in the Computer Music Journal [8]. The community pages as well as all related information can be found at: http://www.spatdif.org. ¹

Copyright: © 2014 Jan C. Schacher et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The SpatDIF specification was informally presented to the spatial sound community at the ICMC in Huddersfield in August 2011 and at a workshop at the TU-Berlin in September 2011. The responses at these meetings suggested an urgent need for a lightweight and easy-to-implement spatial sound scene standard, in contrast to the complex MPEG-4 scene description specification [12]. In addition, several capabilities necessary to make this lightweight standard functional, such as dealing with temporal interpolation of scene descriptors, were introduced.
overview of the state-of-the art in audio spatialisation tools,
please refer to the 2013 article in Computer Music Journal
[8], which also functions as a sort of white paper for the
specifications 0.3 [9].
Since then, one major development in surround audio workflows has been the introduction of the proprietary Dolby Atmos format, which mixes concepts such as sound beds and traditional channel-based panning with object-based real-time panning. Dolby Atmos authoring is achieved using Pro Tools and the Dolby Rendering and Mastering Unit (RMU). The RMU provides the rendering engine for the mix stage and integrates with Pro Tools through the Dolby Atmos Panner plug-in over Ethernet for metadata communication and monitoring. The metadata is stored in the Pro Tools session as plug-in automation [2]. Dolby Atmos was initially developed for cinema, and more recently consumer appliances have been announced as well.
Finally, one toolset deserves mention because it resembles in many ways what the development process described in this paper is aiming at. The SoundScape Renderer by Geier et al. [4] and its XML-based storage format ASDF [3] were developed in the opposite direction, going from concrete software implementations to format definitions. As a consequence, some of the ASDF descriptors are implementation-driven, which makes the format less portable than SpatDIF aspires to be.
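To suggest the flavor of such a description, the following fragment sketches a minimal scene as OSC-style messages, loosely modeled on the core descriptors of the 0.3 specification; the source name and coordinate values are illustrative, not taken from the specification:

```
/spatdif/time 0.0
/spatdif/source/violin/position 1.0 2.0 0.0
/spatdif/time 2.5
/spatdif/source/violin/position -1.0 2.0 0.0
```

Because SpatDIF is defined as a syntax rather than a single encoding, the same scene could equally be stored in a file or streamed as real-time messages.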
1.1 SpatDIF Basics
Since SpatDIF is a syntax rather than a programming interface or file format, it may be represented in any of the cur-

¹ All URIs in this article were last accessed in April 2014.