Proceedings ICMC|SMC|2014, 14-20 September 2014, Athens, Greece
The SpatDIF library - Concepts and Practical Applications in Audio Software
Jan C. Schacher
Zurich University of the Arts
Institute for Computer Music
and Sound Technology ICST
jan.schacher@zhdk.ch
Chikashi Miyama
University of Music, Cologne
Studio for Electronic Music
me@chikashi.net
Trond Lossius
Bergen Center for Electronic Arts BEK
trond.lossius@bek.no
ABSTRACT
The development of SpatDIF, the Spatial Sound Description Interchange Format, continues with the implementation of concrete software tools. In order to make SpatDIF usable in audio workflows, two types of code implementations are being developed. The first is the C/C++ software library 'libspatdif', whose purpose is to provide a reference implementation of SpatDIF. The class structure of this library and its main components embody the principles derived from the concepts and specification of SpatDIF. The second type of tool consists of specific implementations in audio programming environments, which demonstrate the methods and best-use practices for working with SpatDIF. Two practical scenarios demonstrate the use of an external in MaxMSP and Pure Data, as well as the implementation of the same example in a C++ environment. A short-term goal is the complete implementation of the existing specification within the library. A long-term perspective is to develop additional extensions that will further increase the utility of the SpatDIF format.
1 Introduction
The Spatial Sound Description Interchange Format (SpatDIF) presents a structured syntax for describing spatial audio information, addressing the different tasks involved in
creating and performing spatial sound scenes. The goal of
this approach is to simplify and enhance the methods of
creating spatial sound content and to enable the exchange
of scene descriptions between otherwise incompatible software. SpatDIF proposes a simple and extensible format
as well as best-practice examples for storing and transmitting information about spatial sound scenes. It encourages portability and the exchange of compositions between
venues with different surround audio infrastructures. SpatDIF also fosters collaboration between artists such as composers, musicians, sound installation artists as well as researchers in the fields of acoustics, musicology, and sound
engineering. SpatDIF was developed in a collaborative effort and has evolved over a number of years.
The completion of a first usable version of the specification [9], defining the core descriptors and a few indispensable additional descriptors, was achieved in 2012 and is published in the Computer Music Journal [8]. The community pages as well as all related information can be found at: http://www.spatdif.org. ¹

Copyright: © 2014 Jan C. Schacher et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The SpatDIF specification was informally presented to the spatial sound community at the ICMC in Huddersfield in August 2011 and at a workshop at the TU-Berlin in September 2011. The responses at these meetings suggested an urgent need for a lightweight and easy-to-implement spatial sound scene standard, in contrast to the complex MPEG-4 scene description specification [12]. In addition, several capabilities necessary to make this lightweight standard functional, such as dealing with temporal interpolation of scene descriptors, were introduced.
overview of the state-of-the art in audio spatialisation tools,
please refer to the 2013 article in Computer Music Journal
[8], which also functions as a sort of white paper for the
specifications 0.3 [9].
Since then, one major development in surround audio workflows has been the introduction of the proprietary Dolby Atmos format, which mixes concepts such as sound beds and traditional channel-based panning with object-based real-time panning. Dolby Atmos authoring is achieved using Pro Tools and the Dolby Rendering and Mastering Unit (RMU). The RMU provides the rendering engine for the mix stage and integrates with Pro Tools through the Dolby Atmos Panner plug-in over Ethernet for metadata communication and monitoring. The metadata is stored in the Pro Tools session as plug-in automation [2]. Dolby Atmos was initially developed for cinema, and more recently consumer appliances have been announced as well.
Finally, one toolset deserves mention because it resembles in many ways what the development process described in this paper is aiming at. The SoundScape Renderer by Geier et al. [4] and its XML-based storage format ASDF [3] were developed in the opposite direction, going from concrete software implementations to format definitions. As a consequence, some of the ASDF descriptors are implementation-driven, which makes the format less portable than SpatDIF aspires to be.
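To suggest the flavor of such a description, the following fragment sketches a minimal scene as OSC-style messages, loosely modeled on the core descriptors of the 0.3 specification; the source name and coordinate values are illustrative, not taken from the specification:

```
/spatdif/time 0.0
/spatdif/source/violin/position 1.0 2.0 0.0
/spatdif/time 2.5
/spatdif/source/violin/position -1.0 2.0 0.0
```

Because SpatDIF is defined as a syntax rather than a single encoding, the same scene could equally be stored in a file or streamed as real-time messages.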
1.1 SpatDIF Basics
Since SpatDIF is a syntax rather than a programming interface or file format, it may be represented in any of the cur-

¹ All URIs in this article were last accessed in April 2014.