Multi-Participant Interactive Music Services

S. Arnold, D.R. McAuley, K.C. Sharman
Centre for Music Technology
The University of Glasgow
Glasgow G12 8QQ, Scotland
s.arnold@music.gla.ac.uk; d.mcauley@dcs.gla.ac.uk; k.sharman@music.gla.ac.uk

Abstract: The paper presents an initial report on the development of a distributed network for real-time, multi-channel audio processing.

1. Preamble

The Centre for Music Technology (CMT) in the University of Glasgow has recently become a functioning entity after a protracted period of gestation. Its principal hardware resources are a network of 45 NeXTSTEP computers (Motorola and Intel), two audio labs, a research room, two recording and composition studios and an acoustics lab with an anechoic chamber. These resources are distributed across three geographically separated sites.

The CMT is an interdisciplinary centre, conjoining the interests and skills of computer scientists, engineers and musicians. One of its functions is to provide an organizational and resourcing framework for joint degrees involving the participating disciplines, but this report focusses on the first substantial research project on which the centre has embarked. It is a large-scale project, providing research challenges in many areas. In addition to the authors, it is staffed by a lecturer, a research assistant and three postgraduate students. Initial funding came via a "New Initiatives" grant from the University of Glasgow.

2. Introduction

The Parallel Digital Signal Processing (PDSP) Engine is a powerful computing resource currently under development in the Centre for Music Technology at the University of Glasgow. The main function of this resource is to provide a high-performance platform for real-time, multi-channel processing of CD-quality audio signals.
In particular, we aim to develop applications for a virtual recording environment providing all the functions for recording, editing, signal processing and mixdown commonly found in a conventional recording studio. Other functions to be developed will include synthesis, composition, and audio-analysis algorithms.

The PDSP engine will be a networked resource. This will enable remote user interaction and permit multiple, geographically distant users to work on communal recording projects in real time. Simultaneous, but distinct, tasks will also be supported.

Interaction with the system will be via an object-oriented graphical interface with which the user may design a hierarchical block diagram of the signal processing and routing tasks to be performed by the engine. The user interface will include on-screen controls (sliders, knobs and switches) and feedback indicators such as meters and bar graphs. In addition, we intend to develop mechanical input/output devices such as banks of sliders (like a conventional audio mixing desk), optical controllers, MIDI devices, and of course multi-channel analogue and digital audio I/O ports. These devices will be networkable so that they can be used remotely from the PDSP engine.

3. Technical Overview and Research Issues

(a) The PDSP Engine

The PDSP engine consists of multiple TMS320C40 parallel processing nodes hosted on a standard i486 (or higher) PC. We are currently using cards manufactured by Loughborough Sound Images Ltd. Each card contains 4 processing nodes. The PC acts as a server providing the network interface to the parallel processor bank. It also takes care of housekeeping tasks such as loading software modules into the processing nodes. The main digital signal processing tasks are executed on the parallel processing modules attached to the PC's peripheral bus. The TMS320C40 nodes have an architecture and instruction set optimised for digital signal processing tasks.
Each node has local memory (>4 MB) and six high-speed (20 Mbytes per second) direct-memory-access communications links which can operate simultaneously with instruction execution. At peak performance, each node will yield about 50 MFLOPS. We envisage that a system with at least 12 processing nodes would be the minimum required to implement the target tasks.

The communications links provide the connections between the processing nodes, allowing a variety of parallel processor topologies to be constructed. At present, these inter-processor links are fixed at installation time, but we are currently developing an electronic switch which will enable dynamic configuration of the processor topology according to the user's (or an optimising program's) specifications. Several of the processors on the system also have attached peripherals such as analogue converters and large-capacity hard disk systems.

A goal of the project is to develop software modules for efficient parallel implementation of the signal processing tasks. These will include simultaneous multi-channel recording and playback of sound files from disk storage, and algorithms for digital filtering, equalization, pitch shifting, time compression, and other audio effects commonly found in a recording studio. Another target, since the system has a C compiler, is to implement a parallelised version of a music environment such as Csound. Much of this work will need to concentrate on how these algorithms can best be implemented on parallel hardware.

In a typical application, the PDSP engine will be required to implement many different signal processing tasks simultaneously. Questions of how these tasks should be scheduled and placed on the various processing nodes are a key area of this research. These processes must be transparent to the user.

(b) Network Requirements

High-quality interactive music-based services are amongst those with the strictest requirements in terms of granularity of synchronization and quality-of-service constraints. To allow real-time, remote audio processing, the network must be capable of sustaining a bi-directional data rate of at least N x 800 kbit/s for N channels of audio. In a typical recording application there may be 24 channels of audio, implying a network data rate of about 19.2 Mbit/s. Such services are clearly distinguished from the more widely investigated multi-party audio applications such as audio conferencing.
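The data-rate figures quoted above can be checked with a short calculation. The sketch below is illustrative only: the 44.1 kHz, 16-bit parameters are the standard CD audio format, but reading the 800 kbit/s per-channel figure as raw PCM plus some headroom for framing is our assumption, not something the text states. The function and constant names are hypothetical.

```python
# Rough check of the per-channel and aggregate network rates.
# Assumption: "CD quality" means 44.1 kHz sampling at 16 bits per sample.

CD_SAMPLE_RATE_HZ = 44_100
CD_BITS_PER_SAMPLE = 16
PER_CHANNEL_BUDGET_BPS = 800_000  # per-channel figure quoted in the text


def raw_pcm_rate_bps(sample_rate_hz=CD_SAMPLE_RATE_HZ,
                     bits_per_sample=CD_BITS_PER_SAMPLE):
    """Raw PCM bit rate of one mono channel, with no framing overhead."""
    return sample_rate_hz * bits_per_sample


def network_rate_bps(channels, per_channel_bps=PER_CHANNEL_BUDGET_BPS):
    """Aggregate rate, in each direction, for the given number of channels."""
    return channels * per_channel_bps


if __name__ == "__main__":
    raw = raw_pcm_rate_bps()
    print("raw PCM per channel (bit/s):", raw)
    print("headroom within the 800 kbit/s budget (bit/s):",
          PER_CHANNEL_BUDGET_BPS - raw)
    print("24 channels (bit/s):", network_rate_bps(24))
```

With these assumptions one channel of raw PCM needs 705,600 bit/s, so the 800 kbit/s budget leaves a margin per channel, and 24 channels give the 19.2 Mbit/s stated above.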
In particular, simple delay constraints are insufficient for such services: channel phase relationships must also be preserved to maintain spatial imaging.

The chosen implementation uses ATM (Asynchronous Transfer Mode) technology with separate control and data paths, either internally (e.g. Desk Area Network (DAN) architectures) or in the network. This approach, based on "direct connect ATM devices", has been demonstrated to be successful in meeting tight time constraints by removing the operating system and other complex control components from the real-time data path, but it results in a more complex synchronization problem. To date, this has only been solved for comparatively relaxed constraints, where, for example, synchronization of video and audio is performed on "video time scales" (1 frame, or 40 ms).

4. Auto-Compilation and User Tools

Serious consideration is being given to methods and styles of user interaction with the system. We envisage a "permissive" environment, where trade-offs between programming flexibility and ease of use are user-determined. While a graphical representation of the state of the system and graphical tool-kits are a high priority, the need to control many channels simultaneously in real time makes it clear that mouse clicks and the traditional computer keyboard will not, on their own, deliver an adequate user environment.

A variety of MIDI controllers can be linked into the system via the user workstations, and the user environment will allow the graphical linking of the on-screen controller representation to specific variable place holders on the graphical representation of the software device on the PDSP. While system pre-configuration is done on the computer screen, real-time reconfiguration can be effected via MIDI commands (e.g. program changes).
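The linking of controllers to variable place holders described above might be sketched as follows. This is a minimal illustration under our own assumptions, not the project's design: the class names (`Binding`, `ControllerMap`), the device and parameter names, and the idea of storing alternative configurations per program number are all hypothetical. The only external fact relied on is that MIDI control-change values and program numbers are 7-bit (0 to 127).

```python
from dataclasses import dataclass, field


@dataclass
class Binding:
    """Links one MIDI controller to a parameter place holder (hypothetical)."""
    device: str      # name of a software device on the engine
    parameter: str   # name of the variable place holder on that device
    lo: float        # parameter value when the controller sends 0
    hi: float        # parameter value when the controller sends 127

    def scale(self, value: int) -> float:
        """Map a 7-bit controller value linearly onto [lo, hi]."""
        if not 0 <= value <= 127:
            raise ValueError("MIDI controller values are 0..127")
        return self.lo + (self.hi - self.lo) * value / 127


@dataclass
class ControllerMap:
    """Active bindings plus stored configurations selectable by program change."""
    bindings: dict = field(default_factory=dict)  # (channel, controller) -> Binding
    presets: dict = field(default_factory=dict)   # program number -> bindings dict

    def bind(self, channel: int, controller: int, binding: Binding) -> None:
        self.bindings[(channel, controller)] = binding

    def store_preset(self, program: int, bindings: dict) -> None:
        self.presets[program] = dict(bindings)

    def control_change(self, channel: int, controller: int, value: int):
        """Return (device, parameter, scaled value), or None if unbound."""
        binding = self.bindings.get((channel, controller))
        if binding is None:
            return None
        return (binding.device, binding.parameter, binding.scale(value))

    def program_change(self, program: int) -> None:
        """Swap in a stored configuration, i.e. reconfigure in real time."""
        self.bindings = dict(self.presets[program])
```

In use, a slider on channel 0, controller 7 could be bound with `bind(0, 7, Binding("mixer", "ch1_gain", 0.0, 1.0))`; a later `program_change(1)` would then redirect the same physical slider to whatever parameter the stored configuration for program 1 names.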
In the pre-configuration process, objects may represent low-level components such as unit generators, but can also be concatenations of lower-level components into higher-level "devices", such as a graphic equalizer, mixer or computer-music instrument. Since the aim is user transparency, the translation of this specification into lists of processing tasks, and their optimal allocation to processing nodes, is a major component of this project.

5. Bibliography

S. Arnold. A Network for Music Research, Composition and Pedagogy in the University of Glasgow. Proceedings of the 1993 International Computer Music Conference, Tokyo, 1993.

P.R. Barham, M.D. Hayter, D.R. McAuley, I.A. Pratt. Devices on the Desk Area Network. IEEE JSAC, May 1995.

I.M. Leslie, D.R. McAuley, D.L. Tennenhouse. ATM Everywhere? IEEE Network, March 1993.

K. Linton et al. Real-Time Multi-Channel Digital Audio Processing: Scalable Parallel Architectures and Taskforce Scheduling Strategies. ICASSP 1991, Toronto, May 1991.

K. Sharman & E. Breakenridge. Estimation of Signal Parameters Using the Maximum Likelihood Method. IEE Colloquium on Mathematical Aspects of Signal Processing, University of Bristol, 1994.

K. Sharman & A. Esparcia-Alcazar. Genetic Evolution of Symbolic Signal Models. Proceedings of the 2nd International Conference on Natural Algorithms in Signal Processing, 1993.

ICMC Proceedings 1995