MULTICORE TECHNOLOGIES IN JACK AND FAUST

Y. Orlarey, Grame, orlarey@grame.fr
S. Letz, Grame, letz@grame.fr
D. Fober, Grame, fober@grame.fr

ABSTRACT

Two ongoing research projects at Grame aim at providing efficient solutions for audio applications on many/multi-core systems. The first one is Jackdmp, a C++ version of the Jack low-latency audio server for multi-core machines. The second is Faust, a compiled DSP language with automatic parallelization features. This paper briefly describes these two projects, focusing on multi-core related questions.

1. JACK

JACK is a multi-platform low-latency audio server. It can connect a number of different applications to an audio device, as well as allow them to share audio among themselves. The first versions of Jack were based on a sequential activation mechanism finely tuned for mono-core machines, but unable to take advantage of modern multi-core machines. The recent Jackdmp version aims at removing these limitations: the activation system has been changed to a data-flow model, and lock-free programming techniques are used for graph access.

1.1. Natural parallelism

In a Jack-server-like system, there is a natural source of parallelism: Jack clients that depend on the same input can be executed on different processors at the same time. The main requirement is then an activation model that allows the scheduler to correctly activate parallel runnable clients. A graph of Jack clients typically contains sequential and parallel sub-parts (Fig. 1). When parallel sub-graphs exist, their clients can be executed on different processors at the same time.

1.2. Data-flow activation

A data-flow model can be used to describe this kind of system: a node in a data-flow graph becomes runnable when all its inputs are available. Each client uses an activation counter that counts the number of input clients it depends on. The state of client connections is updated each time a connection between ports is made or removed.
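The activation-counter idea can be sketched as follows. This is a minimal illustration under assumed names (`Client`, `Reset`, `Activate` are hypothetical), not the actual Jackdmp code:

```cpp
#include <atomic>

// Minimal sketch of a per-client activation counter in a data-flow
// activation model. Each client records how many input clients it
// depends on; during a cycle, each finished input decrements the
// counter, and the client becomes runnable when it reaches zero.
struct Client {
    int fan_in = 0;                 // number of input clients it depends on
    std::atomic<int> activation{0}; // inputs still missing in this cycle

    // Called at the beginning of each server cycle.
    void Reset() { activation.store(fan_in); }

    // Called by an input client once its output buffer is ready.
    // Returns true exactly when the last dependency has arrived,
    // i.e. the client is now runnable and can be resumed.
    bool Activate() { return activation.fetch_sub(1) == 1; }
};
```

Clients with `fan_in == 0` (e.g. those reading only from the audio driver) are runnable as soon as the cycle starts; the atomic decrement makes it safe for several input clients running on different processors to call `Activate()` concurrently.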
Activation is transferred from client to client during each server cycle as they are executed: a suspended client is resumed, executes itself, propagates activation to its output clients, and goes back to sleep, until all clients have been activated. [1]

[1] The data-flow model still works on mono-processor machines and correctly guarantees a minimal global number of context switches, like the "pre-sorting step" model.

1.3. Pipelining for sequential graphs

Sequential graphs can also benefit from multi-core machines using pipelining techniques. The idea is to divide the audio buffer of size D into N sub-parts and run the entire graph with N buffers of size D/N. For example, with a driver buffer size of 1024 frames and N = 4, the graph is processed with buffers of 1024/4 = 256 frames. With a graph of two clients A and B in sequence, we have the following activation steps for sub-buffer indexes 1 to 4: A[1], A[2]B[1], A[3]B[2], A[4]B[3], B[4]. Running the graph with smaller buffers means more context switches, with their associated cost; thus there is a balance to find between the size of the sub-buffers and the number of available processors. The overall benefit is to lower the worst-case execution ending date in a given cycle, thus possibly executing a more CPU-hungry graph while still avoiding audio xruns.

2. FAUST

Faust is a compiled language for real-time signal processing and synthesis that targets high-performance audio applications and plugins. The programming model of Faust combines a strong mathematical semantics with a very expressive block-diagram syntax. Faust programs are translated by the Faust compiler into C++ programs. The compiler tries to synthesize the most efficient C++ implementation.

Figure 1. Client graph: clients A and B can be executed at the same time, C must wait for the end of A and B, and D must wait for the end of C.
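The pipelined activation steps described in section 1.3 can be enumerated programmatically. The following is a sketch under assumed names (`PipelineSchedule` is a hypothetical helper, not part of the Jackdmp scheduler):

```cpp
#include <string>
#include <vector>

// Enumerate the pipelining schedule for a sequential chain of clients
// named A, B, C, ... For `clients` clients in sequence and `subBuffers`
// sub-buffers, step t runs client i on sub-buffer t - i + 1 whenever
// that index lies in [1, subBuffers]. Each returned string is one step.
std::vector<std::string> PipelineSchedule(int clients, int subBuffers) {
    std::vector<std::string> steps;
    for (int t = 1; t <= clients + subBuffers - 1; ++t) {
        std::string step;
        for (int i = 1; i <= clients; ++i) {
            int buf = t - i + 1;  // sub-buffer processed by client i
            if (buf >= 1 && buf <= subBuffers) {
                step += static_cast<char>('A' + i - 1);
                step += "[" + std::to_string(buf) + "]";
            }
        }
        steps.push_back(step);
    }
    return steps;
}
```

`PipelineSchedule(2, 4)` reproduces the schedule from the text (A[1], A[2]B[1], A[3]B[2], A[4]B[3], B[4]): the whole graph finishes in N + M - 1 = 5 sub-buffer steps rather than N * M, which is where the pipelining gain comes from.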