MULTICORE TECHNOLOGIES IN JACK AND FAUST
Y. Orlarey
Grame
orlarey@grame.fr
S. Letz
Grame
letz@grame.fr
D. Fober
Grame
fober@grame.fr
ABSTRACT
Two ongoing research projects at Grame aim at providing efficient solutions for audio applications on many/multi-core systems. The first one is Jackdmp, a C++ version of the JACK low-latency audio server for multi-core machines. The second is Faust, a compiled DSP language with auto-parallelization features. This paper briefly describes these two projects, focusing on multi-core related questions.
1. JACK
JACK is a multi-platform low-latency audio server. It can connect a number of different applications to an audio device, as well as allowing them to share audio among themselves. The first versions of JACK were based on a sequential activation mechanism finely tuned for mono-core machines, but unable to take advantage of modern multi-core machines. The recent Jackdmp version aims at removing these limitations: the activation system has been changed to a data-flow model, and lock-free programming techniques are used for graph access.
1.1. Natural parallelism
In a JACK-like server system, there is a natural source of parallelism: when JACK clients depend on the same input, they can be executed on different processors at the same time. The main requirement is then to have an activation model that allows the scheduler to correctly activate parallel runnable clients.
1.2. Data-flow activation
A graph of JACK clients typically contains sequential and parallel sub-parts (Fig 1). When parallel sub-graphs exist, clients can be executed on different processors at the same time. A data-flow model can be used to describe this kind of system: a node in a data-flow graph becomes runnable when all its inputs are available. Each client uses an activation counter to count the number of input clients it depends on. The state of client connections is updated each time a connection between ports is made or removed.
Activation is transferred from client to client during each server cycle as they are executed: a suspended client is resumed, executes itself, propagates activation to its output clients, and goes back to sleep, until all clients have been activated.1
1.3. Pipelining for sequential graphs
Sequential graphs can also benefit from multi-core machines using pipelining techniques. The idea is to divide the audio buffer of size D into N sub-parts and run the entire graph with N buffers of size D/N. For example, taking a driver buffer size of 1024 frames, with N = 4 the graph is processed with buffers of 1024/4 = 256 frames. With a graph of two clients A and B in sequence, we have the following activation steps for sub-buffer indexes 1 to 4: A[1], A[2]B[1], A[3]B[2], A[4]B[3], B[4].
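The activation steps above generalize to any chain length: at step s, client c (0-based) works on sub-buffer s - c when that index is valid. The sketch below is illustrative only (not Jackdmp code) and builds the schedule for a chain of M sequential clients and N sub-buffers:

```cpp
#include <string>
#include <vector>

// Build the pipelined activation schedule for a chain of M sequential
// clients (named A, B, C, ...) processed with N sub-buffers. A chain of
// M clients drains the pipeline in M + N - 1 steps.
std::vector<std::string> schedule(int M, int N) {
    std::vector<std::string> steps;
    for (int step = 0; step < M + N - 1; ++step) {
        std::string s;
        for (int c = 0; c < M; ++c) {
            int sub = step - c;          // sub-buffer index for client c
            if (sub >= 0 && sub < N) {   // client c is active at this step
                if (!s.empty()) s += " ";
                s += char('A' + c);
                s += "[" + std::to_string(sub + 1) + "]";
            }
        }
        steps.push_back(s);
    }
    return steps;
}
```

For M = 2 and N = 4 this yields exactly the five steps listed above; once the pipeline is full, all M clients run concurrently on different sub-buffers.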
Running the graph with smaller buffers means more context switches, with their associated cost. Thus a balance must be found between the size of the sub-buffers and the number of available processors. The overall benefit is to lower the worst-case execution end date in a given cycle, thus possibly executing more CPU-hungry graphs while still avoiding audio xruns.
2. FAUST
Faust is a compiled language for real-time signal processing and synthesis that targets high-performance audio applications and plugins. The programming model of Faust combines a strong mathematical semantics with a very expressive block-diagram syntax. Faust programs are translated by the Faust compiler into C++ programs. The compiler tries to synthesize the most efficient C++ implementation.

1 The data-flow model still works on mono-processor machines and will correctly guarantee a minimum global number of context switches, like the "pre-sorting step" model.
Figure 1. Client graph: clients A and B can be executed at the same time, C must wait for the end of A and B, and D must wait for the end of C.