Figure 4. Experimental Design for Latency Measurements. Incoming MIDI signal is routed as an analog pulse
and its transient is compared with the leading edge of the
synthesized waveform of a percussion instrument.
compute the power signal over small audio windows (22
samples or 0.5 ms) and look for abrupt drops (and subsequent recovery) in the power signal. The number of
glitches as the buffer size is varied is shown in Figure 3.
The audio filter detects some glitches that are not indicated as under-runs, and conversely not all under-runs
are detected as glitches. However, generally speaking, the
results are in close agreement, and the Metronome-based
synthesizer is also glitch-free down to 1ms.
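The power-based glitch filter described above can be sketched as follows. This is only an illustrative reconstruction, not the filter used for Figure 3: the class name `GlitchDetector`, the drop and recovery ratios, and the maximum gap length are all assumptions chosen for the example. Only the 22-sample window comes from the text.

```java
import java.util.ArrayList;
import java.util.List;

final class GlitchDetector {
    /** Mean power (mean of squared samples) of each non-overlapping window; a trailing partial window is ignored. */
    static double[] windowPower(float[] samples, int windowSize) {
        int n = samples.length / windowSize;
        double[] power = new double[n];
        for (int w = 0; w < n; w++) {
            double sum = 0.0;
            for (int i = 0; i < windowSize; i++) {
                double s = samples[w * windowSize + i];
                sum += s * s;
            }
            power[w] = sum / windowSize;
        }
        return power;
    }

    /**
     * Returns the indices of windows whose power falls below dropRatio times the
     * previous window's power and then climbs back to at least recoverRatio times
     * that earlier power within maxGap windows. All three thresholds are assumptions.
     */
    static List<Integer> findGlitches(double[] power, double dropRatio, double recoverRatio, int maxGap) {
        List<Integer> glitches = new ArrayList<>();
        for (int w = 1; w < power.length; w++) {
            double before = power[w - 1];
            if (before <= 0.0 || power[w] >= dropRatio * before) {
                continue; // no abrupt drop at this window
            }
            for (int r = w + 1; r <= w + maxGap && r < power.length; r++) {
                if (power[r] >= recoverRatio * before) {
                    glitches.add(w); // drop followed by recovery: count one glitch
                    w = r;           // resume scanning after the recovered window
                    break;
                }
            }
        }
        return glitches;
    }

    public static void main(String[] args) {
        // Hypothetical test signal: 0.1 s of a 2 kHz sine at 44.1 kHz with one 22-sample dropout.
        float[] audio = new float[4410];
        for (int i = 0; i < audio.length; i++) {
            audio[i] = (float) Math.sin(2 * Math.PI * 2000.0 * i / 44100.0);
        }
        for (int i = 2200; i < 2222; i++) {
            audio[i] = 0f;
        }
        System.out.println(findGlitches(windowPower(audio, 22), 0.1, 0.5, 4));
    }
}
```

Requiring a recovery after the drop keeps an ordinary note release, where the power falls and stays low, from being counted as a glitch.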
These results match our informal listening tests: each
audible artifact is also detected as a glitch and corresponds
to a reported under-run. However, for some reported
under-runs and glitches, we are not able to hear them. We
suspect that they are too small to be audible, for example
very few skipped samples.
5.3. Latency and Jitter
Having established that the Java-based synthesizer is able
to produce clean audio running at a period of 1ms, we set
out to determine its absolute end-to-end performance, determine the contribution of various portions of the system
to latency and jitter, and finally, to compare it to an existing hardware synthesizer.
To perform the measurements accurately, we used a
variant of the method developed by our colleague James
Wright [23], as shown in Figure 4. A MIDI signal is
sent from a computer acting as source to a Midi Solutions
"Thru" box, modified to route its input signal out to an
RCA jack as well as to the MIDI OUT port. The MIDI
signal is then routed to the system under test.
We tested both MIDI-to-MIDI and MIDI-to-audio
paths. For MIDI-to-audio, we synthesized a percussive
instrument with a sharp attack. The system's LINE OUT
signal is routed to the right channel of the sound input
of another computer. The RCA output from the modified
Thru box is routed to the left channel. Comparison of the
leading edges of the two signals produces a quite accurate
measurement of the total system latency.
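As a rough sketch of the left/right comparison, the latency can be estimated from the first sample in each channel that exceeds a threshold. The class name `EdgeLatency`, the simple threshold-crossing edge detector, and the example sample rate are assumptions made for illustration; the analysis actually used may differ.

```java
final class EdgeLatency {
    /** Index of the first sample whose magnitude exceeds the threshold, or -1 if there is none. */
    static int leadingEdge(float[] channel, float threshold) {
        for (int i = 0; i < channel.length; i++) {
            if (Math.abs(channel[i]) > threshold) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Latency in milliseconds between the reference pulse (left channel, from the
     * modified Thru box) and the synthesized attack (right channel, from LINE OUT).
     */
    static double latencyMillis(float[] left, float[] right, float threshold, double sampleRate) {
        int ref = leadingEdge(left, threshold);
        int out = leadingEdge(right, threshold);
        if (ref < 0 || out < 0) {
            throw new IllegalArgumentException("no leading edge found in one of the channels");
        }
        return (out - ref) * 1000.0 / sampleRate;
    }

    public static void main(String[] args) {
        // Hypothetical recording: reference edge at sample 100, synthesized attack 240 samples later.
        float[] left = new float[1000];
        float[] right = new float[1000];
        left[100] = 1.0f;
        right[340] = 0.8f;
        System.out.println(latencyMillis(left, right, 0.1f, 48000.0) + " ms");
    }
}
```

The resolution of such a measurement is one sample period of the recording computer, so a higher recording sample rate gives a finer latency estimate.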
For MIDI-to-MIDI latency, we used a program that
simply echoes all MIDI messages from the sound card's
MIDI IN port to the MIDI OUT port. The MIDI OUT port was connected to a second modified MIDI Thru box, and we again used
left/right comparison to measure the total latency.
Configuration         Min     Mean    Max     Std. Dev.
MIDI (C)              0.340   0.347   0.362   0.011
MIDI (Java Sound)     0.385   1.455   3.197   0.701
MIDI (Java/Direct)    0.385   0.406   0.430   0.011
Kurzweil K2000R       2.925   3.909   4.897   0.570
Metro 1ms             4.240   4.959   5.736   0.317
Metro 1ms/FSync       5.396   5.847   6.439   0.308
Metro 365µs/No GC     2.947   3.120   3.310   0.109
Table 1. Latency Summary (all values in milliseconds)
The results are summarized in Table 1. Echoing a 1-byte MIDI message from a C program took about 350 ±
10µs. Since MIDI [20] transmits one byte per 320µs,
this is quite fast: only about 30µs of overhead beyond the transmission time itself. Furthermore, real synthesis will involve mostly 2- and 3-byte MIDI messages, adding at least
640µs to the total latency.
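The per-byte figure follows from the MIDI serial format: 31,250 bits per second, with each byte framed by a start bit and a stop bit, for 10 bits on the wire. A small sketch of the arithmetic, where the class and method names are only illustrative:

```java
final class MidiTiming {
    static final int BAUD = 31_250;      // MIDI 1.0 serial rate in bits per second
    static final int BITS_PER_BYTE = 10; // start bit + 8 data bits + stop bit

    /** Wire time in microseconds for a message of the given length in bytes. */
    static double transmitMicros(int messageBytes) {
        return messageBytes * BITS_PER_BYTE * 1_000_000.0 / BAUD;
    }

    public static void main(String[] args) {
        for (int bytes = 1; bytes <= 3; bytes++) {
            System.out.println(bytes + " byte(s): " + transmitMicros(bytes) + " us");
        }
    }
}
```

A 3-byte message such as Note On therefore occupies the wire for 960µs, which is 640µs more than the 1-byte message used in the echo test.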
We first tried a similar Java program which used the
standard Java Sound [17] APIs to read and write the MIDI
messages, but as can be seen, this produced unacceptable
results with latencies up to 3.2ms. This can probably be
attributed to Java Sound's polling implementation, which
uses Thread.sleep(1) in between polling for new MIDI
messages. As a result, we wrote our own Java MIDI input
layer, which calls the ALSA drivers directly via JNI. This
improved things considerably: the Java program now only
increases worst-case latency by 70µs over C. Java Sound
is still used for MIDI output.
As a point of comparison for end-to-end performance
of our system, we measured the widely used Kurzweil
K2000R synthesizer, which we found to have an average
latency of 3.9ms with 2ms peak jitter.
Running the synthesizer (Harmonicon) on the real-time JVM with a 1ms buffer and the memory load thread
active, the system achieves 5ms latency with 1.5ms peak
jitter: slightly slower but also slightly more stable than
the K2000R. Overall, the two systems are roughly comparable in their achieved performance.
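The peak jitter figures quoted here correspond to the difference between the Max and Min columns of Table 1 (for example, 4.897 - 2.925 ≈ 2ms for the K2000R). The following sketch shows how such a summary could be computed from a series of latency measurements. The class name `LatencyStats` is hypothetical, and the use of the sample standard deviation (dividing by n - 1) is an assumption, since the text does not say which estimator was used.

```java
final class LatencyStats {
    final double min;
    final double mean;
    final double max;
    final double stdDev;

    /** Summarizes latency measurements given in milliseconds. */
    LatencyStats(double[] latenciesMs) {
        if (latenciesMs.length < 2) {
            throw new IllegalArgumentException("need at least two measurements");
        }
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        double sum = 0.0;
        for (double v : latenciesMs) {
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
            sum += v;
        }
        double m = sum / latenciesMs.length;
        double squares = 0.0;
        for (double v : latenciesMs) {
            squares += (v - m) * (v - m);
        }
        min = lo;
        mean = m;
        max = hi;
        // Assumption: sample standard deviation (n - 1 in the denominator).
        stdDev = Math.sqrt(squares / (latenciesMs.length - 1));
    }

    /** Peak jitter, taken here as the spread between the slowest and fastest response. */
    double peakJitter() {
        return max - min;
    }

    public static void main(String[] args) {
        // Hypothetical measurements in milliseconds, not data from the experiments.
        LatencyStats s = new LatencyStats(new double[] {4.3, 4.9, 5.1, 5.6, 4.8});
        System.out.println("mean=" + s.mean + " peakJitter=" + s.peakJitter());
    }
}
```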
We also measured the effect of using forward-synchronous [6] scheduling of the arriving MIDI notes
("FSync"). While this produced a significant reduction
in peak jitter, to 1ms, it also came at the expected expense
of 0.9ms of additional latency.
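The trade-off between jitter and latency can be illustrated with a toy model. This is not the scheduler used in the system: it assumes that forward-synchronous scheduling amounts to timestamping each note on arrival and making it sound a fixed offset later, so that variable processing time is hidden behind a constant delay. The class name, method names, and numbers below are all hypothetical.

```java
final class ForwardSync {
    /** Time at which a note sounds if it is played as soon as processing finishes. */
    static double immediateOutput(double arrivalMs, double processingMs) {
        return arrivalMs + processingMs;
    }

    /**
     * Time at which a note sounds if it is scheduled a fixed offset after arrival.
     * If processing takes longer than the offset, the note is late and sounds when ready.
     */
    static double forwardSyncOutput(double arrivalMs, double processingMs, double offsetMs) {
        return Math.max(arrivalMs + processingMs, arrivalMs + offsetMs);
    }

    public static void main(String[] args) {
        double[] arrivals = {0.0, 10.0, 20.0};  // hypothetical note arrival times (ms)
        double[] processing = {0.2, 0.9, 0.5};  // hypothetical variable processing delays (ms)
        double offset = 1.0;                    // assumed fixed scheduling offset (ms)
        for (int i = 0; i < arrivals.length; i++) {
            double now = immediateOutput(arrivals[i], processing[i]) - arrivals[i];
            double fs = forwardSyncOutput(arrivals[i], processing[i], offset) - arrivals[i];
            System.out.println("immediate latency " + now + " ms, forward-synchronous latency " + fs + " ms");
        }
    }
}
```

In this model every note has the same latency as long as the offset covers the worst-case processing delay, which removes the jitter but raises the mean latency, the same direction of trade-off as in the measurements above.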
Finally, to isolate the impact of garbage collection, we
ran the system with the memory load thread off and a large
enough heap that it never triggered a garbage collection
("No GC"), with the buffer size set to 365µs (16 samples),
at which our previous tests indicated it could run without
buffer under-runs. At this setting, the system improves
significantly: 3.1ms end-to-end latency with only 360µs
peak jitter.
6. RELATED WORK
6.1. Real-time Audio
Several real-time audio frameworks were developed in
Java. They usually resort to a native language like C