Arcsyn: An Expressive And Efficient Additive Synthesis Architecture

Romblom, David

David Romblom
Media Arts and Technology
University of California, Santa Barbara
romblom@gmail.com

ABSTRACT

Arcsyn is an architecture and control system for expressive additive synthesis; it provides satisfying dynamic behavior, compelling transients, and non-static tones. The complexity of additive synthesis is encapsulated within a musically sensitive control system inspired by instrument physics. Timbral information is represented in the frequency domain and can be specified parametrically or drawn from popular formats such as SDIF.

1. INTRODUCTION

Arcsyn is a software synthesis technique that addresses the shortcomings of electronic instruments when compared to acoustic instruments of the classical canon. It is conventional in that it provides the performer with tools to create expressive melodic content and in some ways acts like an acoustic instrument. It is more modern in that the timbre can be arbitrarily defined, that a timbre can "morph" to other arbitrary timbres, that it can be sequenced using "analog" control signals, and that it has novel and exotic modulation capabilities. Arcsyn is meant as a general synthesis method that begins where the subtractive paradigm meets its limitations. As opposed to more exotic synthesis techniques, it is an elegant representation of complexity and uses computational resources efficiently.

Using an Analysis/Synthesis package such as SPEAR [2], one can see a number of interesting aspects in the recording of a given note. In addition to expected features such as vibrato and weaker high-frequency partials, one also observes that all partials have random motion in amplitude and frequency, have highly varied attack and release times, and are generally inharmonic during attacks. When thinking of the many notes that might make up a musical phrase, we are assisted by Strawn's [6] summary of note-to-note pitch transitions.
Arcsyn uses steady-state spectral information and models the transitions and temporal variation. Synthesis is implemented as a bank of bandwidth-enhanced sinusoidal oscillators [1]. The steady-state spectral information can be specified parametrically, taken from an Analysis/Synthesis package such as SPEAR or Loris, or imported from an SDIF [9] library. This allows for both acoustic instrument emulation and instrument-like electronic tones.

For a given timbre, Arcsyn uses both ff and pp steady-state spectral information. One can interpolate freely between these two layers, allowing tones to swell or decay regardless of the onset dynamic. The ability to arbitrarily define the spectral envelope of the pp and ff layers allows non-trivial evolution in playing dynamic. The architecture can be easily extended to include the additional detail of multiple dynamic layers.

Spectral morphing is achieved by adding a second, entirely distinct timbre with its own dynamic information. Here again, it is easy to extend the architecture to include multiple timbres.

The third dimension of interpolation is pitch; new spectral information is specified at each well-tempered note. For acoustic emulation, pitch interpolation preserves instrument formants and yields "acoustically-correct" vibrato and glissandi. For electronic timbres, we gain instrument-like playing characteristics.

2. BEYOND SUBTRACTIVE SYNTHESIS

Subtractive synthesis is a pervasive synthesis architecture in which varied harmonic content is achieved by moving the cutoff frequency and resonance of a resonant lowpass filter. There is a small set of standard waveforms, and the harmonic vocabulary of each waveform is predetermined by the filter's sweep range. This vocabulary is not sufficiently large to be musically satisfying, though non-linear distortion [3] and waveform modulation do help the situation.
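The limited harmonic vocabulary of the subtractive approach can be illustrated with a minimal sketch. It assumes an ideal sawtooth source (k-th harmonic amplitude 1/k) and a generic second-order resonant lowpass magnitude response; neither model is specified in this paper, and the function names are hypothetical. Every reachable spectrum is the fixed source spectrum multiplied by a single filter curve, so individual partials cannot be set independently.

```python
import math

def sawtooth_harmonics(n_harmonics):
    """Relative amplitudes of an ideal sawtooth: the k-th harmonic is 1/k."""
    return [1.0 / k for k in range(1, n_harmonics + 1)]

def lowpass_gain(freq_hz, cutoff_hz, q):
    """Magnitude response of a generic 2-pole resonant lowpass (assumed model)."""
    r = freq_hz / cutoff_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)

def subtractive_spectrum(f0_hz, cutoff_hz, q, n_harmonics=8):
    """Harmonic amplitudes after filtering: source amplitude times filter gain."""
    source = sawtooth_harmonics(n_harmonics)
    return [a * lowpass_gain(k * f0_hz, cutoff_hz, q)
            for k, a in enumerate(source, start=1)]

if __name__ == "__main__":
    # Sweeping the cutoff reshapes the spectrum only along one family of curves.
    for cutoff in (500.0, 1000.0, 2000.0):
        amps = subtractive_spectrum(220.0, cutoff, q=2.0)
        print(cutoff, [round(a, 3) for a in amps])
```

With only the cutoff and resonance as controls, the whole spectrum is described by two numbers per waveform, whereas an additive representation stores an amplitude for each partial.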
Samplers are a close relative of the subtractive synthesizer; they operate by playing back recordings "frozen" at a given pitch and dynamic. For acoustic instrument emulation, glissandi and vibrato are modeled by varying the playback speed; this method shifts the instrument's formant resonances up and down, resulting in a very unnatural sound. A preferable method, one closer to the physics of acoustic instruments, would preserve the instrument formants while the individual partials of the tone move under them [7].

Different recordings are played back depending on the note-on velocity. This presents problems for sustained instruments, where a performer may wish to start softly and then swell: they are limited to the harmonic content of the sample they first played. To address this, modern samplers cross-fade to a second recording, though this can sound unnatural due to phase misalignment and requires detailed, case-by-case control programming.

Stasis in the tone is addressed by modulating the filter cutoff or oscillator pitch with a control signal. These