Proceedings of the International Computer Music Conference 2011, University of Huddersfield, UK, 31 July - 5 August 2011
PVOC KIT: NEW APPLICATIONS OF THE PHASE
VOCODER
Tom Erbe
UC San Diego
Department of Music
9500 Gilman Drive 0099
La Jolla, CA 92037
http://www.soundhack.com
ABSTRACT
This paper describes applications of the phase vocoder
algorithm as developed for the plugin bundle Pvoc Kit.
The plugins in this collection explore pitch shifting,
time stretching and phase modulation of live signals.
Several new techniques using these algorithms have
been developed and will be described.
1. INTRODUCTION AND MOTIVATION
The phase vocoder has been in common use in computer
music for the past 30 years. With tools like AudioSculpt
[1], pfft~, FFTease [7], and many others available, it is
not difficult to access techniques for time stretching,
pitch shifting or more esoteric spectral transformations.
Most of the tools available run as stand-alone
applications, as externals or abstractions for Max/MSP
and Pure Data, or as patches under other computer music
languages such as Kyma, SuperCollider, ChucK or JSyn.
Consequently, there is a very rich toolkit for the
exploration of phase vocoder techniques for those
working in computer music languages. However, there is
much less available to composers, computer musicians
or sound designers who use sound-file centered
applications such as Ableton Live, ProTools or Logic as
their main compositional tool. The motivation for
developing Pvoc Kit is to bring the same richness of
tools to this environment, and at the same time to
expand on the set of known techniques.
Earlier the author developed the Spectral
Shapers plugins, a set of phase vocoder based filters
which for the most part only modulate the amplitude of
each spectral band. This technique is suitable for timbre
modification, detailed filtering and spectral dynamics. In
contrast, the current work explores the phase vocoder's
ability to modulate time, pitch and phase and, by
extension, harmonic content, sustain and ambience.
These processes are implemented as a set of four audio
plugins in the VST, AU and RTAS formats called
+phasemash, +pitchsift, +spiralstretch and +loopool.
This paper will describe the function of the first three
plugins and detail the various sound processing
algorithms developed within each one.
2. PVOC KIT: COMMON FEATURES
All of the plugins use the phase vocoder as described by
Mark Dolson [2]. The input samples are shifted in and
rotated to provide proper phase alignment in preparation
for further pitch or time processing [5]. Blackman and
Kaiser windows are used on the input and output for their
good sideband rejection. The phase vocoder uses an
overlap factor of 8 to minimize the error in instantaneous
frequency estimation.
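As an illustration of why the overlap factor matters, the following is a minimal sketch (not the Pvoc Kit implementation) of the standard phase-difference estimate of instantaneous frequency for a single band. It assumes a Blackman window only, an unrotated frame, and a naive single-bin DFT for brevity; all function names and parameters are hypothetical.

```python
import cmath
import math


def blackman(n_points):
    """Periodic three-term Blackman window of length n_points."""
    return [0.42
            - 0.5 * math.cos(2 * math.pi * i / n_points)
            + 0.08 * math.cos(4 * math.pi * i / n_points)
            for i in range(n_points)]


def dft_bin(frame, k):
    """Naive DFT of one bin; an FFT would be used in practice."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(frame))


def princarg(phase):
    """Wrap a phase value into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def instantaneous_frequency(signal, start, n_points, overlap, k, sample_rate):
    """Estimate the frequency in band k from two frames one hop apart."""
    hop = n_points // overlap  # overlap factor 8 gives hop = n_points / 8
    win = blackman(n_points)
    frame0 = [s * w for s, w in zip(signal[start:start + n_points], win)]
    frame1 = [s * w for s, w in
              zip(signal[start + hop:start + hop + n_points], win)]
    phase0 = cmath.phase(dft_bin(frame0, k))
    phase1 = cmath.phase(dft_bin(frame1, k))
    # Phase advance expected if the component sat exactly on the bin center.
    expected = 2 * math.pi * k * hop / n_points
    deviation = princarg(phase1 - phase0 - expected)
    # Convert the phase deviation to a fractional bin offset, then to Hz.
    bin_offset = deviation * n_points / (2 * math.pi * hop)
    return (k + bin_offset) * sample_rate / n_points


if __name__ == "__main__":
    sr = 44100.0  # assumed sample rate
    tone = [math.sin(2 * math.pi * 1000.0 * i / sr) for i in range(2048)]
    print(instantaneous_frequency(tone, 0, 1024, 8, 23, sr))
```

With a hop of one eighth of the frame, the wrapped phase deviation stays unambiguous for components up to four bins away from the band center, which covers the main lobe of the Blackman window.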
All of the plugins have a user control for the number
of bands. This is a key control when one works with the
phase vocoder, as the tradeoff between frequency
resolution and time resolution is a constant concern. This
control ranges from 8 to 8192 bands (16 to 16384 points
in the STFT), in power-of-two increments. This covers
the sweet spot from 512 to 2048 bands, where time
resolution and frequency resolution are both reasonably good.
However, the range also extends far beyond this, to settings where
the phase vocoder produces extremely smeared time or
extremely distorted frequencies. These ranges will not be
useful to all composers, but as many enjoy pushing the
limits, it seemed appropriate to broaden them. Aside
from this, all the parameters have been scaled to provide
a useful continuity of transformation, and almost all are
internally interpolated.
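The tradeoff can be made concrete by tabulating, for each setting of the band control, the band spacing and the frame length. The sketch below assumes a 44.1 kHz sample rate, which the text does not specify; the function name is hypothetical.

```python
def resolution_table(sample_rate=44100.0):
    """List (bands, STFT points, band spacing in Hz, frame length in ms)
    for every power-of-two band count from 8 to 8192."""
    rows = []
    bands = 8
    while bands <= 8192:
        points = 2 * bands                        # 16 to 16384 STFT points
        spacing_hz = sample_rate / points         # frequency resolution
        frame_ms = 1000.0 * points / sample_rate  # time span of one frame
        rows.append((bands, points, spacing_hz, frame_ms))
        bands *= 2
    return rows


if __name__ == "__main__":
    for bands, points, spacing_hz, frame_ms in resolution_table():
        print(f"{bands:5d} bands  {points:6d} points  "
              f"{spacing_hz:9.2f} Hz  {frame_ms:8.2f} ms")
```

At the low end of the range each band is thousands of hertz wide, while at the high end a single frame spans a substantial fraction of a second, which is where the distorted frequencies and smeared time described above come from.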
3. PITCH SHIFTING AND SIFTING
Figure 1. +pitchsift user interface.
The plugin +pitchsift is the classic phase vocoder based
pitch shifter with a number of enhancements. Pitch
shifting is provided with a large range, 4 octaves up and
4 octaves down. At the far extremes of pitch shifting, the
resultant sound bears little resemblance to the original.
There are controls for octave, cents, and pitchshift, with
pitchshift reading either in semitones or in commonly
used just ratios. Internally the plugin has two synthesis
engines which can be switched. One is based on a bank
of sine wave oscillators, and the other is based on an
inverse FFT. In the sine bank oscillator synthesis
method, amplitude and frequency are linearly interpolated
for each block of samples. With this method, the amount
of computation needed naturally goes up as the number
of bands is increased, and for this reason the partialgate