Proceedings of the International Computer Music Conference 2011, University of Huddersfield, UK, 31 July - 5 August 2011
ANALYSIS-BY-PERFORMANCE: GESTURALLY-CONTROLLED VOICE
SYNTHESIS AS AN INPUT FOR MODELLING OF VIBRATO IN SINGING
Media and Graphics Interdisciplinary Centre
University of British Columbia, Vancouver, BC
Institute for New Media Art Technology
University of Mons, Belgium
In this paper we introduce Analysis-by-Performance, a
new methodology for addressing several signal modelling
issues. This approach studies the gestural behaviour of
a performer, while he/she is imitating a given sound effect with an appropriate digital musical instrument. New
insights observed in the performing gestures eventually
lead to a new sound production model. The Analysis-by-Performance technique is applied to the study of vibrato
in singing. Indeed, for several years the HANDSKETCH
digital instrument has given performers the ability to imitate
vibrato in singing in a highly natural and expressive way.
Results from gestural analysis of the HANDSKETCH practice are presented and a new vibrato model based on glottal flow parameters is proposed.
Like any other model-based engineering application, parametric sound synthesis faces two main issues, related to
modelling and parameter estimation. Modelling typically
implies making assumptions about the sound production
mechanism. Estimating the parameters of the model requires tuning algorithms to match the output of the model
to recorded waveforms in the time and spectral domains.
Then comes the classical design trade-off between modelling and estimation errors: the finer the model is, the
higher the number of parameters to be tuned, and the greater
the chance of failing to estimate these parameters correctly.
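As a rough illustration of parameter estimation by matching model output to a recording, the sketch below fits a deliberately trivial one-sinusoid "model" to a target waveform by exhaustive grid search on a time-domain squared error. Everything in it is an assumption made for illustration (the function names, the toy model, the grids and the error measure); it is not the estimation procedure used in this paper.

```python
import math


def synthesize(freq_hz, amp, n_samples, sample_rate):
    """Toy model (hypothetical): a single sinusoid with two parameters."""
    return [amp * math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def squared_error(a, b):
    """Time-domain squared error between two equal-length signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def estimate_parameters(target, sample_rate, freq_grid, amp_grid):
    """Grid search: keep the (freq, amp) pair whose output best matches target."""
    best = None
    for f in freq_grid:
        for a in amp_grid:
            candidate = synthesize(f, a, len(target), sample_rate)
            err = squared_error(candidate, target)
            if best is None or err < best[0]:
                best = (err, f, a)
    return best[1], best[2], best[0]


# Stand-in for a "recorded" waveform: the toy model at known parameters.
target = synthesize(220.0, 0.5, 400, 8000)
freq_grid = [200 + 5 * k for k in range(9)]   # 200, 205, ..., 240 Hz
amp_grid = [0.1 * k for k in range(1, 11)]    # 0.1, 0.2, ..., 1.0
freq, amp, err = estimate_parameters(target, 8000, freq_grid, amp_grid)
```

With two parameters the search visits 9 x 10 candidates; each additional parameter multiplies the size of the grid, which is one concrete way to read the trade-off described above.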
Voice modelling is a particularly critical case for which
appropriate tuning of the sound production model makes a
huge difference to synthesis results. Indeed, vocal sounds
result from a complex coupling between mechanical and
aerodynamical oscillations. Most of the actual laryngeal behaviour is still very difficult to observe, both on
the vocal apparatus and in recorded audio signals.
Consequently, voice modelling has been addressed in two
very different ways. On the one hand, spectral models and
concatenative synthesis produce high quality sounds but
are usually detached from any refined laryngeal description, making the integration of expressive control difficult.
On the other hand, articulatory synthesis gives full access
to physiological properties but tuning complexity has a
significant impact on the overall synthesis quality.
In this paper we introduce a new approach to signal
modelling, called Analysis-by-Performance. We present
it as the extension of the well-known Analysis-by-Synthesis
technique to a performative level. If a performer can
convincingly imitate a given sound effect on an appropriate digital musical instrument, we assume that observing the performed gestures brings new insights into the understanding of the studied sound effect. These new
insights then eventually lead to proposing a new production
model. We explain this new approach in Section 2.
In order to assess our methodology, we take the modelling of vibrato in singing as a case study. Vibrato is a
sustained vibrating quality of the sound encountered in a
large number of musical instruments. In singing, vibrato
is achieved by a complex laryngeal modulation. Consequently, the fine tuning of vibrato in singing synthesis
suffers from the above-mentioned voice modelling issues.
Section 3 gives a more detailed background on the modelling of vibrato in singing and discusses modelling issues.
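For orientation, a common simplification treats vibrato as a quasi-sinusoidal modulation of the fundamental frequency around a base pitch. The sketch below generates such a contour and renders it with a phase-accumulating sine oscillator. It is only this conventional simplification, not the glottal-flow-based model proposed in this paper; the function names and the rate and extent values are illustrative assumptions.

```python
import math


def vibrato_f0(base_hz, rate_hz, extent_cents, duration_s, frame_rate):
    """F0 contour in Hz, one value per frame, with sinusoidal vibrato.

    extent_cents is the peak deviation from base_hz, in cents.
    """
    n_frames = int(round(duration_s * frame_rate))
    contour = []
    for i in range(n_frames):
        t = i / frame_rate
        cents = extent_cents * math.sin(2.0 * math.pi * rate_hz * t)
        contour.append(base_hz * 2.0 ** (cents / 1200.0))
    return contour


def render_sine(contour, frame_rate, sample_rate):
    """Render the contour with a sine oscillator, accumulating phase so that
    frequency changes do not introduce discontinuities in the waveform."""
    samples_per_frame = sample_rate // frame_rate
    samples = []
    phase = 0.0
    for f0 in contour:
        for _ in range(samples_per_frame):
            samples.append(math.sin(phase))
            phase += 2.0 * math.pi * f0 / sample_rate
    return samples


# Illustrative values only: 220 Hz base pitch, 5.5 Hz rate, 50-cent extent.
contour = vibrato_f0(220.0, 5.5, 50.0, 1.0, 200)
audio = render_sine(contour, 200, 8000)
```

A contour of this kind only describes pitch. It says nothing about how the other voice source parameters vary during vibrato.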
The Analysis-by-Performance concept is motivated by
the natural and expressive vibrato imitations that can be
achieved on the HANDSKETCH musical instrument, a digital device for performing singing synthesis. Although
the HANDSKETCH voice synthesizer is based on a simple
source-filter model, the acquired skills of the performer
allow him/her to produce a high-quality vibrato effect. The
HANDSKETCH musical instrument and its imitative practice are described in Section 4. The impact of
these performing gestures on low-level glottal source parameters is then observed, leading to a new production model
for vibrato in singing, as presented in Section 5.
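To make the term "source-filter model" concrete, the following minimal sketch passes an impulse-train source through a single two-pole resonator standing in for one formant. It is a generic textbook construction under assumed parameter values, with hypothetical function names; it does not describe the actual HANDSKETCH synthesizer, whose details are given in Section 4.

```python
import math


def impulse_train(f0_hz, n_samples, sample_rate):
    """Crude voiced source: one unit impulse per (rounded) pitch period."""
    period = int(round(sample_rate / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2].

    The pole radius is set from the bandwidth and the pole angle from the
    centre frequency.
    """
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    theta = 2.0 * math.pi * centre_hz / sample_rate
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


# Illustrative values only: 200 Hz source, one formant at 700 Hz, 100 Hz wide.
source = impulse_train(200.0, 800, 8000)
voiced = resonator(source, 700.0, 100.0, 8000)
```

In a model of this kind the source and the filter are controlled separately, so a performer's gestures can be mapped onto source parameters without touching the filter.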