Synthesis of the laryngeal source of throat singing using
a 2 x 2-mass model
Ken-Ichi Sakakibara*1, Hiroshi Imagawa*2, Seiji Niimi*3, Naotoshi Osaka*1
kis@brl.ntt.co.jp, imagawa@m.u-tokyo.ac.jp, niimi@iuhw.ac.jp, osaka@brl.ntt.co.jp
*1 NTT Communication Science Laboratories
3-1, Morinosato Wakamiya, Atsugi-shi, Kanagawa, 243-0198, Japan
*2 Department of Speech Physiology, The University of Tokyo
7-3-1, Hongou, Bunkyo-ku, Tokyo, 113-0033, Japan
*3 Speech and Hearing Center, International University of Health and Welfare
2600-6, Kitakanemaru, Ohtawara, 324-0011, Japan
Abstract
Singing voices have various timbres. Throat singing and
some other Asian traditional singing voices have a pressed
timbre that is significantly different from that of the European
classical singing voice. In our previous study on throat
singing, vibration of the false vocal folds as well as
of the vocal folds was observed and was found to be
essential to the pressed timbre. This paper describes
a 2x2-mass model as a physical model, defines an adduction parameterization of its parameters, and presents a
simulation of vocal fold and false vocal fold vibrations in
the larynx. Furthermore, a visual simulator of the laryngeal movements is demonstrated. By using this model,
the vibration patterns of the two different laryngeal voices
in throat singing (the pressed and kargyraa voices) and
the normal pressed voice have been simulated. The results indicate that various timbres can be synthesized
for singing.
1 Introduction
The singing voice has numerous variations of timbre. There are considerable differences, for instance,
between European classical singing voices, such as
bel canto and German lied, and Asian traditional pressed singing voices, such as throat singing,
Japanese Youkyoku, and Korean Pansori.
The laryngeal source is an essential factor in determining the timbre of the singing voice, especially
for pressed quality. In general, the pressed quality is
obtained by excessive adduction of the supraglottal
structure. The laryngeal adjustments in Asian traditional pressed singing differ considerably from those in
European classical singing [4, 5, 7].
Synthesizing such varying timbres in singing
voices requires a flexible laryngeal source model.
A glottal waveform model allows us to control its
parameters to approximate the perception of voice
[6, 8]. On the other hand, a physical model allows
us to control its parameters according to the physical and physiological mechanisms of laryngeal adjustment. Based on physiological observations, we
have constructed a 2x2-mass model, a physical
model devised by attaching a two-mass model for
the false vocal folds to the ordinary two-mass model for
the vocal folds [2, 8].
In this paper, after summarizing the physiological observations in throat singing, we describe the
mechanism of the 2x2-mass model and its adduction
parameterization. We also present a visual simulation
tool for the model. Finally, using the model, we simulate the laryngeal sources of throat singing and the
normal pressed voice.
2 Laryngeal source in throat singing
2.1 Throat singing
Throat singing is a traditional singing style of people who live around the Altai mountains. Khoomei
in Tyva and Khöömij in Mongolia are representative
styles of throat singing. Throat singing is sometimes
called biphonic singing, or overtone singing because
two or more distinct pitches (musical lines) are produced simultaneously in one tone. One is a low sustained fundamental pitch, called a drone, and the
second is a whistle-like harmonic that resonates high
above the drone.
The production of the highly pitched overtone is
mainly due to the pipe resonance of the cavity from
the larynx to the point of articulation in the vocal
tract [1]. On the other hand, the laryngeal voice of
throat singing has a special pressed timbre and supports the generation of the overtone.
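As a rough illustration of the relation between the drone and the emphasized overtone, the following Python sketch lists the harmonics of a drone and picks the one closest to a given cavity resonance. The function names and the numerical values (a 150 Hz drone and a 1500 Hz resonance) are assumptions made for illustration only and are not measurements from this study; the resonance frequency is treated as a given input rather than derived from a vocal tract model.

```python
def harmonic_series(drone_hz, count):
    """Return the first `count` harmonics (integer multiples) of the drone."""
    return [drone_hz * n for n in range(1, count + 1)]


def nearest_harmonic(drone_hz, resonance_hz):
    """Return (harmonic number, frequency) of the harmonic closest to the resonance.

    The resonance frequency is a hypothetical input; no pipe or vocal tract
    model is computed here.
    """
    n = max(1, round(resonance_hz / drone_hz))
    return n, drone_hz * n


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    drone = 150.0
    resonance = 1500.0
    number, frequency = nearest_harmonic(drone, resonance)
    print("harmonics:", harmonic_series(drone, 12))
    print("emphasized harmonic:", number, "at", frequency, "Hz")
```

Moving the assumed resonance while the drone stays fixed selects a different harmonic, which corresponds to the melody line heard above the sustained drone.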
The laryngeal voices of throat singing can be
classified as pressed and kargyraa based on the listener's impression, acoustical characteristics, and the
singer's personal observations on voice production.
The pressed voice is the basic laryngeal voice in
throat singing and is used as the drone. The kargyraa voice
is a very low-pitched voice that falls outside the
modal register.
2.2 False vocal folds
The false vocal folds (ventricular folds) are a pair of
soft and flaccid folds which attach to anterolateral