users, as well as their prospects [Georgaki, 1998a]. To outline the current research on synthesis of the singing voice, we would like to make some remarks on the advantages and status of each synthesizer, along with its applicability in contemporary music:

a) One of the most complete "cognitive" synthesizers is Mu(s)se/Rulsys [Sundberg, 1987; 1989], as it is equipped with many rules that describe classical singing. It lets the user produce his or her own sung phonemes, words and phrases of reasonably good quality, allowing for vocal expressivity [Gael, 1990; Carlsson, 1991; Berndtsson, 1995], and it is also used for the synthesis of polyphonic choral singing. One of its strengths is the capacity to control the fundamental frequency and the formants with more speed and precision than human singers, but we await improvements to its control interface that would make it more accessible to composers and musicians.

b) Other research projects, such as CHANT [Rodet et al., 1985; 1995], have been oriented towards more artistic applications, equipped with a suitable interface and environment, in order to provide composers with software tools or to reconstruct ambivalent voices of the past [Depalle et al., 1994]. More precisely, CHANT [Rodet et al., 1984] is a software program designed for compositional rather than scientific needs, as it stresses the continuity between sound processing and synthesis by filtering synthetic or real voices and instruments. In recent years, research on concatenating the high-quality vowels obtained with CHANT by means of the ABS/OLA³ system (Rodet, 2002) has suggested the concatenation of transition and consonant units.

c) The SPASM/Singer model [Cook, P.
1993] has an advantage over the others because of its control environment: it offers the user fine control over the parameters of the vocal signal that relate directly to vocal pedagogy and the physiology of speech (position of the tongue or lips, shape of the vocal tract, vocal effort, etc.) through a user-friendly interface (the shape of the vocal tract displayed on the screen). Lastly, SPASM/Singer can be controlled by a special interface, Squeeze Vox [Cook, 2000], that brings the control of singing voice synthesis closer to musicians' abilities.

d) Despite the importance of the research projects on the singing voice, most of them are based on the analysis and synthesis of classical singing technique, with some exceptions such as [Tisato 1991; Rodet 1985; Kamarotos et al., 1994], who have studied extra-European vocal techniques (diphonic singing, Tibetan singing, or traditional Greek singing).

e) New projects carried out in recent years at academic institutes propose new approaches to singing synthesis⁴ that are less laborious than the previous ones, and promote score-to-singing synthesis as the future of singing software development [Lomax 1996; Macon 1997; Meron 1999; H.L-Lu 2002; Bonada and Loscos 2003; Yoongmoo 2003]. The first commercial score-to-singing software synthesizers have appeared only recently, since 2003 (VOCALOID⁵ by Yamaha and CANTOR⁶ by Virsyn), and promise the inauguration of a new era.

In any case, all these projects, and especially the models we have discussed more extensively, differ not only in their synthesis technique or in the implemented rules (describing singing) but also in the control interface and the resulting sound (every model has its own particular voice signature).
They also differ in the performability of the model and in its applications in the computer music field (composition, psychoacoustics, vocal training and education, or as a powerful tool for performance). Some acoustic examples will give evidence for these observations.

3. Technical problems in encoding the expressive singing voice

How difficult is it to imitate the human "Psyche"? (Georgaki, 1998). The unique

³ Analysis-by-synthesis / Overlap-Add.
⁴ These new techniques are mostly based on concatenation: sampled synthesis or sinusoidal models with a MIDI output.
⁵ VOCALOID uses Frequency-Domain Singing Articulation Splicing and Shaping, a vocal (singing-voice) synthesizing system developed by Yamaha. With this system, the "singing articulations" (collections of voice snippets, such as phrases, and snippets of vocal expression variations such as vibrato) needed to reproduce vocals are collected from custom-produced recordings of accomplished singers and stored in a database after conversion into the frequency domain.
⁶ Virsyn presented at the Frankfurt Musikmesse Prolight+Sound 2004 a new 8-part vocal synthesizer, CANTOR (Mac/Win) (http://www.kvrvst.com/get/984.html), which lets users enter words in English and play them melodically from a MIDI keyboard in real time. According to the manufacturer, Cantor's Voice editor lets you edit the character of the virtual singer by defining the base spectrum for vowels and consonants. The application also includes a Phoneme editor and offers real-time control over vibrato rate and depth as well as the gender of the singing voice.

Proceedings ICMC 2004