Studio Report of Computer Music Research Group

Naotoshi Osaka, Keiji Hirata, Takafumi Hikichi
Information Science Research Laboratory, NTT Basic Research Laboratories
{osaka@siva, hirata@nefertiti, hikichi@idea}
http://{osaka,hirata,hikichi}/

Abstract

The Sound Representation and Computer Music Research Group at NTT Basic Research Laboratories started computer music research last year. This report describes recent research and other activities, and the facilities of the Laboratories.

1 Introduction

This report introduces the computer music related activities and the facilities of the Sound Representation and Computer Music Research Group at NTT Basic Research Laboratories (NTT BRL). BRL is located in Atsugi, Kanagawa Prefecture, approximately 45 km west of central Tokyo. The group was established in July 1996. The group's mission is 1) to develop sound processing technology which serves to create multimedia content, 2) in a narrower sense, to develop technologies for computer music, and 3) to conduct AI research that takes music as its subject, as one of the many intelligent activities of human beings. The concept common to all of our music research is the "interaction of music and science technology." Purposes 1) and 2) serve as technology for the sake of new music creation. Purpose 3) serves as technology spurred by and derived from the universality inherent in music performance and music structure. The group consists of Naotoshi Osaka (Leader), Keiji Hirata, Ken-Ichi Sakakibara and Takafumi Hikichi.

2 Research activities

2.1 Timbre morphing (N. Osaka and T. Hikichi)

The research for purposes 1) and 2) centers on the timbre control of sound. The objective of this research area is to introduce new technologies and tools for computer music and multimedia art. Our present interest is a sound morphing technique, which is being studied using both a signal model [1] and a physical model [2]. "Morphing" is originally one of the techniques of visual processing.
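As a minimal illustration of the signal-model approach [1], morphing by interpolation of sinusoidal-model parameters can be sketched as follows. This is a hypothetical toy sketch, not the actual system: it assumes the partials of the two sounds have already been matched pairwise and are stationary, and it simply interpolates the matched frequency and amplitude tracks linearly before additive resynthesis.

```python
import numpy as np

def morph_sinusoidal(freqs_a, amps_a, freqs_b, amps_b, alpha, sr=48000, dur=1.0):
    """Synthesize a morph between two (matched) sinusoidal models.

    freqs_*, amps_*: per-partial frequencies (Hz) and amplitudes of
    sounds A and B, already matched pairwise.
    alpha: morphing factor; 0.0 gives sound A, 1.0 gives sound B.
    """
    # Interpolate the model parameters, not the waveforms.
    freqs = (1 - alpha) * np.asarray(freqs_a, float) + alpha * np.asarray(freqs_b, float)
    amps = (1 - alpha) * np.asarray(amps_a, float) + alpha * np.asarray(amps_b, float)
    t = np.arange(int(sr * dur)) / sr
    # Additive resynthesis from the interpolated partials.
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))

# Halfway morph between two stationary three-partial tones.
y = morph_sinusoidal([220, 440, 660], [1.0, 0.5, 0.25],
                     [200, 500, 700], [0.8, 0.6, 0.1], alpha=0.5)
```

In the real system the partial matching itself is the hard part (an optimal correspondence search between partial sets of different sizes), and the parameters are time-varying rather than stationary; the sketch only shows the interpolation step.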
Sound morphing, that is, timbre interpolation, is also being recognized as one of the important sound synthesis techniques in computer music and speech research. We use a sinusoidal model as the signal model, and morphing is done by interpolation of the model's parameters. The algorithm is based on automatic processing, focusing on an optimal correspondence search between two sets of partials whose numbers of members differ. A sound synthesis method based on physical modelling is also expected to generate high-quality sound; timbre morphing based on physical modelling is done by interpolation of the physical parameters. Our present study covers the sounds of struck strings, plucked strings and elastic media. These morphing techniques are applied in some musical pieces, and their performance incorporates morphing on stage.

Until now, mainly sound synthesis algorithms have been studied, but not controllers of sound. Controllers, as well as sound synthesis itself, are very important for creating new musical instruments; without them, applications have been limited to off-line use such as attaching sound to computer graphics. We have therefore started a real-time timbre control project [3], in which all control is performed by biosignals and physical actions. Although the controller itself resembles a system like BioMuse [4], the overall system is completely different. One feature of BioMuse and other similar systems is that control data is transmitted as MIDI signals. In our system, all the human-body-related analog data is reserved until the final stage of mapping to the acoustic data. This is expected to deliver more sophisticated sound control.

2.2 Herbie-kun (K. Hirata)

The research related to purpose 3) is musical information processing as a computer science. Our primary concern along this line is to investigate musical intelligence and to build a practical musical system from a computer science point of view [5]. In more detail, through building a jazz music system that achieves voicing of chord progressions, reharmonizes the skeleton of a voicing, and so on, we are involved in clarifying what musical knowledge supports jazz harmony theory and how it works in the sense of computer science. Our approach employs deductive object-oriented database technology and case-based reasoning; recently, we have implemented part of a prototype system to justify our approach.

3 Related research topics

There are several other research projects at BRL that are relevant to computer music research. The following are some of these surrounding research projects.

* Recognition and learning for mixed-media information (Leader: H. Murase)
The project introduces a new, robust pattern recognition framework to tackle the problems of mixed-media information such as document images, video images, speech, music and other audio data in the real world. In particular, the "segregation of sound from mixed audio data" [6] and the "recognition of musical instruments, musical notes and chords" [7][8] are topics relevant to computer music study.

* Auditory mechanism (Leader: T.
Hirahara)
The project aims at reaching a better understanding of how the human early auditory system processes acoustic stimuli, such as speech, using neurophysiological, psychophysical and computational modelling approaches. One of the research topics is dynamic adaptation in auditory perception [9]. These studies will give hints for inventing key technologies not only in the speech/hearing engineering fields but also in music engineering fields, as well as for creating new musical idioms.

* Speech production mechanism (Leader: M. Honda)
This research project aims at developing an articulatory-based speech information processing system through modelling of human motor control and aero-acoustical processes. The project has developed a reliable measurement system for obtaining articulatory information in human speech production [10]. They have also studied a computational model which converts phoneme-specific speech motor tasks into speech acoustics based on the articulation. These studies are expected not only to open frontiers in the field of speech information processing research, but also to serve as reference works for the study of singing voice synthesis.

* Dialogue understanding (Leader: T. Kawabata)
In order to make natural communication between humans and machines possible, this project conducts research on clarifying the process by which the intention of the speaker is conveyed to the listener in a dialogue. The main concern of this research is spoken language, not written text. In particular, the research includes linguistic and prosodic analysis of the characteristics of spoken language, explanation of the coordination between dialogue participants, and clarification of intention recognition and utterance production [11]. The output of this research would also suggest a mechanism of musical ensemble, especially in improvisation.
Much of the research output which incorporates prosody can be applied to algorithmic composition for musical ensemble, because it is nonverbal communication.

4 Other activities

4.1 Sound database

As stated in Section 2, timbre control and synthesis technology is among the most important technologies in the group. Many kinds of sound data are necessary for sound analysis/synthesis study and as sound materials for computer music composition. Sound database construction is therefore another important activity in our group. We have started collecting the sound materials needed for sound analysis experiments. The sound materials we are recording include:

* musical instrument sounds
* sounds uttered by insects, birds, and animals
* singing sounds
* natural sounds: water, wind, rain, etc.
* various kinds of noise
* sounds of office environments
* sounds in daily life
* sounds of toys

Sounds are basically recorded on 48 kHz stereo DAT and stored on a computer. These data are collected on demand; the present amount of data is approximately 5.4 GB. The data are linked with the speech and dialogue database of NTT BRL.

4.2 Music database

This year, we started to collect music signal data. This task is tied to the speech and music signal coding research group in NTT Human Interface Laboratories. We need music signals that are 1) copyright free (both composition and performance), 2) short in duration (approximately 20 seconds), and 3) standardized. Music played by three to four performers, focusing on chamber music, is being recorded now. Some of the music signals from these pieces may become international standards in music coding research.

4.3 Computer music symposium

We participate in two types of computer music-related activities.

1. Holding regular meetings for cross-fertilization of music information science and computer music concerts
We support a joint meeting among SIGMUS (the Information Processing Society of Japan, Special Interest Group on Music), which hosted ICMC'93 in Tokyo, the Acoustical Society of Japan, and the Institute of Electronics, Information and Communication Engineers. We were involved in a two-day meeting on February 20-21 of this year, held at the NTT Atsugi R&D Center; a computer music concert was held on the evening of the first day. Another meeting of the same kind is being planned for next February, and we plan to hold such meetings regularly.

2. Computer music symposium
We plan to hold a first symposium at Mielparque Tokyo and abc Hall, Shibakoen, Tokyo, on November 13, 1997.
In the symposium, the computer music related research activities at NTT BRL and their application to music will be introduced, and the strategy for music research at BRL will be presented. Max Mathews (CCRMA, Stanford) and Joji Yuasa (composer) will join us as invited guest speakers.

4.4 The relation to ICC (InterCommunication Center)

NTT ICC (http://www.ntticc. ) is located in Tokyo Opera City, Shinjuku, Tokyo. Tokyo Opera City is an international theatre city which contains, besides ICC, a new national theatre where opera, ballet, contemporary drama and contemporary dance are performed; a concert hall large enough for orchestral music; and an art museum. ICC opened on April 19, 1997. The new national theatre, the concert hall, and the art museum will open in October 1997, September 1997, and March 1999, respectively. ICC is a museum promoting dialogue between science technology and art based on the fundamental theme of communication. Its activities consist of exhibitions, performances, workshops and symposiums. Computer music is also within its scope; however, no specific events related to computer music are running at present. ICC does not have the function of an institute and does not carry out any research activity. Instead, NTT Research Laboratories has cooperated in several ICC-organized events. In the near future, our group is expected to have a strong connection with ICC, providing technical cooperation in ICC-organized events, tutorials and some of the plans we want to realize.

5 Facilities

Our facility includes a multimedia auditorium, where the meeting introduced in Section 4.3 was held, a music studio, and other booths such as an anechoic room and four listening/recording rooms. The booths are shared with auditory perception researchers.

5.1 Studios

Most of the facility, except the computers and their network, is in the Human Information Science Research Complex. It has a floor area of 1389 m², a main span of 7.2 m, and a maximum height of 9.82 m.

In this experimental building, the facility is divided into two research zones: brain magnetic field research, and acoustic research including hearing, speech, dialogue and (computer) music research. Each zone has research-specific chambers or booths with a control room and one or more computer rooms. In the acoustic zone, a special duct system has been installed on the second floor to provide quiet air conditioning, and all experimental rooms have a double-shell structure. The anechoic room is 5 m³, and the inverse square law holds well in it. Four of the sound booths are designed as dead rooms. The music studio contains a Soundcraft 6000 mixer (36ch in/24ch out), a Mark Levinson amplifier (monaural No. 20.6L) and a JBL M9500 speaker system. A JBL or Bose speaker array system is also available.

5.2 Computer system

For computing, a network of fast arithmetic computers based on the HP 9000 K260 series plays the most important role in sound synthesis study. An Argoss 9410 is used as the file server for the sound database. A Sun UltraSPARC 3000 is used as a parallel music information processing workstation. For music performance, personal computers or equivalent-class machines with DSPs are prepared: three (NeXT + ISPW) systems, and a PC with a DSP (C40) for the real-time conversion of biosignals and physical actions into sound. For personal use, researchers choose their machines according to their own taste; for example, Indy, Power Mac, HP735, HP712, etc.

6 The Future

At present, the sound synthesis approach and the music information processing approach are not linked, but effort will continuously be made to apply the output of the music information processing approach as a tool for creating new art. Both the sound and music databases are planned to be extended continuously on demand.
As an application of our research output, we will use it not only as an aid or idiom for contemporary computer music composition and stage performance, but also as acoustics or music for broadcast programs, movies, dramas, dance, etc. in more commercially-related fields. Although the computer music research group is limited in size, it is expected to cooperate with the surrounding researchers and bear fruitful outcomes.

7 Acknowledgments

The authors would like to express their gratitude to Dr. Kenichiro Ishii for his enthusiastic support.

References

[1] Osaka, N., "Timbre interpolation of sounds using a sinusoidal model," Proc. of ICMC 95, pp. 408-411, Banff, 1995.

[2] Hikichi, T. and Osaka, N., "Morphing of sounds of the struck strings, plucked strings, and elastic media," Tech. Rep. of IEICE, SP96-111, pp. 23-28, Feb. 1997 (in Japanese).

[3] Osaka, N. and Hikichi, T., "Auralization of physical actions and biosignal," Proc. of ASVA 97 (International symposium on simulation, visualization and auralization for acoustic research and education), pp. 402-406, Tokyo, 1997.

[4] Tanaka, A., "Musical technical issues in using interactive instrument technology with application to the BioMuse," Proc. of ICMC 93, pp. 124-126, Tokyo, 1993.

[5] Hirata, K., "Representation of jazz piano knowledge using a deductive object-oriented approach," Proc. of ICMC 96, pp. 244-247, Hong Kong, 1996.

[6] Nakatani, T., Goto, M. and Okuno, H. G., "Localization by harmonic structure and its application to harmonic sound stream segregation," Proc. of ICASSP-96, Vol. II, pp. 653-656, May 1996.

[7] Kashino, K. et al., "Organization of hierarchical perceptual sounds: music scene analysis with autonomous processing modules and a quantitative information integration mechanism," Proc. of IJCAI-95, 1, pp. 158-164, Aug. 1995.
[8] Kashino, K. and Murase, H., "A Music Stream Segregation System Based on Adaptive Multi-Agents," Proc. of IJCAI-97, Aug. 1997.

[9] Kashino, M., "Adaptation in sound localization revealed by auditory after-effects," Proc. 11th Int'l Symposium on Hearing (in press), 1997.

[10] Kaburagi, T. and Honda, M., "A model of articulator trajectory formation based on the motor tasks of vocal-tract shapes," J. Acoust. Soc. Am., 99 (5), May 1996.

[11] Kawamori, M., Kawabata, T. and Shimazu, A., "A phonological study on Japanese discourse markers," TR IEICE, SP95-83, pp. 13-20, 1995.