Network Audio Performance and Installation

Atau Tanaka
NetFIVE Ltd., Tokyo, Japan
atau@ccrma.stanford.edu

Abstract

This paper describes a series of performance works in the area of network music. The work presented spans a period of time and a range of different infrastructures and technologies. As network technologies continue to evolve, we can draw from these experiences insight into musical issues surrounding the practice of music on networks.

1. Introduction

The work presented represents a period of five years, 1994-99. The network infrastructures used include ISDN telephony and the Internet. Transmitted data includes control data, audio, video, and graphics. Despite this range of different approaches, musically we have held to a specific model: concert performance. The projects were live performances connecting remote locations where the network connection arrived onstage. Although network topologies afford the possibility of a broader conceptual re-thinking of musical presentation [1], a deliberate choice was made to preserve the established model of concert performance and to investigate the implications of applying network technology in this context. Three types of connection methods are discussed here: point-to-point audio/video, IP based control data, and IP based streaming audio.

2. ISDN Videoconferencing

The first body of work consists of a series of concerts using videoconferencing technology over ISDN telephone lines. Data bandwidth was typically 128kbps, over which the H.320 protocol [2] was employed. Off-the-shelf videoconferencing hardware and software were used, typically to create point-to-point connections. Concerts realized in this manner include connections between Paris-New York, Madrid-Tokyo, Tokyo-Paris, Tokyo-New York, Tokyo-Dijon, Barcelona-Amsterdam, and Barcelona-Montreal.
These events used bi-directional transmission of compressed audio and video signals between two points (fig. 1). In several cases there were exceptions: twice, three-point connections were established (Japan-France-Australia and Quebec-Quebec-Holland), and on two other occasions an additional MIDI data line was used (Tokyo-Brooklyn, Barcelona-Montreal).

Although it can be argued that a two-point connection does not constitute a network, important lessons in remote musical performance were gained in these experiences. They include issues of delay, synchronization, and musical congruence. It was observed that two types of data delay exist: codec processing delay and network transmission delay. Codec processing delay is constant, and dependent on processor speed. Transmission delay on a dedicated line connection such as ISDN telephony is quite consistent once the connection is established, but varies with each connection. This changes considerably with a migration towards IP networks.

One feature of the integrated videoconferencing systems used in these concerts was the synchronization of image and audio transmission. In other areas, it became quite clear that this off-the-shelf equipment was not designed for musical applications. Careful adjustment of certain encoding and transmission parameters, however, maximized the systems' musicality.

3. W Client/Server

Another series of works was realized using the Max environment [3] and the W Protocol [4]. In this series of performances, called LaserOscillators, neither audio nor video data was transmitted. Instead, Max control messages were transmitted over a TCP/IP connection to multiple sites. Recent concerts have included Montreal-Utrecht-Stockholm and Lisbon-Utrecht-Tokyo. The W Protocol implements a client/server system akin to a chat system.
The client runs as a Max object in a patch at each concert site, with the server program running under Unix. Each client connects to the server and joins a channel (chat room). Data from each client is transmitted to the server, which then relays the data to all connected clients. Time delay is different for each client-server connection, depending on the network "distance" and congestion from each client to the server. TCP transmission packetizes the data, and this was observed as bursts and clustering of musical control data. As all transmitted data was control information, sound synthesis is done locally. Thus a remote player's actions are represented via control signals, and synthesized locally with respect to his partner (fig. 2). Within the Max program, a text typing interface was created to allow verbal communication among the performers. This verbal data was interleaved into the musical control stream.

ICMC Proceedings 1999 - 519 -

[Fig. 1. One site in an ISDN concert.]

4. MP3 Streaming

The third type of network concert made use of live audio streaming in MPEG 1 Layer 3 format. A concert was realized connecting Paris-Hamburg-Vienna-Tokyo. At each site were installed both an MP3 streaming encoder/server [5] and multiple MP3 clients. The servers stood by, ready to serve streams upon receipt of requests from clients. Once a stream was initiated, the music from that stage was served to the remote site. Each site received streams from each of the other sites.

Time delay and synchronization in this case become a multi-faceted problem. In addition to encoder latency and network transmission time, there was the added layer of server latency. The encoder formats incoming audio to MP3 format, and redirects the data to the server. In most cases this is simply another process running on the same machine, but the encoder and server could be two separate machines in two separate network locations. The server relays the stream received from the encoder. At each request from a client, it serves a new stream. It is unclear what the time offset between different parallel streams is. On the client end is another layer of latency: the client typically buffers up to 28KB of data before it starts playing.
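The client-side buffering mentioned above translates into a playback delay that depends on the stream bitrate. The following Python sketch is only a rough worked example: it assumes a constant-bitrate stream, reads "28KB" as 28 x 1024 bytes, and uses hypothetical bitrates, since the bitrates used in the concert are not stated here.

```python
def buffer_latency_seconds(buffer_bytes: int, bitrate_bps: int) -> float:
    """Time needed to fill a client buffer from a constant-bitrate stream."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    return buffer_bytes * 8 / bitrate_bps


# "28KB" from the text, interpreted as 28 KiB (an assumption).
CLIENT_BUFFER_BYTES = 28 * 1024

# Hypothetical stream bitrates, chosen only for illustration.
for kbps in (56, 128):
    latency = buffer_latency_seconds(CLIENT_BUFFER_BYTES, kbps * 1000)
    print(f"{kbps} kbps -> {latency:.2f} s of audio buffered before playback")
```

Under these assumptions the client buffer alone would account for a delay on the order of seconds, before encoder, server, and network latency are added.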
In this concert, separate channels were established for audio, visual and verbal communication. Webcams provided visual indication of what was happening at each site, and Internet Relay Chat [6] provided a means for written communication between stages. The fact that these were separate processes happening on independent streams had interesting consequences. The most apparent was that the visual images were not synchronized to the audio stream.

5. Discussion

Performing music on these three different systems yielded vastly differing musical results, much akin to playing music on different musical instruments. The most radical case is LaserOscillators, the performance using the W client/server. As the transmitted information was reduced to just data, the choice of sound materials was also reduced, to pure sine waves. The control signals passed via the W server were frequency controls for the single oscillator representing each site, detunings of which caused beating patterns creating acoustic resonances at each concert site. The choice was not due to technical limitations, but was rather a musical decision based on the nature of the medium.

Despite the differences among these three approaches, some common elements emerge. The most compelling characteristic of network based performance is time delay. The instinctive reaction of a musician is to try to improve the system by minimizing this delay. One can argue, however, that this is a misplaced motivation. The time delay can instead be regarded as the "acoustic" of the network, much in the way the reverberation time in a cathedral differs from that of a jazz club. Musically accommodating this delay may lead to the creation of a music that is idiomatic to the medium [7].

[Fig. 3. Global String (Kasper Toeplitz, Atau Tanaka): system diagram.]

The fact that delays are often different along each leg of the connection means that the relative synchronization of musical elements differs at each site, particularly when dealing with multipoint connections. This leads to an interesting phenomenon of non-congruous simultaneity. Whereas the real time nature of the performance implies one music, the result is heard quite differently at each site, creating the interesting musical dynamic of a single music with simultaneous multiple interpretations.

The different types of connection contributed insight into the role of visual communication among the performers. With regard to image quality, fluidity took primacy over image resolution. The image component's contribution was effectively nullified unless the image was synchronized with the audio. These are concerns dealing with the simple case of direct visual communication extending the notion of eye-to-eye contact in performance. Other, more abstract applications of visual imagery include a performance using a live network score in which this author took part [8], as well as visual abstractions of musical gesture [9].

Combinations of the different techniques above yielded interesting results.
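The non-congruous simultaneity discussed above can be illustrated with a small Python sketch. It is a toy model, not a description of any of the concert systems: the site names, the delay values, and the helper `heard_order` are all invented for illustration, and each site is assumed to hear its own events with no delay.

```python
# One-way delays in seconds, DELAY[source][listener].
# All values are invented; real delays need not be symmetric.
DELAY = {
    "A": {"A": 0.0, "B": 0.30, "C": 0.80},
    "B": {"A": 0.30, "B": 0.0, "C": 0.20},
    "C": {"A": 0.80, "B": 0.20, "C": 0.0},
}

# Each event is (time played at its source, source site).
EVENTS = [(0.0, "A"), (0.25, "B"), (0.5, "C")]


def heard_order(listener, events, delay):
    """Return the source sites in the order their events arrive at `listener`."""
    arrivals = [(played + delay[src][listener], src) for played, src in events]
    return [src for _, src in sorted(arrivals)]


for site in DELAY:
    print(site, "hears", " -> ".join(heard_order(site, EVENTS, DELAY)))
```

With these invented delays the same three events arrive in a different order at each site, which is the sense in which one performance yields several simultaneous versions of the music.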
To realize a three-point ISDN connection, the hub architecture native to the videoconferencing system was not musically appropriate: rather than mixing the audio from the multiple sites, the system implemented an algorithm that favored the site that was broadcasting the loudest. Instead, in the three-way ISDN concerts, we assigned one of the sites to be a hub. At this site were two independent ISDN systems, each making a separate connection with one of the other sites. The local audio and video were then mixed with the incoming audio/video from one site to be transmitted to the other. This added an extra layer of complication to the time delay of reception at the endpoints.

We also experimented with the use of control data (in the form of MIDI) alongside ISDN audio/video. As the data was sent over separate independent channels, there was no guarantee of synchronization. However, the tangible nature of seeing remote MIDI events manifested locally (especially on an instrument like a Disklavier MIDI piano) while seeing that event articulated over video gave the listener a real perception of distance. Hearing that same gesture over the audio connection created interesting visual/aural/MIDI echoes.

Communication between the performers outside the performance itself is essential. In the videoconferencing connections, a significant amount of connection time is spent at the beginning and end in greetings and confirmations. In the W or MP3 connections, this is replaced by the typing/chat interface. In the case of the W connection, the patch used interleaved the typing in the musical control stream. This meant that if the connection went down, it was dropped for verbal as well as musical communication. In the MP3 connection, since the audio streamer was independent of the chat software, troubleshooting was facilitated so long as the net connection itself stayed up. In all cases, the ultimate safeguard was to have a standard telephone connection reserved for verbal communication alongside the musical connection.

6. Current work

Current work is branching out into several directions: higher bandwidths, new streaming software, and non-performative applications. The Audio Engineering Society (AES) has published a whitepaper [10] outlining future possibilities for audio transmission on Internet2. Custom MP3 encoder/decoders are being programmed [11], including streaming audio objects for MSP. UDP offers interesting advantages in musical data transmission that decrease packetized data clumping.

New projects include applications in interactive installations and distributed performance environments. Global String (fig. 3) is a network based musical instrument with physical components based at each end point [12]. Server based audio synthesis is affected by visitor data from sensors, and is streamed back to each site.
A separate channel implements IP based videoconferencing and data visualization. Netradio Feedback is a shared performance space, organized like a distributed radio studio. Audio harvested from radio broadcasts and webcrawlers is remixed and re-transmitted between sites as well as to the webspace at large. Lessons culled from the previous concerts are being applied in these new projects.

References

1. GNUsic: An Open Studio on the Network for Electronic Musicians. http://www.gnusic.net
2. Recommendation H.320 (07/97) - Narrowband visual telephone systems and terminal equipment. http://www.itu.int/itudoc/itu-t/rec/h/h320.html
3. Miller Puckette, David Zicarelli. MAX - An Interactive Graphic Programming Environment. Opcode Systems, 1990.
4. David Zicarelli. The W Protocol: A System for Collaborative MAX Patches on the Internet, 1996.
5. Shoutcast Streaming Server. http://www.shoutcast.com/info.html
6. Internet Relay Chat. http://www.newircusers.com
7. Atau Tanaka. Netmusic - a Perspective. Catalogue, Festival du Web, Webbar, Paris, 1999.
8. Finding Time. http://www.turbulence.org/Works/ftime/
9. Vibeke Sorensen, Miller Puckette, Rand Steiger. Global Visual-Music. http://www.visualmusic.org/gvm/gvmjs.htm
10. AES Technology Report TC-NAS 98/1: Networking, Audio, and Music Using Internet2 and Next-Generation Internet Capabilities. 1998.
11. Atau Tanaka. Network Music Laboratory Research Project Report. Keio University internal document, 1999.
12. Atau Tanaka, Kasper Toeplitz. Global String. http://po.ntticc.or.jp/~sband/atau/string

Acknowledgements

PictureTel, Fujita Sanken, France Telecom
Japan - Keio University SFC, IAMAS, Cyberia, CyberOz, Milk
Europe - Webbar, Atheneum, IECA Festival Aye Aye, TransEurope Halles, V2, Sonar, Art Futura
U.S.A. - HERE, Dumbo
Canada - Centre Méduse, Studio 303, FCMM

Participating artists - Les Virtualistes, Pita, Farmers Manual, Sensorband, Camel Zekri, Sheldon Steiger, Koji Asano, Gil K, Kasper Toeplitz, Srnrbait, Masami Sakaide, Koichi Makigami, Zack Settel, BuyoBuyo Igor, Tom Mays, Erik Montgomery, Eve Beglarian, Danny Blume, Takahiko Suzuki, Kenji Yasaka, Shunichiro Okada, Tito.