TAPPING INTO THE INTERNET AS AN ACOUSTICAL AND MUSICAL MEDIUM

Chris Chafe
Center for Computer Research in Music and Acoustics (CCRMA)
Stanford University

The fantastic world described by Francis Bacon in his New Atlantis imagined elements that are one and the same with computer music developments of the last half century. "We have also sound-houses, where we practise and demonstrate all sounds and their generation. We have harmony which you have not, of quarter-sounds and lesser slides of sounds. Divers instruments of music likewise to you unknown,..." [1] (Bacon, 1626) The passage's closing line forecasts worldwide sound: "We have all means to convey sounds in trunks and pipes, in strange lines and distances."

Electronics first realized this prediction with analog telephony and then radio over a century ago. But from our present vantage point of rapidly expanding digital networks and media streaming, a compelling new reading emerges promising something much more. Internet2, Geant2, Cernet2, and their peers are scaling up to astonishing capacities. Where demonstrated realtime, interactive, uncompressed flows are currently in the "centi-flow" range for high-definition audio and music collaboration, we can expect (soon) to significantly reinterpret Bacon's vision with the emergence of a new medium that boasts several orders of magnitude greater number of interactive channels.

Music performance in prototype "acoustical chat rooms" already achieves a sense of co-location beyond traditional teleconferencing. Improved collaboration schemes will transcend the present state-of-the-art from "almost like being there" to "better than being there." New forms of presence will couple upgrades in raw network power and media fidelity with research in perception, synthesis and prediction.

Take the medium itself: just like in air, sound waves traveling between hosts on the Internet can bounce off edges, boundaries and obstacles.
These reflections give rise to a configurable sound world of rooms with enclosing walls which contain networked and network objects that vibrate and produce sound. The world can be entered from anywhere in the physical world with a sufficiently fast Internet connection. Within a region, the relatively short time delays across next-generation internets make the apparent acoustical separation musically plausible.

These path delays themselves can also be used to constitute network sound objects in a new breed of synthetic, distributed musical instruments. Recirculating echoes create instrument tones whose pitches fall in the musical range if the echoes return fast enough. One can, in fact, "play the network" as a waveguide instrument simulation stretched between San Francisco and Los Angeles and obtain a mid-range pitch.

One application uses these tones to monitor the quality of a network connection intuitively by ear. Just as someone might clap to get a sense of the size of a darkened room or knock on an object to know its rigidity, network users can sound their connections and listen to the vibrations that result. By plucking a "network guitar," the network's quality of service (QoS) becomes audible. Network delay itself provides the string's "memory" in a physical model whose pitch is a function of the time between two sites. The longer the sound takes to make it back, the lower the resulting pitch. And the more constant the tone, the better the QoS and the closer it is to ideal. An "audio ping" in this form monitors QoS in real time and at a finer granularity than the traditional network "ping" utility.

The tones often exhibit an unusual pitch wavering due to changes in the speed of sound over the network. Sound propagation in the network differs from sound in air, along stretched strings or through other familiar media. Among its unique aspects are jittery arrival times of sound packet data and speed asymmetries in opposite directions over a given path.
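The relationship between delay and pitch described for the "network guitar" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in the work described here: it assumes the string's loop time equals the network round-trip time (so the fundamental is roughly 1/RTT), it substitutes a local Karplus-Strong style delay line for the real network path, and the 48 kHz sample rate, the function names and the 10 ms round trip are all hypothetical.

```python
import math
import random

SAMPLE_RATE = 48_000  # assumed audio sample rate in Hz (hypothetical)


def pitch_from_rtt(rtt_seconds):
    """Approximate fundamental (Hz) of an echo that recirculates once per round trip."""
    return 1.0 / rtt_seconds


def pluck(rtt_seconds, duration_seconds=0.5, damping=0.996, seed=0):
    """Karplus-Strong style pluck; a local delay line stands in for the network path."""
    rng = random.Random(seed)
    delay = max(2, round(rtt_seconds * SAMPLE_RATE))  # loop length in samples
    line = [rng.uniform(-1.0, 1.0) for _ in range(delay)]  # noise burst = the pluck
    out = []
    for n in range(int(duration_seconds * SAMPLE_RATE)):
        i = n % delay
        sample = line[i]
        out.append(sample)
        # Average with the next sample to be read and write the result back:
        # a gentle low-pass filter inside the recirculating loop.
        line[i] = damping * 0.5 * (sample + line[(i + 1) % delay])
    return out


def wavering_in_cents(rtt_samples):
    """Pitch spread, in cents, implied by a series of round-trip times (jitter)."""
    pitches = [pitch_from_rtt(rtt) for rtt in rtt_samples]
    return 1200.0 * math.log2(max(pitches) / min(pitches))


if __name__ == "__main__":
    rtt = 0.010  # hypothetical 10 ms round trip, not a measured value
    tone = pluck(rtt)
    print(f"approx. pitch: {pitch_from_rtt(rtt):.1f} Hz, {len(tone)} samples")
    print(f"wavering: {wavering_in_cents([0.0100, 0.0102, 0.0099]):.1f} cents")
```

In this sketch a longer round trip means a longer delay line and therefore a lower tone, and a series of unequal round-trip times translates into a nonzero pitch spread, a rough numerical counterpart to the wavering one hears.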
Whereas in physical media distance-related delay affects signal intensity, spectrum and other qualities, in the Internet the sound remains the same even after having traveled around the planet. These differences are significant for behavior in musical performance. The analogy that comes closest is from our experience with underwater acoustics. Entering into these different sound worlds with our ears, the properties of water or Internet media give them a sonic imprint all their own. We know very well the sound of the former and may soon become familiar with the latter.

The slight discrepancies in "now" that result from delayed sound within a distributed ensemble are significant for the choice of music. Synchronized rhythms generally de-synchronize as delay increases. One reason is the ambiguity of a perceived deceleration in the music. Depending on whether a slowing tempo is heard as an expressive inflection or as an increase in network delay, a player reacts differently. We can observe the difficulties and formulate theories of what will work and what won't, but the behavior we expect in practical situations isn't always what we get. For reasons not yet apparent, ensembles sometimes adapt to delays that should be impossibly restrictive for a particular music. Trying to explain ensemble behavior only from the mindset of network engineering, we find that a mechanical concept of "now-ness" is insufficient and something quite apart from a musician's perceptions.

Raising questions about the human factor has become an important by-product of current online explorations. Similar to R&D in music synthesis, paradoxes arise which frame questions related to perception. Attempts to replicate a sound via acoustical analysis can uncover new principles of how we actually hear. Both music made with distributed ensembles and controlled lab experiments have yielded paradoxical results that prompt new questions relating to time in performance and ensemble "production." These theories lead back to early work in time perception, especially phenomenology, and forward into advanced languages for media computing. What aspects are needed to model ensemble behavior, and could such a model be used to engineer schedulers or artificial "music agents" that implement our understanding of human temporal experience?

Temporal order and event salience would be the first layer of such a model. Husserl investigated this with phenomenological techniques of introspection, but "as soon as we make the attempt to undertake an analysis of pure subjective time-consciousness -- the phenomenological content of lived experiences of time -- we are involved in the most extraordinary difficulties, contradictions, and entanglements..." [2]. A memory-less agent has no such difficulties; it is reactive only. But it makes a useless model. The "pure now" approach under-performs compared to human rhythm tracking [3]. A 300 msec integration interval serves in some theories as a window of "now-ness" which can be treated "as a space within time itself." Two listener / performers coupled together but with network delay intervening respond in ways which may help shed light on the bigger questions: "...the two facets...
of time consciousness: on the one hand, the rich texture of the present, and on the other hand, the multi-scalar hierarchy of temporal registers that underlies the flow of time." [4]

The acoustical qualities of the Internet change music because they change the physics, but they also refract our micro-time human abilities differently, in a way that helps us understand them. Other factors certainly influence the picture -- reverberation, ensemble strategy, phrasing, aspects of style -- and all come into play when learning why some things work and some things don't.

How close are we to adopting / adapting to this new medium? This may seem a bit of a stretch at this point in the game, but perhaps there will come a time when it seems less usual and even a bit special to congregate for music face-to-face. Music making will take place increasingly in the new medium because general trends in communication run towards lower energy expenditure and higher content. Networked music performance does reduce travel and does seem poised to raise the "channel content" (if we consider one's daily musical life as a channel). This is similar to how email had already infested the early years of the Stanford AI Lab (CCRMA's first home). "Electronic mail" was generally unknown in the 1970s, but we were a small group who had access to such facilities. I remember a local newspaper story featuring oddities of lab life where the reporter noted how its denizens had an addiction to reading electronic mail first thing in the morning. Potentially, in the same way in which email has become a global energy-saving, content-increasing phenomenon, present network performance experiences are also a harbinger of change to come.

After having worked in "multi-site mode" for the past month for several concerts, there was indeed a shift in returning to an entirely non-network event ("hey, look -- we're all in the same room"). Concerning one recent concert a critic wrote, "...
what happened Tuesday night was about more than the music. It raised basic questions: What does it mean to 'be here,' when here is there, and there is here?" For that event, we were musicians and audiences co-located in Stanford and Beijing, 6,000 miles apart.

1. REFERENCES

[1] Bacon, F. New Atlantis, Kessinger Press, Whitefish, MT, 1626.

[2] Husserl, E. "The Lectures on Internal Time Consciousness from the Year 1905", translated by Churchill, J., in Shorter Works, ed. McCormick, P. and Elliston, D., U. of Notre Dame Press, Notre Dame, IN, 1981.

[3] Gurevich, M. et al. "Simulation of Networked Ensemble Performance with Varying Time Delays: Characterization of Ensemble Accuracy", Proc. Intl. Computer Music Conf., Miami, 2004.

[4] Hansen, M. New Philosophy for New Media, MIT Press, Cambridge, MA, 2004.