Pure Data

Miller S. Puckette
Department of Music, UCSD; http: // mpuckett/

Abstract

A new software system, called Pd ("Pure Data"), is an attempt to update the Max paradigm to address certain interesting developments of the last decade. These include hardware and software platform changes, and also new artistic imperatives, in particular the combination of image and sound using computers. Also, the treatment of data structures (a weak point in Max) is reworked.

1 Introduction

In the ten years that the Max program has been around [1], it has become clear that it is much better at describing process than data. All sorts of MIDI filters are easy to build (and if you're using Max/FTS [2], all sorts of synthesizer patches as well). But for collections of data, Max offers only the rather straight-jacketed table, coll, explode (now outdone on the Macintosh by David Zicarelli's dynamite), and qlist. On the Macintosh we also get timeline, which is the closest thing Max offers to a notion of score. But anyone trying to use Max to implement Xenakis's UPIC (for example) will be disappointed by Max's data-handling shortcomings.

The Pd computer program, described here, provides the main features of Max (including signal processing à la Max/FTS), but is also intended to support the definition and editing of compound data structures in a more sophisticated way than Max does. Possible data structures are: lists of Max-style messages; collections of breakpoint envelopes; tables of vectors. Collections of data and Max-style objects are presented in a unified "canvas" window; the data collections need not be relegated to sub-windows but can appear as part of the patch.

Pd provides near compatibility with Max/FTS both for patches and for externs; but certain features of the older program aren't adhered to slavishly, and certain names are changed to prevent confusion.
In addition to providing audio synthesis and processing capabilities, Pd plays host to the GEM environment [3], which puts image and 3-D graphics processing in a natural, "patchable" framework similar to the way Pd treats audio signals.

1.1 The changing situation

Both the artistic imperatives and the technical means of the 90s differ from those of the 80s when Max was conceived. In late 1987, the state of the art in live computer music was Giuseppe di Giugno's 4X machine, while the state of the art in programming environments for interactive music was the Macintosh running system 4.3. Max started out as a tool to drive the 4X over MIDI from a Macintosh. These days, any serious attempt at providing a real-time computer music performance environment must permit direct computer manipulation of sounds, and not just rely on computer control of MIDI gear.

At the time the 4X and its successor the ISPW were being designed, it was necessary to build multiprocessing hardware to get to any level of performance that might be considered suitable for real-time audio synthesis and signal processing. These days, high-performance workstations have almost reached the speed of the ISPW. Off-the-shelf audio I/O hardware is also approaching the 4X/ISPW level.

The software implications for this change are enormous. Instead of splitting a piece of software into a real-time "executive" controlled by a higher-level "GUI," it now becomes possible to put the editing and file handling functions in the same address space and thread as the real-time computations.

Max/FTS was also designed before it became possible to integrate video, graphics, and audio processing in a single real-time software environment. This has now become possible, and composers and visual artists are increasingly drawn to the new possibilities raised by closely-coupled audio and visual computing.

Finally, Max's audio processing capabilities were designed at a time when most real-time audio processing for computer music was done in the time domain. These days, cutting-edge acts like the Convolution Brothers are operating increasingly in the frequency domain. There is also increased interest in real-time audio analysis for "capturing gestures" which could be used to control a variety of aspects of real-time audio or visual processing; this is hard to do in Max because of its poor menu of data structures and data manipulation possibilities.

1.2 Fixable shortcomings of Max

Many kinds of well-justified criticisms have been leveled at Max; see, for example, [4]. Max's worst deficiency, as mentioned earlier, is its treatment of data. Most programming environments provide notions of data structure, array, and list (or else pointers you can use to build lists as in C). There is almost always a notion of variable as well; unless you count the "value" object, Max has no such facility at all.

The time-domain nature of Max/FTS's audio processing is reflected in its limited options for variations in block sizes and sample rates. As a result, the fft~ object in Max/FTS is very clumsy to use; its outputs appear as time-domain signals which hold spectra lined up end to end. Not only does this make it hard to get random access to the data, but it makes it awkward to specify time-overlapped FFTs.

Finally, in Max/FTS, it's hard to get information back and forth between the audio signal domain and the control domain. Controls can affect signals via the sig~ and line~ objects, and signals can generate controls via objects like snapshot~, but a much higher level of integration would be desirable.

1.3 Unfixable shortcomings of Max

The second worst thing about Max is the order-of-execution business. A decade since Max's emergence, it is still unclear how to define a "correct" order in which messages should be sent to objects, nor how to make a true dataflow language carry out Max's functions well. So the trigger object is still with us, at least for the moment.
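
The order-of-execution problem can be made concrete with a toy model. The sketch below is not Max's or Pd's implementation; the names Add, fan_out, and trigger are hypothetical, chosen only for illustration. It assumes the usual Max convention that an object's left inlet is "hot" (it computes and outputs) while the right inlet is "cold" (it only stores), and that a trigger object fires its outlets from right to left.

```python
class Add:
    """Toy two-inlet adder (hypothetical, for illustration only).

    The right inlet is "cold": it stores its value.
    The left inlet is "hot": it computes left + right and records the result.
    """

    def __init__(self):
        self.right = 0.0
        self.results = []

    def left_inlet(self, x):
        self.results.append(x + self.right)

    def right_inlet(self, x):
        self.right = x


def fan_out(value, connections):
    """Send one message to several inlets in whatever order the
    connections happen to be stored (the ambiguous case)."""
    for inlet in connections:
        inlet(value)


def trigger(value, outlets):
    """Trigger-style distribution: outlets are listed left to right
    but are fired right to left, so the order is always explicit."""
    for inlet in reversed(outlets):
        inlet(value)


def double_with_fan_out(x, left_first):
    """Try to compute x + x by wiring one source to both inlets."""
    adder = Add()
    order = [adder.left_inlet, adder.right_inlet]
    if not left_first:
        order.reverse()
    fan_out(x, order)
    return adder.results


def double_with_trigger(x):
    """Same wiring, but routed through the trigger-style helper."""
    adder = Add()
    trigger(x, [adder.left_inlet, adder.right_inlet])
    return adder.results
```

In this model, double_with_fan_out(3.0, left_first=True) yields [3.0] because the hot inlet fires before the cold one has been set, while double_with_trigger(3.0) yields [6.0] regardless of how the connections were made. The trigger object does not solve the semantic question; it only lets the patch author state the intended order.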
A manifestation of this problem is that audio and control computations in Max (and in Pd) obey different semantics, because no known semantic seems able to handle both continuously running and sporadic processes well.

2 Design

The rationale behind Pd's design is to make it easy to pile up large collections of data, some of which might be obtained through analysis of audio signals (and some of which could be audio signals), which could be regarded as a computer music score to be "played" by a Max-like patch that traverses it. Pd resembles Max/FTS on the surface, but its design reflects Pd's fundamental concern with hierarchical, user-definable data structures.

In the Max/FTS system, since the real-time computations take place on a multiprocessor, the "patch editing" portion of the system is placed on a different machine (the "host") from the real-time environment. (To do otherwise would require that the editor be a distributed program, which would be too cumbersome.) As a result, in Max/FTS, all data structures have to live in two copies, one "upstairs," the other "down." A variety of ad-hoc (and unsatisfying) mechanisms were put in place to try to keep coherent copies of tables, "explodes," etc., which can be altered either by editing upstairs, or algorithmically downstairs.

Since Pd is only designed to work in a uniprocessing environment, it is feasible to put the editor in the real-time environment. Figure 1 shows the different relationships between the real-time and non-real-time components of Max/FTS and Pd. Max and FTS communicate through a protocol too complex to describe here. In the case of Pd, the "window" layer handles drawing requests and sends slightly cooked window-system events directly to Pd for handling.

Figure 1: Changing the RT/NRT boundary.

Figure 2: Max-like stuff (a patch with message, atom, and object boxes, among them print, metro 12.67, trigger bb, and timer).

2.1 The Pd document

The Pd document appears as a collection of embedded canvases much like Max's patcher windows, but holding a more general assortment of items. These will eventually include:

* Patchable objects as in Max/FTS;
* A generalized notion of array to replace and unify Max/FTS's table, tabl~ and table" objects; and
* Dynamic, heterogeneous lists of objects living in two-dimensional coordinate spaces, replacing and generalizing explode.

The Max-like patchable objects get a face lift; see Figure 2. Certain interesting design changes reflect new ideas in scheduler design reported by Dannenberg [5]; note the floating-point representation of time.

An example showing the array facility is found in Figure 3. Here, two arrays, Fred and Ethel, live in the domain [0,100] and the range [-1,1]; they can be accessed by name. Elements of arrays are not required to be numbers but may be data structures defined by templates; the box "pd y.pd" is providing a template which says that each element of each array contains one floating-point number.

Figure 3: Arrays in Pd.

As to the heterogeneous lists, some progress has been made toward implementing the ideas described in an earlier paper on Pd [6], but the design is still changing constantly. List management will probably emerge as the most distinctive feature of Pd as opposed to Max, but may develop over a time frame similar to that of Max as well. The central idea is to be able to use a Pd "patch" as a definition of a data type associated with a user-definable drawing. (There is some similarity between this idea and that of Animal [7].) An instance of the new data type would be like an "event" in a sequence.
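
The array facility described above (named arrays whose elements are defined by a template) might be modelled roughly as follows. This is a minimal sketch and not Pd's actual design: the Template and Canvas classes, their methods, and the cosine fill are all assumptions made for illustration. Only the array names, the domain [0,100], the range [-1,1], and the single floating-point field come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical stand-in for a template such as "pd y.pd":
    it names the floating-point fields each array element carries."""
    name: str
    fields: tuple

    def new_element(self, **values):
        unknown = set(values) - set(self.fields)
        if unknown:
            raise KeyError(f"fields not in template {self.name}: {sorted(unknown)}")
        # Every field named by the template exists, defaulting to 0.0.
        return {f: float(values.get(f, 0.0)) for f in self.fields}


class Canvas:
    """Hypothetical container that lets arrays be looked up by name."""

    def __init__(self):
        self.arrays = {}

    def add_array(self, name, template, size):
        self.arrays[name] = [template.new_element() for _ in range(size)]
        return self.arrays[name]

    def get(self, name, index, field_name):
        return self.arrays[name][index][field_name]


def demo():
    """Build two arrays over the domain [0,100] with values in [-1,1]."""
    canvas = Canvas()
    y = Template("y", ("y",))          # one floating-point number per element
    canvas.add_array("fred", y, 101)   # indices 0..100, all zero
    ethel = canvas.add_array("ethel", y, 101)
    for i, element in enumerate(ethel):
        # Assumed contents: one cosine cycle, purely for illustration.
        element["y"] = math.cos(2 * math.pi * i / 100)
    return canvas
```

With this sketch, demo().get("ethel", 10, "y") looks up one field of one element by array name. A template with more fields would give each element more structure without changing the lookup code.
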
The events would be allowed to contain not only scalars like a Max message, but also arrays and sublists of events in turn.

Max has irked some computer scientists because of its lack of scoping. In effect, everything in Max that has a name shares one global namespace. Pd will not introduce multiple namespaces, but will offer a degree of locality by providing objects that manage traversal of Pd-defined data structures, thereby relieving the pressure to name everything in the world.

3 Current situation

Pd is being developed as a cross-platform program, which is currently targeted for both SGI IRIX and Windows/NT (these "happen to be" the two platforms Dannenberg recommends). Pd (which also stands for "public domain") is freely distributed with source code. Releases are available via FTP from pub/msp on crca-ftp.ucsd.edu. At this writing the user base is quite small. No promise is yet made as to future compatibility of versions of Pd; the design is unlikely to be frozen for another year at least.

3.1 Implementation

To shield the source code from window system pollution, all GUI code is implemented in John Ousterhout's TK toolkit [8], which is currently available for X, Windows, and MacOS. Pd's implementation uses each platform's "natural" audio system, which in both IRIX and NT appears to be easily expandable to eight or more channels. Ditto, in principle at least, for MIDI.

3.2 Platform stability

The only software component that Pd depends on in any fundamental way is the TK toolkit, which looks stable in the medium term at least; if TK were ever to turn sour, a window interface retrofit (of the sort that moved Max/FTS from NeXTStep to X/Motif) would be needed. The GEM system also depends on OpenGL, which looks quite stable. More frequent, but much less central, adjustments will have to be made to Pd to track changes in audio, MIDI, and OS interfaces.

3.3 First time out

Current development in Pd is focussed on its premiere use in a live performance, which will take place during ICMC97. The piece, named Lemma One, is a collaboration between computer artist Vibeke Sorensen, composer Rand Steiger, and the author, with the participation of George Lewis and Mark Danks. Lemma One is an improvised piece involving both computer audio and computer graphics, happening simultaneously at two locations (one of them Thessaloniki).

The main artistic motivation for Lemma One is to explore the idea of using analysis data from one process to affect another one, either separated by distance, or else across the boundary between two media. Each site will have two improvisers, a percussionist and a brass instrument player. Both their audio output and their images will be analyzed by computers (we'll use separate Intel machines for the audio and video processing for practical reasons).
The analysis output will be low enough in bandwidth to fit on an ISDN connection between the two sites, so that the computers in both places can have access to the same control information. So for example we could have each percussionist play with a figuratively reconstructed echo of the other one; or we could show locally and remotely sound-controlled images simultaneously and hear only the local players; or we could have each site's camera input control the spatialization of sound on the remote site; or whatnot.

4 Acknowledgements

Lemma One is made possible by a generous grant from the Intel Corporation. Pd's development has received support both from Intel and from SGI.

References

[1] Puckette, M., 1988. "The Patcher." Proc. ICMC, Cologne, pp. 420-429.
[2] Puckette, M., 1991. "Combining Event and Signal Processing in the MAX Graphical Programming Environment." Computer Music Journal 15(3): pp. 68-77.
[3] Danks, M., 1997. "Real-time image and video processing in GEM." Proc. ICMC, Thessaloniki.
[4] Lewis, G. E., 1992. "A Max Forum." Array 13/1, pp. 19-20.
[5] Dannenberg, R., and Brandt, 1996. "A Flexible Real-Time Software Synthesis System." Proc. ICMC, Hong Kong, pp. 270-273.
[6] Puckette, M., 1996. "Pure Data: another integrated computer music environment." Proc. the Second Intercollege Computer Music Concerts, Tachikawa, pp. 37-41. Reprinted as ~/pub/msp/
[7] Lindemann, E., 1991. "ANIMAL - a Rapid Prototyping Environment for Computer Music Systems." Computer Music Journal 15(3): pp. 78-100.
[8] Ousterhout, J. K., 1994. Tcl and the Tk toolkit. Reading, Massachusetts: Addison-Wesley.