A Real-Time Analysis and Resynthesis Instrument for Transformation of Sounds in the Frequency Domain

Todor Todoroff
Faculté Polytechnique de Mons

Abstract

We have programmed several external MAX/FTS modules running on the ISPW-16. They perform FFT analysis and resynthesis and allow different kinds of transformations in the frequency domain: denoising and transformation through thresholding on the frequency bins, transposition and pitch shifting, very flexible filtering with real-time interpolation between several transfer functions, and cross-synthesis with real-time control over the cepstral smoothing of the spectral envelope of an auxiliary signal. All these transformations may be combined, and most of the parameter values may be accessed in real time. This allows complete control over the creation of complex sound morphologies.

1 Introduction

There are several programs performing sound transformations through FFT analysis and resynthesis. Programs like IRCAM SVP [Depalle and Poirot, 1991] [Serra, 1993] and AudioSculpt produce high quality sounds with a high degree of versatility. But we wanted more than just tools to process sounds: we wanted a real instrument that could be played in real time with full gestural control over the transformation parameters.

We designed our instrument primarily to allow composers of acousmatic music to produce complex sound morphologies with constant audio feedback. But this goal also matches the needs of composers who wish to perform live transformations of instrumental sounds. Our external modules were used by Emmanuel Nunes for the creation of his latest work, "Lichtung 2", for instruments and two NeXTCubes, each fitted with three ISPW-16 boards. The implementation was done by Eric Daubresse, musical assistant at IRCAM.

Rather than using standard MAX modules as proposed by Settel and Lippe [1994], we chose to write external objects in order to be able to implement more functionality.
By combining several modules in a single external object, we significantly reduce the delays normally introduced in the signal flow when transferring signals between modules. The considerable amount of processing time usually spent in control objects may also be used for other purposes: for example, writing an external object to interpolate between several large sets of data allows a kind of control that would have produced DAC slips had we used dozens of add and multiply objects.

On the other hand, we did not want to sacrifice modularity, which is without doubt one of the great features of MAX. We therefore designed three modules: an_synth~ performs the analysis and resynthesis, cepstral_smooth~ extracts the cepstrally smoothed envelope of an auxiliary signal, and gabarit~ generates an amplitude filtering transfer function with very flexible control. The outputs of the last two modules may be combined in any way before being sent to the main module. Figure 1 depicts schematically how one may combine these three modules.

[Figure 1 shows the MAX patch analysis-resynthesis.pat: the auxiliary audio input feeds cepstral_smooth~ 1024 (arg1: FFT size); gabarit~ takes the FFT size, the number of (f, A) points and the number of interpolation presets among its arguments; the main audio input feeds an_synth~ 1024 256 0 (arg2: number of samples between successive FFTs, arg3: window type).]

Figure 1: Connecting the external modules together.

Todoroff 432 ICMC Proceedings 1996

2 The an_synth~ module

This main module implements an overlap-add analysis and resynthesis. The FFT size, up to 1024 points, and the overlap factor, up to 75 percent, are defined by the creation arguments of the object. The window type, like all the other parameters, may be modified in real time by sending the appropriate message to the module; it may be chosen among 8 of the most common windows (different windows may produce different timbres when filtering is used). Three kinds of frequency-domain transformations are available simultaneously:

2.1 Thresholds: a lower threshold may be applied, allowing efficient denoising of signals. The threshold control may be added to a frequency-dependent threshold drawn by the user in a MAX table. The frequency bins falling below the threshold may be suppressed, attenuated or reinforced. When the threshold is increased while the components matching the test are suppressed, the signal is first denoised, then transformed up to the point where only the resonances are heard. One might describe the effect as a kind of liquefaction of the sound. An upper threshold function may be used simultaneously and offers the same controls. Its effect is more difficult to describe, as it depends more on the sound being transformed. If, for example, the sound contains a combination of steady and percussive components, the latter may be increased in level or attenuated with hardly any change to the steady ones.

2.2 Spectrum multiplication: the second signal input may be used to multiply the real and imaginary parts of each frequency bin by an external signal. cepstral_smooth~ and gabarit~ were specially designed for this purpose, but other modules may be used as well.

2.3 Frequency axis modifications: we were not able to implement phase extraction, as we needed to be able to run everything on a single ISPW-16 card.
But we implemented a simple algorithm to perform transpositions, pitch shifting and non-linear distortions of the frequency axis. These transformations are far from perfect, but may nevertheless produce very interesting sounds and create various kinds of inharmonicity. The results are especially interesting when the parameters are controlled with line modules with a grain of 4, which corresponds to the number of cycles between successive analysis and synthesis windows when using 1024-point FFTs with 75 percent overlap.

3 The cepstral_smooth~ module

This module performs an FFT analysis of its signal input. It then computes the FFT of the log magnitude spectrum, i.e. the cepstrum. After a number of cepstral coefficients have been zeroed according to the messages sent to the input of the module, the inverse FFT is computed, and the values are converted back to a linear scale and sent to the output. Combined with an_synth~, cepstral_smooth~ produces very high quality vocoder-type effects. With its 512 filter bands, compared with the 10 to 20 bands of commercially available analogue vocoders, it preserves the intelligibility of the voice remarkably well and is even able to superimpose its harmonic structure on the other signal. We had never seen or heard real-time control over the smoothing parameter before; it proves to be very effective.

Figure 2: Cepstral smoothing, transposition and frequency-shifting.
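The cepstral smoothing described in Section 3 can be sketched in a few lines of Python with NumPy. This is only an illustrative offline sketch, not the ISPW implementation: the function name `cepstral_envelope`, the parameter `n_keep` (number of low-order cepstral coefficients retained) and the small constant added before the logarithm are assumptions introduced here.

```python
import numpy as np

def cepstral_envelope(frame, n_keep):
    """Cepstrally smoothed magnitude envelope of one windowed frame.

    frame  : real-valued, already windowed signal block (e.g. 1024 samples)
    n_keep : hypothetical smoothing control; number of low-order cepstral
             coefficients kept (all higher ones are zeroed)
    """
    n = len(frame)
    # FFT analysis of the input, then log magnitude spectrum.
    # The 1e-12 offset (an assumption) avoids log(0) on empty bins.
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)
    # FFT of the log magnitude spectrum: the cepstrum.
    cepstrum = np.fft.fft(log_mag)
    # Zero the high-order coefficients. The mask is symmetric so that
    # the smoothed log spectrum stays real-valued.
    lifter = np.zeros(n)
    lifter[:n_keep] = 1.0
    if n_keep > 1:
        lifter[-(n_keep - 1):] = 1.0
    # Inverse FFT, then back to a linear amplitude scale.
    smoothed_log = np.fft.ifft(cepstrum * lifter).real
    return np.exp(smoothed_log)

# Example: envelope of a noisy frame, keeping 30 cepstral coefficients.
rng = np.random.default_rng(0)
frame = rng.standard_normal(1024) * np.hanning(1024)
envelope = cepstral_envelope(frame, 30)
```

A small `n_keep` yields a very smooth envelope, while `n_keep = n // 2 + 1` keeps every coefficient and returns the unsmoothed magnitude spectrum. For cross-synthesis, an envelope of this kind would multiply the real and imaginary parts of the main signal's spectrum, as described in Section 2.2.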

We also added the possibility to freeze one analysis frame, as well as to transpose and frequency-shift the spectrum. The tables on the left of Figure 2 show one frozen spectrum of a voice signal, with the smoothing increasing from top to bottom. The tables on the right show the last spectral envelope transposed, frequency-shifted and passed through a combination of both. All these values may be changed in real time.

4 The gabarit~ module

This module offers the most complex and flexible control. It is designed to generate filters in combination with an_synth~. Filters may be defined with up to 128 points; up to 200 of them may be stored in or recalled from memory presets; banks of presets may be saved to or loaded from disk; and it is possible to interpolate in real time between up to eight different presets. Figure 3 shows one possible MAX patch to control and display the values of this module.

[Figure 3 shows a 16-band EQ patch with amplitude (dB) and frequency (Hz) number boxes for each point, preset selection and copy controls, interpolation type and mode selectors, and save/load buttons.]

Figure 3: MAX patch to control a gabarit~ module with 16 (f, A) points, 32 presets and 8-filter interpolation.

The process of creating a transfer function is based on the definition of (frequency, amplitude) points. This may be done by moving virtual faders, through number boxes or message boxes, or with the help of any MIDI controller. Frequencies may be entered either in Hz or as the number of the corresponding FFT point.
When using gabarit~ with a 1024-point FFT an_synth~ module, the values are updated every 512 samples. The user may switch at any moment between three different modes of using these points to generate a transfer function: a band-pass filter made by drawing horizontal lines, or a breakpoint filter made by linking the points with straight lines on either a linear or a logarithmic scale. Figure 4 shows the results for these modes on the left side.

Sound morphologies are easy to control when MIDI faders are assigned to the amplitudes and/or frequencies of the points that define the transfer function. But even more interesting results are obtained by interpolating between several presets. Two kinds of interpolation have been implemented. The first computes the transfer function of every chosen preset and then adds them after multiplication by their respective weights (see top right table in Fig. 4). These weights may be generated with the mouse using a program like the one developed by Gerhard Eckel at IRCAM, with our interpolation object [Todoroff and Traube, 1996], or by any other means. The second kind does not interpolate between the transfer functions but directly between the definition points. A new set of interpolated points is created, and the user may choose among the three modes already described to construct the transfer function. This form of interpolation not only fades transfer functions in and out but also interpolates the frequencies of the points, creating spectral glissandi (quite noticeable in the bottom right table in Fig. 4).

Even while interpolating, it is always possible to modify any or all of the presets being interpolated, as well as to change the values of any point of those presets, their modes or the kind of interpolation. Banks of presets may be saved to disk, allowing one to construct a library of transfer functions.
Moreover, since it is also possible to load small banks of presets into larger ones and to copy presets from one location to another, one can make a new bank by taking presets from different banks already saved to disk.
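The three transfer-function modes and the two kinds of preset interpolation described in Section 4 can be illustrated with the following Python/NumPy sketch. It is not the gabarit~ source code: the function names, the representation of a preset as a pair of lists (FFT bin numbers, amplitudes), and the reading of the "logarithmic scale" as straight lines in log amplitude are all assumptions made for illustration.

```python
import numpy as np

def transfer_function(freq_bins, amps, n_bins, mode="linear"):
    """Build an amplitude curve over n_bins from (bin, amplitude) points."""
    order = np.argsort(freq_bins)
    f = np.asarray(freq_bins, dtype=float)[order]
    a = np.asarray(amps, dtype=float)[order]
    bins = np.arange(n_bins)
    if mode == "bands":
        # Horizontal lines: hold each amplitude until the next point.
        idx = np.clip(np.searchsorted(f, bins, side="right") - 1, 0, len(f) - 1)
        return a[idx]
    if mode == "linear":
        # Breakpoint filter: straight lines on a linear amplitude scale.
        return np.interp(bins, f, a)
    if mode == "log":
        # Breakpoint filter: straight lines in log amplitude (assumed
        # interpretation; amplitudes must be strictly positive here).
        return np.exp(np.interp(bins, f, np.log(a)))
    raise ValueError(f"unknown mode: {mode}")

def interpolate_curves(presets, weights, n_bins, mode="linear"):
    """First kind: weighted sum of each preset's transfer function."""
    return sum(w * transfer_function(f, a, n_bins, mode)
               for (f, a), w in zip(presets, weights))

def interpolate_points(presets, weights, n_bins, mode="linear"):
    """Second kind: interpolate the definition points, then build one curve.

    Points are paired by index across presets (an assumption).
    """
    f = sum(w * np.asarray(p[0], dtype=float) for p, w in zip(presets, weights))
    a = sum(w * np.asarray(p[1], dtype=float) for p, w in zip(presets, weights))
    return transfer_function(f, a, n_bins, mode)

# Two hypothetical presets, each a narrow peak, at bins 100 and 300.
preset_a = ([0, 90, 100, 110, 511], [0, 0, 1, 0, 0])
preset_b = ([0, 290, 300, 310, 511], [0, 0, 1, 0, 0])
curve_mix = interpolate_curves([preset_a, preset_b], [0.5, 0.5], 512)
point_mix = interpolate_points([preset_a, preset_b], [0.5, 0.5], 512)
```

With equal weights, `curve_mix` contains two half-height peaks at bins 100 and 300 (a cross-fade), whereas `point_mix` contains a single full-height peak at bin 200. Sweeping the weights therefore moves the peak in the second case, which corresponds to the spectral glissandi mentioned above.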

Figure 4: The three modes of gabarit~ and interpolation results.

5 Summary

We have designed a set of modules performing various kinds of transformations of sound signals. As these transformations are carried out in the frequency domain, they are very intuitive to work with. Thanks to real-time and simultaneous access to every parameter, the user may quickly experiment with various settings. The interpolation facilities of the gabarit~ module are an invitation to experiment with complex yet precise sound morphologies, either for composing tape music or for transforming live instrumental sounds in concert. This real-time control also allows many kinds of interaction with a sequencer or with a MAX patch, making it a useful tool for experimenting with complex dynamic timbre modulations. We are currently experimenting with different types of parameter mapping to control this instrument with MIDI faders and a PowerGlove interfaced through the STEIM SensorLab.

References

[Allen and Rabiner, 1977] J. B. Allen & Lawrence R. Rabiner, A Unified Approach to Short-Time Fourier Analysis and Synthesis, Proceedings of the IEEE, Vol. 65, No. 11, November 1977, pp. 1558-1564.

[Depalle and Poirot, 1991] Philippe Depalle & G. Poirot, SVP: A Modular System for Analysis, Processing and Synthesis of Sound Signals, Proc. ICMC 1991, pp. 161-164.

[Griffin and Lim, 1984] Daniel W. Griffin & Jae S. Lim, Signal Estimation from Modified Short-Time Fourier Transform, IEEE Trans. ASSP, Vol. 32, No. 2, April 1984, pp. 236-243.

[Portnoff, 1976] Michael R. Portnoff, Implementation of the Digital Phase Vocoder Using the Fast Fourier Transform, IEEE Trans. ASSP, Vol. 24, No. 3, June 1976, pp. 243-248.

[Portnoff, 1980] Michael R. Portnoff, Time-Frequency Representation of Digital Signals and Systems Based on Short-Time Fourier Analysis, IEEE Trans. ASSP, Vol. 28, No. 1, February 1980, pp. 55-69.

[Serra, 1993] Marie-Hélène Serra, [title illegible in source], IRCAM documentation, October 1993.

[Settel and Lippe, 1994] Zack Settel & Cort Lippe, Real-Time Musical Applications Using FFT-based Resynthesis, Proceedings ICMC, Aarhus 1994.

[Todoroff, 1995] Todor Todoroff, Instrument de transformations par analyse/synthèse [remainder of title illegible in source], Journées d'Informatique Musicale (JIM), 1995.

[Todoroff and Traube, 1996] Todor Todoroff & Caroline Traube, Graphical NeXTSTEP Objects as FTS Clients to Control Instruments in the New FTS Client/Server Architecture, Proceedings ICMC, Hong Kong 1996.

This research has been funded by the Région Wallonne in Belgium.