Toward Construction of a Timbre Theory for Music Composition

Naotoshi Osaka
School of Engineering, Tokyo Denki University
osaka@im.dendai.ac.jp

Abstract

Since the 20th century, timbre has become an important factor in music composition. However, no effective timbre theory for music composition has yet been developed. This paper first describes the standpoints of previous timbre theories. It then discusses the requirements for a new timbre theory: 1) how timbre should be categorized, 2) hierarchical structure and its self-similar representation, 3) extension of previous timbre (music) representation, 4) discrimination of embedded and exposed structure, and 5) definition of operations and grammatical structure.

1 Introduction

In most music, musical pitch is the basis of musical structure. Scale, tuning, melody, and chord structure are all defined in terms of pitch, and harmony and counterpoint are higher-level theories built on these notions. Compared with these pitch-based theories, theories of rhythm and timbre are not structural. Orchestration, as a discipline, is a collection of orchestral score examples with discussion of their timbral effects; it does not discuss the timbre system itself.

Engineers, for their part, have developed various sound synthesis methods but have not spent much energy on theorizing about timbre. In psychoacoustic research, timbre is an object of analysis, as in discussions of the abstract axes of a perceptual timbre space. However, the results of timbre research have not been used as diagnostic information for sound synthesis technology. Timbre analysis has focused on traditional Western musical instruments, while electronic sounds and sounds with musical effects have not been investigated. In addition, this field is mainly interested in static timbre and lacks insight into temporal structure. It therefore does not serve as a musical theory for composition.
The demand for a musical theory of timbre must come from musicians. Establishing a timbre theory might make new musical expression possible and would allow the many electronic pieces created since musique concrète to be analyzed more theoretically. Improvements in computer performance should accelerate this movement. As a first step toward such a theory, this paper describes the requirements of a timbre theory for music composition.

2 Previous Study on Timbre Theory for Music Composition

Pierre Schaeffer, the originator of musique concrète, proposed a timbre theory for music composition [1]. His theory has been further developed at INA/GRM (Institut National de l'Audiovisuel / Groupe de Recherches Musicales). It covers multi-channel spatial music, acoustic design, and soundscape. The theory is popular in France, but it is not often referred to elsewhere because there is no literature on it in English. Another reason is that the theory is rather aesthetic and is not directly linked to engineering formulation or experimental psychoacoustics.

3 Sound Material as New Timbre and its Synthesis

What kinds of timbre are the objects of the theory? Electronic sound became available half a century ago. Since then, digital technologies have developed, and precise control of a sound at the millisecond level has become possible. Under these circumstances, various effects have been introduced: echo, reverberation, flanging, chorus, vocoder, harmonizer, and so on. These processes add distortions or other effects to the original sounds. The author sees particularly large musical potential in the following two effects:

1. Sound morphing [2]
2. Sound hybridization [3]

We have studied sound morphing algorithms [2], [4]. Sound morphing is a perceptual interpolation between two given sounds of different timbre. The purpose of this technology is to create rich timbres between two distinct sounds. Sound hybridization is a technology that synthesizes a sound stream by combining perceptual features taken from different sounds.
A talking orchestra and singing animal voices are good examples of this technology.

Proceedings ICMC 2004
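As a rough illustration of the interpolation idea behind sound morphing described in this section, the following Python sketch interpolates between two sounds that are each represented as a list of sinusoidal partials. It is only a minimal sketch under stated assumptions and is not the algorithm of [2] or [4]: the function name `morph_partials`, the parameter `alpha`, the pairing of partials by index, and the choice of geometric (log-frequency) interpolation are all hypothetical choices made for this example.

```python
import math


def morph_partials(partials_a, partials_b, alpha):
    """Interpolate two sounds given as lists of (frequency_hz, amplitude) partials.

    alpha = 0.0 returns sound A, alpha = 1.0 returns sound B.
    Assumption: the k-th partial of A corresponds to the k-th partial of B.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(partials_a) != len(partials_b):
        raise ValueError("both sounds need the same number of partials")

    morphed = []
    for (freq_a, amp_a), (freq_b, amp_b) in zip(partials_a, partials_b):
        # Frequencies are interpolated on a logarithmic scale, which is
        # closer to how pitch distance is heard than a linear scale.
        log_freq = (1.0 - alpha) * math.log(freq_a) + alpha * math.log(freq_b)
        # Amplitudes are interpolated linearly (a simplifying assumption).
        amp = (1.0 - alpha) * amp_a + alpha * amp_b
        morphed.append((math.exp(log_freq), amp))
    return morphed


if __name__ == "__main__":
    sound_a = [(220.0, 1.0), (440.0, 0.5)]
    sound_b = [(880.0, 0.5), (1760.0, 0.1)]
    # Print the intermediate timbre halfway between the two sounds.
    for freq, amp in morph_partials(sound_a, sound_b, 0.5):
        print(f"{freq:.1f} Hz, amplitude {amp:.2f}")
```

Stepping `alpha` from 0 to 1 over time would give a gradual transition between the two timbres. A perceptually convincing morph needs far more than this, for example matching of partials and interpolation of temporal envelopes.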

3.1 Rediscovery of Timbre Material: Asian Sounds and Ancient Sounds

The sounds discussed so far are either sounds that do not exist in the natural environment or synthesized sounds that give the impression of natural acoustic sounds. Are there other acoustic sounds to draw on? There are: various Asian instruments such as the shakuhachi and the koto, for example. Gagaku instruments such as the hichiriki and the sho offer ancient, historical timbres. These instruments are characteristic not only in their static timbre but also in their various performance techniques. There is also a large vocabulary of singing voices. Vocal sounds in pop music with their individual voice qualities, Khoomei (Hoomei) and other Mongolian throat singing, which produces two pitches at once, Bulgarian singing in general, and bel canto are all targets of the timbre theory considered here.

4 Requirements for Timbre Theory

What are the requirements of a timbre theory for music composition? The theory should define time series in some way. Categorizing timbres alone is not sufficient to establish a theory for temporal arts or pieces. As with a harmonic cadence, time series must be defined using symbols. The timbre theory has five requirements:

1. Symbolic classification of timbre
2. Hierarchical structure and its self-similar representation
3. Extension of previous timbre (music) representation
4. Discrimination of embedded and exposed structure
5. Definition of operation and grammatical structure

However, we do not consider that a consensus on the classification of timbre and its symbolization has been reached, and we do not attempt to define a common one. Nor do we consider a definition of timbre itself to be valuable. What matters more is to define how other functions are explained using these symbols. One reason for our pessimism about a common symbolization is that the range of timbres under consideration is too wide for anyone to experience all of them.
We may encounter an acoustic sound that we have never experienced before, and there are many electronic sounds that we have never come across but will hear in the future. Defining symbols that cover all timbres seems hopeless. Moreover, the breadth of timbre is not confined to one level but is hierarchical. Phoneme classification in spoken language is implicitly part of timbre classification, and phoneme classification is itself a major problem in its field. Schaeffer's classification of sound, on the other hand, is neither physical nor perceptual. Its main feature is that it is based on musical sound, which he calls the objet sonore. It can therefore be said that a linguistic classification is not a classification of objets sonores and is meaningless within his framework.

In our study, we hold that a timbre theory for music composition should be constructed on perceptual primitives of timbre represented by symbols. However, we allow various definitions of symbols: different musicians and researchers can define different timbre symbols. One reason for allowing this ambiguity is the problem of attention. The auditory impression of a given sound is not always the same but depends on how it is heard. A good example is that a listener can pick out the individual harmonics of a musical instrument sound by paying attention to them.

4.1 Classification of Timbre and its Symbolization

First, it is necessary to categorize timbre. This turns various timbres into a collection of symbols. Theories such as the grammar of a language, harmony, and counterpoint are all based on symbols. However, the symbols are difficult to define. Is it even possible to define primitives for all timbres? In the field of music acoustics, it is not common to define the micro structure of timbre. There is a consensus on the notion of a sound stream, which gives us the impression of one particular sound, such as a trumpet sound or a human voice. However, no common notion of the smaller components of timbre is yet known. Schaeffer, for his part, established his own classification of timbre [1]: "simple" and "fragments" are terms he defined. A fragment is composed of "elements" such as "attack", "body", and "decay". These definitions apply not only to compound sounds but even to a single stream.

4.2 Phoneme Classification Based on Articulatory Movement

Phoneme classification in the field of spoken language is a good reference to rely upon. It includes a micro-structural classification called distinctive features, which corresponds to articulatory movement. Timbre classification can inherit all of these distinctive features, which also links it to the field of linguistics.

4.3 Hierarchical Structure and its Self-similar Representation

Item 2 is one of the most important requirements of this timbre theory. Timbre is defined in a hierarchical structure, and it also has a self-similar structure. Hierarchy and self-similarity both make it easier to construct algorithms for timbre synthesis and timbre understanding. By adopting a self-similar, or recursive, structure, each timbre (music) level can be described in terms of a lower level. The lowest level represents the spectrum of the signal, and the recursion terminates there. Figure 1 depicts a sonogram of a sound, which is a lowest-level timbre representation. Boundaries are necessary to delimit symbols. The lighter part within the frame represents noise. Symbols here do not form a one-dimensional series but are expressed as a disposition in a time-spatial domain. This type of structure is defined at the signal level, the perceptual level, and the cognitive level in a self-similar manner. At the lowest level of timbre, as shown in Figure 1, we call the symbols micro timbres.

In the sound synthesis system O'kinshi [5], which runs on Windows, hierarchical and self-similar structures are implemented in a sound object, and those structures are also reflected in the GUI (Graphical User Interface). Figure 2 shows an example of a sound object defined hierarchically. At the top level, music is expressed as an icon. By double-clicking, a user can reach the lower level. An upper-level sound (music) is defined as a disposition of lower-level sounds (music) in a time-spatial domain.

4.4 Extension of Previous Timbre (Music) Representation

Item 3 means that a new description should be an extension of traditional common music notation. This does not mean that common music notation itself must carry the new expression. Rather, a particular parameter setting of the new representation should be equivalent to common music notation. Figure 3 shows two types of music notation in O'kinshi. In both representations, the horizontal axis represents time and the vertical axis represents a dimension of timbre (part, channel). In timbre notation as in music notation, the horizontal axis represents time and the vertical axis represents timbre, which ranges from micro timbre to macro timbre according to the level under consideration.

4.5 Discrimination of Embedded and Exposed Structure

Item 4 is a requirement needed particularly when human physical action is involved.
Embedded structure represents a symbol sequence at an upper level, and exposed structure represents observed physical movement. Take speech utterance and its recognition as an example. The text sequence is an upper-level embedded structure. The observed data, however, is not a text sequence but a physical signal sequence in which all elements are concatenated smoothly and cannot be observed distinctly. There are thus two structures: a continuous representation of timbre at the lower level, caused by physical movement, and a discrete representation of timbre at the upper level, which drives the physical movement. These two types of structure occur in many other cases, such as a dance pattern and its physical movement, or a music score and the actually performed sound. The upper level is always an intention, and the lower level is its practice.

Figure 1. Symbolization of timbre (signal level)

Figure 2. Hierarchical structure of sound object in sound synthesis system O'kinshi [5]

4.6 Definition of Operation and Grammatical Structure

Some kinds of operations are defined on symbols. An operation does not necessarily occur in the same way at all levels of the hierarchy. Figure 4 depicts a sonogram of a female voice uttering /ieaou/. The lighter portions of the figure represent the harmonics of the five vowels. Our perception groups these harmonics and forms a single stream. This well-known psychoacoustic phenomenon of grouping should be reflected in symbol-level operations. As an extension of a one-dimensional symbol sequence such as a grammar, rules for both temporal sequence and spatial order are necessary, in which the boundaries of symbols are defined in a time-spatial domain. This enables a theoretical description of musical sequences as well as of chord progressions.
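The grouping of harmonics into a single stream, discussed in Section 4.6, can be sketched as a symbol-level operation in Python. This is a minimal illustration and not an operation defined in this paper: the function name `group_harmonics`, the `tolerance` parameter, and the assumption that the fundamental frequency is already known are all hypothetical simplifications.

```python
def group_harmonics(partials_hz, f0_hz, tolerance=0.03):
    """Split partials into one harmonic stream on f0_hz and a residue.

    A partial joins the stream when its frequency, divided by f0_hz, lies
    within `tolerance` of a whole number (its harmonic number).
    Returns (stream, residue), where stream holds (harmonic_number, frequency)
    pairs and residue holds the frequencies left ungrouped.
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")

    stream, residue = [], []
    for freq in partials_hz:
        ratio = freq / f0_hz
        harmonic_number = round(ratio)
        if harmonic_number >= 1 and abs(ratio - harmonic_number) <= tolerance:
            stream.append((harmonic_number, freq))
        else:
            residue.append(freq)
    return stream, residue


if __name__ == "__main__":
    # Four partials close to multiples of 200 Hz and one unrelated partial.
    observed = [200.0, 401.0, 598.0, 730.0, 1000.0]
    stream, residue = group_harmonics(observed, 200.0)
    print("stream :", stream)
    print("residue:", residue)
```

In this sketch the grouped stream plays the role of a single upper-level symbol, while the residue remains available for other symbols. Perceptual grouping also depends on cues such as common onset and common modulation, which are not modeled here.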

4.7 An Example of Timbre Description

In this section, we take a shakuhachi technique called "corocoro" as an example and try to describe the sound by symbolizing micro timbres in a hierarchy. The shakuhachi is a traditional Japanese bamboo flute, and corocoro is a trill of timbre: the fingering makes the timbre, rather than the pitch, alternate between registers.

    Def Snd a="shakuhachi reg1"   % define a shakuhachi sound
    Def Snd b="shakuhachi reg1"   % define a shakuhachi sound
    Def Snd c="shakuhachi reg2"   % define a shakuhachi sound
    Def Snd A;                    % define total timbre
    Def Pit P, p1, p2, p3;        % define pitch function
    p1[b]-(p1[a]=p3[c])=2;        % trill of a major second
    A=mor(seq(a,b,c,b)*);         % repetition of the sequence "abcb", smoothly concatenated

Here, Def stands for "define", and mor() represents a morphing operation in which distinct timbres are smoothly concatenated.

Figure 3. Track representation and common music notation

5 Conclusion

Since the 20th century, timbre has become an important parameter in music composition. However, no useful musical theory of timbre for music composition has been presented. We are addressing the construction of such a timbre theory. This paper describes the requirements of the theory before proceeding to its actual construction. Five requirements are proposed and discussed. The definition of macro to micro timbre and the definition of operations are topics for future study.

6 Acknowledgments

The author would like to express his gratitude to Prof. Hiroshi Kinukawa for his enthusiastic support and to Prof. Steve Everett of Emory University for the warm and helpful discussions.

References

[1] Schaeffer, P., "Traité des objets musicaux," Éditions du Seuil, 1966.
[2] Osaka, N., "Timbre interpolation of sounds using a sinusoidal model," Proc. ICMC 95, 408-411, Banff, 1995.
[3] Hikichi, T., and Osaka, N., "New Synthesis Method for Addition of Articulations Based on a Sho-type Physical Model," Proc. ICMC 2003, 333-336, Singapore, 2003.
[4] Osaka, N., "Timbre morphing and interpolation based on a sinusoidal model," ICA/ASA joint meeting, Seattle, 83-84, 1998.
[5] Osaka, N., and Hikichi, T., "Visual manipulation environment for sound synthesis, modification, and performance," Proc. ICMC 99, 429-432, Beijing, 1999.

Figure 4. An example of a symbolic operation: Grouping of harmonics