Phase Models to Control Roughness in Additive Synthesis

Erling Tind and Kristoffer Jensen
Department of Computer Science, University of Copenhagen
{earl, krist}@diku.dk

Abstract

This paper introduces the concepts of auditory roughness and perceptual dissonance. Using a computational auditory model, roughness predictions are computed and analyzed. It is concluded that auditory roughness and perceptual dissonance theories result in comparable predictions for steady-state sounds, except in special cases where the phase of the partials is of importance. Utilizing these findings, a synthesis algorithm for the adjustment of roughness in musical sounds is proposed and implemented. The algorithm is based on additive synthesis. Informal listening tests suggest a relation between the control parameter of this algorithm and the pleasantness of the resulting sounds.

1 Introduction

Perception research shows a fascinating relationship between dissonance (roughness), timbre and musical scales. Research into musical consonance suggests that the pleasantness of musical chords is determined by the relative proximity of the partials in the combined sound spectrum. In music, closely spaced partials occur not only as a consequence of musical chords, but also in harmonic sounds of low fundamental (bass sounds). This is the class of sounds that is the subject of this study. It is shown that the perception of roughness depends on the relative phase of harmonic overtones, and that the shape of this dependency can be used in a model of phase in which a single parameter, p, controls the roughness of a sound. The validity and applicability of the model have been assessed by further experiments, comparison with other research results and informal listening tests.
In section 2, a short introduction to the perception of dissonance, roughness and phase is given, and in section 3 computational auditory models of dissonance and roughness are presented and discussed. Section 4 presents a number of simulation results, based on simple stimuli analyzed with a roughness model. Section 5 presents the roughness-reducing propagation formula, which provides the basis for an implementation. In section 6 this design is evaluated with a combination of computer simulation results and informal listening experiments. Finally, a conclusion of the work is given.

2 Perception of dissonance

The sound of musical instruments can to varying degrees be thought of as pleasant or "euphonious". This can be the case for a single tone, but more commonly it is the case for multiple tones played in harmony. This is also known as consonance. This section introduces a theoretical framework of consonance, dissonance and phase perception, thereby providing an understanding of the perceptual problems that are addressed in this paper.

2.1 Perceptual categories of two pure tones

The simultaneous sounding of two pure tones can be classified into three perceptual categories: beats (when the two pure tones are relatively close in frequency), roughness (when the two pure tones are more than approximately 20 Hz apart) and tone separation (when the two tones fall in separate critical bands).

2.2 Critical bandwidth

The hearing system is shaped to perform a spectrographic analysis of the auditory stimulus. The cochlea can be regarded as a bank of filters whose outputs are ordered tonotopically (close frequencies are positioned close in space), so that a frequency-to-place transformation is effected. The concept of critical bandwidth refers to the effective width in frequency of the spread of energy on the basilar membrane when it is stimulated with a pure tone.
2.3 Dissonance of two pure tones

In the nineteenth century, Helmholtz suggested that the perception of consonance and dissonance could be understood in terms of the presence or absence of rapid beating between the sinusoidal components of a complex tone (Helmholtz 1954).

Proceedings ICMC 2004

The idea was later made concrete by Plomp and Levelt (1965) in a series of experiments in which subjects reported the relative consonance of intervals of pure tones. They furthermore offered a standard curve of consonance. The curves, for different frequencies, all share the same qualitative properties: they begin at unison with high consonance, rise rapidly to a maximum of dissonance, and then slowly decrease. Later investigations by Kameoka and Kuriyagawa (1969) revealed similar, but more complex findings. By comparison they showed that the maximum achievable consonance and dissonance is a function of frequency, with the largest difference between consonance and dissonance reached at frequencies around 440 Hz. Smaller differences were found at lower frequencies. By the same methodology they concluded that a pure tone does not constitute a perfect consonance. Again this is especially true for low frequencies. This latter finding is surprising, as it is not predicted by the basic assumption of tonotopic dissonance.

2.4 Perceptual importance of monaural phase

The audibility of phase shifts in harmonically related tones has been a topic of discussion for many years. The general view appears to be that phase information is not very important in human auditory perception. This view can be traced back to Ohm's law of hearing (Helmholtz 1954), which states that auditory perception is determined by the strength of the spectral components in a stimulus and is independent of phase. More recent experiments have challenged this position, indicating that phase information plays an important role in high-quality speech synthesis and in the synthesis of musical instruments. Likewise, Andersen and Jensen (2002) have shown that the mean degradation of voice and instrument sounds in additive analysis/synthesis is reduced with phase-preserving methodologies.
As one of the first to study the effect of phase in complex sounds, Schroeder (1959) reported a number of effects related to sounds of up to 31 harmonics. Most interesting is the reported strong dependence between timbre and "peak factor", defined as the difference between the maximum and minimum instantaneous amplitude, divided by the root-mean-square value.

Plomp and Steeneken (1969) investigated the effect of phase on the timbre of steady-state harmonic tones with an amplitude pattern of -6 dB/oct. Based on a fundamental frequency of 292.4 Hz, they concluded that stimuli consisting of sine or cosine terms are clearly distinguishable from stimuli consisting of alternating sine and cosine terms, and that this difference appears to represent the maximal possible effect of phase on timbre. By further experiments they found that the effect of phase on timbre diminishes with increasing fundamental frequency. For a fundamental of 146.2 Hz, the lowest investigated, the effect was found to be quantitatively equal to the effect of changing the slope of the amplitude pattern by about 2 dB/oct. This conclusion appeared largely independent of sound pressure level.

Patterson (1987) investigated the ability to discriminate between two complex flat-spectrum sounds with 31 partials, with cosine or alternating phase. Using noise masking to prevent the influence of combination tones, he investigated how repetition rate, spectral location, bandwidth, level and duration influence the discrimination ability. The results are that alternating phase, depending on the degree of phase alternation, is discriminable, more so for low pitch, high spectral location and high level, but independently of duration.

Pressnitzer and McAdams (1999) investigated the perception of roughness for different amplitude-modulated (AM) and quasi-frequency-modulated (QFM) stimuli, using phase changes of the center partial. They concluded that the resulting roughness sensation depends on phase, for instance showing an asymmetry between positive and negative phase changes.
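The peak factor definition quoted above from Schroeder (1959) can be made concrete with a small sketch. It compares an all-sine phase pattern with an alternating sine/cosine pattern for a tone with amplitudes 1/k (the -6 dB/oct pattern of Plomp and Steeneken). The function names, the choice of ten partials and the number of samples per period are illustrative assumptions and are not taken from the cited studies.

```python
import math


def harmonic_period(phases, samples_per_period=1024):
    """One period of sum_k (1/k) * sin(2*pi*k*t + phases[k-1]), t in [0, 1)."""
    signal = []
    for i in range(samples_per_period):
        t = i / samples_per_period
        signal.append(sum(math.sin(2 * math.pi * k * t + phase) / k
                          for k, phase in enumerate(phases, start=1)))
    return signal


def rms(signal):
    """Root-mean-square value of a sampled signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def peak_factor(signal):
    """(max - min) of the instantaneous amplitude divided by the RMS value."""
    return (max(signal) - min(signal)) / rms(signal)


n_partials = 10  # illustrative choice, not from the source
all_sine = [0.0] * n_partials
# Odd partials as sine terms (phase 0), even partials as cosine terms (phase pi/2).
alternating = [0.0 if k % 2 else math.pi / 2 for k in range(1, n_partials + 1)]

pf_sine = peak_factor(harmonic_period(all_sine))
pf_alt = peak_factor(harmonic_period(alternating))
print(f"all sine: {pf_sine:.3f}  alternating: {pf_alt:.3f}")
```

Both signals have the same amplitude spectrum and therefore the same RMS value, so any difference in peak factor comes from the phase pattern alone.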
3 Computational Auditory Models

For practical use in computer music, existing experimental data on dissonance need to be incorporated into an auditory computer model, to enable the prediction of perceived dissonance in arbitrary sounds or in a piece of music. The challenge of designing such a model can be approached by curve-mapping (curve-fitting of psychoacoustic experimental results) or by an auditory model.

Curve-mapping models map the frequency component pairs of a sound onto a psychoacoustical curve that expresses the dissonance value of each pair. Total dissonance is derived by summing these dissonance values. As such, curve-mapping models are limited to classes of sounds consisting of distinct partials. Auditory models rely on signal processing to simulate the auditory system; they cover a wider range of sounds and offer more explanatory value for the underlying phenomena.

3.1 Leman - The synchronization index model

The synchronization index model (SIM) by Leman (2000) employs a functional model of the auditory periphery and a method to predict the roughness of a sound. The model of the auditory periphery (called APM) produces an 'auditory nerve image' in the form of a rate code. This output effectively contains the probability of neural firing in a given auditory channel at a given time instant. The computation of this output includes the following steps: simulation of the frequency response of the outer and middle ear; cochlear mechanical filtering into 40 overlapping sub-bands (resolution of 1 Bark); and a hair cell model including half-wave rectification, dynamic range compression and a low-pass filter at 1250 Hz.

The roughness model by Leman is based on the concept of neural synchronization. This concept originates in neurophysiology, and indicates the degree to which a neuron's total firing rate is phase-locked to the corresponding stimulus component. This view of dissonance was recently supported in a study of neural correlates in the auditory cortex of humans and monkeys subjected to consonant and dissonant stimuli (Fishman, Volkov, Noh, Garell, Bakken, Arezzo, Howard, and Steinschneider 2001). SIM calculates roughness in terms of the 'energy' of this neural synchronization. As there are multiple channels, total roughness is calculated by summing the roughness from each cochlear channel. Leman's implementation of SIM is part of the IPEM Toolbox (footnote 1), a Matlab toolbox developed at the Institute for Psychoacoustics and Electronic Music (IPEM), Ghent University, Belgium.

3.2 Other Models

Kameoka and Kuriyagawa (1969) published a computational model based on their psychoacoustic research on the perception of dissonance. This model has been re-implemented and made available (footnote 2). Several computational models for the calculation of dissonance have been published recently. One of the most inspiring is the (sensory) dissonance model by Sethares (1993). The model is based on a parameterization of the Plomp and Levelt dissonance curves; by adding the amplitude-weighted dissonances of the individual partial pairs, Sethares calculates the dissonance of complex sounds. There is no dependency on phase in Sethares' model. Sethares' work is available as a set of Matlab routines (footnote 3).
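A minimal sketch of the curve-mapping idea described above: each pair of partials is mapped onto a Plomp-Levelt-style curve and the pair values are summed. The exponential curve shape and the constants below are the values commonly quoted for Sethares' parameterization and should be treated as assumptions here, as should the function names and the use of the product of the two amplitudes as the weight.

```python
import math
from itertools import combinations

# Assumed curve constants (commonly quoted for Sethares' parameterization).
A, B = 3.5, 5.75                # decay rates of the two exponentials
D_STAR, S1, S2 = 0.24, 0.021, 19.0  # frequency-dependent scaling of the interval


def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Dissonance of two partials; depends on frequencies and amplitudes only."""
    scale = D_STAR / (S1 * min(f1, f2) + S2)
    x = scale * abs(f2 - f1)
    # Amplitude weighting by product is an assumption of this sketch.
    return a1 * a2 * (math.exp(-A * x) - math.exp(-B * x))


def total_dissonance(partials):
    """Additive assumption: sum the dissonance of every pair of partials.

    partials is a list of (frequency_hz, amplitude) tuples.
    """
    return sum(pair_dissonance(f1, f2, a1, a2)
               for (f1, a1), (f2, a2) in combinations(partials, 2))


triad = [(950.0, 1.0), (1000.0, 1.0), (1050.0, 1.0)]
print(total_dissonance(triad))
```

Because no phase term enters either function, a model of this kind cannot represent any phase dependence of roughness by construction.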
Inspiration from Terhardt's notion of virtual pitch, and an understanding of the connection between dissonance and roughness, has led to a model of sensory dissonance by Skovenborg and Nielsen (2002). Skovenborg and Nielsen's work is based on HUTear (footnote 4), a Matlab toolbox developed at the Laboratory of Acoustics and Audio Signal Processing at Helsinki University of Technology.

Attention has also been directed at the Auditory Image Model of Peripheral Auditory Processing (AIM, footnote 5). It differs from the models mentioned above by including both a computationally efficient functional route and an alternative physiological route. The physiological route includes a simulation of basilar membrane motion, based on a non-linear transmission line filter bank described in Giguere and Woodland (1994).

Footnotes:
1 http://www.ipem.rug.ac.be/Toolbox/
2 http://www.music-cog.ohio-state.edu/Music829B/diss.html
3 http://eceserv0.ece.wisc.edu/~sethares/comprog.html
4 http://www.acoustics.hut.fi/software/HUTear/

4 Simulations

In order to verify the validity of the time domain model by Leman (2000), simulations of the roughness of pure tones were performed. These simulations show that Leman's model reproduces many auditory phenomena (Tind 2004). However, many of these simulations also show a sensitivity to phase. The following figures were all computed from series of discrete simulations. For the time domain model of Leman, these were performed with sound clips of 0.27 seconds at a sample rate of 22.5 kHz. A fade-in window was applied to reduce potential ringing in the cochlea simulation filters. This results in five consecutive roughness analysis frames, of which the central frame was selected as the result.

4.1 One pure tone

As indicated by Kameoka and Kuriyagawa (1969), a pure tone does not constitute a perfect consonance. To test Leman's model for this attribute, the predicted roughness of a frequency sweep of a pure tone was calculated with this model.
In agreement with expectation, the result (see figure 1) is a curve of frequency-dependent roughness. It can be observed that roughness is present for pure tones of frequencies below approximately 300 Hz and reaches a maximum near 50 Hz. The roughness ratings obtained are not directly comparable to the consonance ratings of Kameoka and Kuriyagawa (1969), due to the different scales, but there is a close similarity.

4.2 Two pure tones

Effect of frequency interval of two pure tones. The roughness of two pure tones with varying frequency difference was calculated at a mean frequency of 1000 Hz. The result is comparable to the standard curve published by Plomp and Levelt (1965). The maximum roughness is found at a frequency difference of 64 Hz, comparable to the peak frequency difference published by Plomp and Levelt (1965), which was found to lie between 40 Hz and 65 Hz.

Footnote:
5 http://www.mrc-cbu.cam.ac.uk/cnbh/web2002/bodyframes/AIM.htm

Figure 1: Roughness of a pure tone as a function of frequency (Hz, logarithmic axis). The simulation is based on Leman's model.

Figure 2: Roughness of two pure tones of equal amplitude. One is fixed at 50 Hz and zero phase, while the ground axes of the plot show the frequency (Hz) and phase angle (rad) of the second tone.

Effect of phase. As frequencies enter the roughness range of pure tones, below 300 Hz, phase effects of two pure tones are found. This is exemplified in figure 2. The figure is based on roughness measures from two pure tones of equal amplitude. The first pure tone is held at a constant frequency of 50 Hz and a phase angle of zero. The second pure tone is varied in frequency and phase along the figure axes. To achieve sufficient resolution, the time window of the roughness estimation was extended. This makes the resulting roughness values smaller than with the default parameters used in the other simulations. The overall shape of the graph reflects the Plomp and Levelt (1965) curves. A closer inspection, however, reveals a few deviations. The most significant of these appears at 100 Hz, where the roughness intensity shows a strong dependence on phase. A dependence of roughness on phase is found again at 75 Hz, though less strong. In addition, the roughness does not fall to zero when the two pure tones have the same frequency. This is in agreement with the roughness of one pure tone (as seen in figure 1). This roughness of a single pure tone was not represented in the standard consonance curve of Plomp and Levelt (1965). However, their actual psychoacoustic experiment at 125 Hz did show this behavior.

Effect of frequency on maximum roughness. A search for the maximum roughness predicted by Leman's model for an amplitude-modulated signal at different carrier frequencies shows a qualitative similarity to the results of Zwicker, Fastl, and Frater (1999), indicating that a region of middle frequencies (100 Hz to 2000 Hz) is especially important to the perception of roughness. This is in general agreement with the psychoacoustical data on both roughness (Zwicker, Fastl, and Frater 1999) and dissonance (Kameoka and Kuriyagawa 1969).

4.3 Three pure tones

The dissonance curves published in the literature are, as previously mentioned, based on experiments with two pure tones. In curve-mapping models, the gap between two pure tones and the general case of many partials is bridged by the assumption that dissonance is additive. This appears to be a valid assumption for the dissonance of simple musical chords (dyads). But is the additive assumption also valid in the case of multiple closely spaced partials?

Effect of frequency intervals. To investigate the roughness of more than two partials, the dissonance calculations were extended to cover a roughness plane based on three pure tones. In the calculated planes a center pure tone (termed $f_2$) is held at a constant frequency, while the X and Y axes represent the frequencies of the two neighboring pure tones ($f_1$ and $f_3$). An example of this can be seen in figure 3. Under the additive assumption, the dissonance planes are expected to show an increase in roughness in the areas representing three pure tones in close proximity. However, unmodulated noise can be interpreted as the upper limit condition for the addition of pure tones, and unmodulated noise results in a relatively small subjective sensation of roughness (Aures 1985). It is therefore interesting that the maximum roughness more than doubles in the scenario of three pure tones of equal amplitude, as compared to two pure tones.

An interesting feature of the roughness plane based on Leman's model (figure 3) is a jagged ridge of maximum roughness, visible on top of the protrusion at the right-hand corner and extending diagonally across the figure. This ridge is caused by an increase in roughness at frequency sets with symmetric frequency distances (including harmonic partials).

The presented figure was obtained at a mean frequency of 1000 Hz. A similar pattern was found at both lower (above 300 Hz) and higher frequencies.

Figure 3: Dissonance/roughness, using Leman's model, of three pure tones of equal amplitude, as a function of the frequencies of the first and third tones. The frequency of the second tone is 1000 Hz.

Effect of phase. The jagged pattern of roughness observed in figure 3 led to several further investigations focusing on the potential dependence of roughness on phase. Specifically, it was investigated whether the rise in roughness found at harmonic frequencies was sensitive to the phase of the pure tones. To answer this question several simulations were performed, in which the phase angles $\phi_1$ and $\phi_3$ were modified relative to $\phi_2$, while the frequencies ($f_1$, $f_2$ and $f_3$) were held constant.

The frequency combination (950, 1000 and 1050 Hz) extracted from the ridge results in a diagonally aligned wave shape (figure 4), roughly following the form $\cos(\phi_1 + \phi_3)$. Hence the shallows of the waves (corresponding to minimal roughness) are approximately aligned at phase combinations satisfying the condition

$(\phi_1 + \phi_3) \bmod 2\pi = \pi$. (1)

For the frequency set 950, 1000 and 1050 Hz the predicted roughness values (figure 4) vary between $7 \cdot 10^{-4}$ and $18 \cdot 10^{-4}$, a ratio of roughly 2.5. For the frequency set 960, 1000 and 1060 Hz the roughness values are nearly constant at $13 \cdot 10^{-4}$. These results indicate that the unexpectedly high roughness at harmonic frequencies seen in figure 3 is a consequence of the specific phase relationship.

Figure 5: Roughness of three pure tones (50, 100 and 150 Hz) as a function of phase. The axes show the phase of the 50 Hz tone and the phase of the 150 Hz tone.

Frequency dependency. In the previous section all simulations were performed with the same center partial frequency ($f_2 = 1000$ Hz). It is therefore of interest to see if the same phenomena can be found at other frequencies.
To assess this, additional simulations were performed. For the higher frequency of $f_2 = 5000$ Hz and $\Delta f = 50$ Hz, the same wave pattern in roughness as a function of phase angle is found. The computed roughness values are lower and vary between $8.4 \cdot 10^{-4}$ and $1.5 \cdot 10^{-4}$, a ratio of 5.6. This decrease in roughness is to be expected, but it is interesting that the ratio between the values is larger. This indicates that the phase of higher partials could be perceptually more important. However, for a harmonic sequence with a fundamental of 50 Hz, this would refer to the 99th to 101st partials where, in most instrument timbres, little energy is expected to be present.

For the lower center partial frequency of $f_2 = 200$ Hz the wave pattern appears altered and more complex. The roughness values found vary between $9.5 \cdot 10^{-4}$ and $6.3 \cdot 10^{-4}$, a ratio of just 1.5. The altered shape of the phase sensitivity can be viewed as surprising, since the audio signal undergoes the same steps of transformation in all auditory channels of the Leman model. The alteration of the wave pattern affects not only harmonic frequency sets, but also frequency sets of unequal $\Delta f$. As an example, the flat roughness plane found with the frequencies 960, 1000 and 1060 Hz is not reproduced with the frequency set 160, 200 and 260 Hz, which shows a complex wave shape.

Figure 4: Roughness of three pure tones as a function of phase (axes: phase of the 950 Hz tone and phase of the 1050 Hz tone). The phase angle of the middle partial is held constant. The frequency delta equals 50 Hz. The resulting roughness shows different patterns: some frequency sets result in constant values, while others have wave shapes.

4.4 Discussion

In general the alteration of the roughness plane shapes appears for center partial frequencies below 300 Hz. At a center partial frequency of 100 Hz this process is largely completed, and the simulated roughness plane again takes the form of a simple wave shape, but rotated 90 degrees about the roughness axis (see figure 5). This result is in contradiction with the psychoacoustic experiments of Pressnitzer and McAdams (1999), although their roughness effects seem to decrease at 125 Hz. The simulation results of Pressnitzer and McAdams (1999), based on AIM, show very little roughness effect already at 250 Hz. Generally, these results corroborate the results in figure 4. As the change of the roughness plane at low frequencies is uncorroborated, it is not taken into account in this work.

As to other factors influencing roughness, Patterson (1987) found that phase discrimination increases with level, is independent of duration, and increases with frequency (for the same fundamental). In addition, simulation and informal listening tests show an increased roughness for three pure tones of unequal amplitude, up to a maximum, undetermined, level. None of these factors significantly influences the shape of the phase-dependent roughness in harmonic tones, however.
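One way to build intuition for the phase dependence reported above for three equally spaced pure tones is to look at the envelope of their sum. The sketch below is not Leman's model: it uses the fluctuation of the envelope over one beat period as a crude stand-in for roughness, which is an assumption of this illustration, and the function name is hypothetical. For equal amplitudes the envelope can be written as $|1 + 2 e^{i\alpha} \cos(\theta + \beta)|$ with $\alpha = (\phi_1 + \phi_3 - 2\phi_2)/2$, so its fluctuation depends on the phases only through $\phi_1 + \phi_3 - 2\phi_2$.

```python
import cmath
import math


def envelope_spread(phi1, phi3, phi2=0.0, steps=360):
    """Standard deviation of the envelope of three equal-amplitude tones at
    f2 - df, f2 and f2 + df, sampled over one beat period (theta = 2*pi*df*t).

    The common carrier at f2 is factored out, so only relative phases matter.
    """
    envelope = []
    for i in range(steps):
        theta = 2 * math.pi * i / steps
        z = (cmath.exp(1j * (phi1 - theta))      # lower tone relative to carrier
             + cmath.exp(1j * phi2)              # center tone
             + cmath.exp(1j * (phi3 + theta)))   # upper tone relative to carrier
        envelope.append(abs(z))
    mean = sum(envelope) / steps
    return math.sqrt(sum((v - mean) ** 2 for v in envelope) / steps)


strong = envelope_spread(0.0, 0.0)      # phi1 + phi3 = 0: AM-like, deep envelope
weak = envelope_spread(0.0, math.pi)    # phi1 + phi3 = pi: shallow envelope
print(strong, weak)
```

With $\phi_2 = 0$ the envelope fluctuation is largest when $\phi_1 + \phi_3 = 0$ and smallest when $\phi_1 + \phi_3 = \pi$, which has the same form as the condition for the roughness shallows in equation 1.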
5 Roughness reduction strategies

In this section, changes to the phase spectrum of the original sound, made in order to reduce the level of auditory roughness in harmonic sounds of low fundamental, are discussed. As previously stated, the goal is to increase pleasantness while at the same time preserving aspects of timbre. This is supported by informal listening tests.

5.1 Peak factor minimization

Schroeder (1959) reported a strong dependence between timbre and peak factor. Peak factor is a simple measure, defined as the difference between the maximum and minimum instantaneous amplitude of a signal divided by its root-mean-square value. The advantage of addressing peak factor instead of roughness is that the former can be minimized via an analytical approximation (Schroeder 1970). While Schroeder-phase signals tend to exhibit relatively low roughness, the method only addresses the overall signal envelope. The signal envelope within a limited frequency range (the critical band) is therefore ignored. It follows from the SIM model of roughness that the method is unlikely to produce minimum-roughness signals.

Another approach is to investigate the perceptual influence of phase on two- and three-component signals. As seen in section 4, simulations of three pure tones show a strong dependency of roughness on phase. These results are used here to model the phase dependency of roughness using a single parameter, p.

5.2 Three component phase formula

Studying figure 4, together with further simulations at other center partial frequencies and other center tone phases ($\phi_2 \neq 0$), it appears that the identified wave pattern is preserved for frequencies above 300 Hz, though it is moved diagonally according to the value of $\phi_2$. This suggests that the wave tops for these frequencies can be approximated (see footnote 6) by the equation

$\phi_3 = 2\phi_2 - \phi_1$. (2)

It follows that the wave shallows can be approximated by

$\phi_3 = 2\phi_2 - \phi_1 \pm \pi$. (3)
In other words, equation 3 predicts all the phase combinations resulting in minimum roughness.

5.3 p phase formula

Equations 2 and 3 represent special cases of the sinusoidal roughness shape observed in figure 4. To connect and generalize the two equations, a roughness parameter p is introduced. Utilizing p, the offset $\pi$ in equation 3 is replaced by an inverse sine function, resulting in

$\phi_3 = 2\phi_2 - \phi_1 + \sin^{-1}(1 - 2p) + \frac{\pi}{2}, \quad 0 \le p \le 1$. (4)

This is intended so that p equal to zero corresponds to the minimum achievable roughness (the offset becomes $\pi$, as in equation 3), and p equal to one to the corresponding maximum (the offset becomes zero, as in equation 2). It is important to realize that, in general, the same roughness level is found at two different $\phi_3$ values. For this reason equation 4 is one of two potential inverse mappings.

Footnote:
6 Phase wrapping issues are ignored in the following equations, since they are of no practical importance in the derived formulas.
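The mapping from the roughness parameter to a phase offset can be checked numerically with a short sketch. It reads equation 4 as $\phi_3 = 2\phi_2 - \phi_1 + \sin^{-1}(1 - 2p) + \pi/2$, which is the form that reduces to equation 3 at p = 0 and to equation 2 at p = 1; that reading, the function names and the wrapping of the result to [0, 2π) are assumptions of this illustration.

```python
import math


def phase_offset(p):
    """Offset added to 2*phi2 - phi1 for a roughness parameter p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in the interval [0, 1]")
    return math.asin(1.0 - 2.0 * p) + math.pi / 2.0


def third_phase(phi1, phi2, p):
    """Phase of the third component, wrapped to [0, 2*pi)."""
    return (2.0 * phi2 - phi1 + phase_offset(p)) % (2.0 * math.pi)


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(p, phase_offset(p))
```

The offset falls monotonically from π at p = 0 (predicted minimum roughness) to 0 at p = 1 (predicted maximum), passing through π/2 at p = 0.5.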

5.4 Propagation phase formula

Equation 4 can be extended to the general case of many harmonic partials. Under the idealized assumption of a rectangular auditory filter shape with an approximate bandwidth of 2.5 times the fundamental frequency, the effect of partials other than the immediate neighbors can be ignored. This leads to the following propagation formula, where n identifies the partial number:

$\phi_n = 2\phi_{n-1} - \phi_{n-2} + \sin^{-1}(1 - 2p) + \frac{\pi}{2}, \quad n > 2$. (5)

The independent variables of the propagation formula are the roughness parameter p and the phases of the first two partials, $\phi_1$ and $\phi_2$. Of these, $\phi_1$ constitutes a time reference for the fundamental period. However, the relationship between $\phi_1$ and $\phi_2$ significantly influences the waveform of the resulting signal.

Results obtained with this equation share some interesting properties. The low-roughness initial condition $\phi_1 = 0$, $\phi_2 = 0.5\pi$ and p = 0 is in line with the findings of Plomp and Steeneken (1969), as it corresponds to the condition of alternating sine and cosine terms. The alternative condition of all sine or all cosine terms is found with the initial condition $\phi_1 = 0$, $\phi_2 = 0$ and p = 1, which renders an all-sine phase. According to equation 5, vastly different phase patterns will produce the same roughness, depending on the first two phase conditions.

The data flow of the proposed roughness-controlled synthesizer is thus as follows. In the first step the individual sinusoidal components are made available. After this, an optimized partial phase relationship is calculated by equation 5. These steps can be considered the center of the process, and they are currently being implemented in the Timbre Engine (Marentakis and Jensen 2002).

6 Evaluation

The p-factor propagation formula was derived from simulation results for three pure tones of equal amplitude. It is assumed that these results can be generalized to make predictions for the roughness of sounds consisting of many harmonic partials of possibly uneven amplitude.
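The propagation formula of section 5.4 can be sketched in a few lines. The sketch assumes the recurrence $\phi_n = 2\phi_{n-1} - \phi_{n-2} + \sin^{-1}(1 - 2p) + \pi/2$ for n > 2, wraps every phase to [0, 2π), and uses a hypothetical function name; it is an illustration of the formula, not the Timbre Engine implementation.

```python
import math

TWO_PI = 2.0 * math.pi


def propagate_phases(n_partials, p, phi1=0.0, phi2=0.0):
    """Phases of partials 1..n_partials for a roughness parameter p in [0, 1]."""
    offset = math.asin(1.0 - 2.0 * p) + math.pi / 2.0
    phases = [phi1 % TWO_PI, phi2 % TWO_PI]
    for _ in range(2, n_partials):
        # Each new phase depends only on the two preceding partials.
        phases.append((2.0 * phases[-1] - phases[-2] + offset) % TWO_PI)
    return phases[:n_partials]


# Low-roughness initial condition discussed in the text.
alternating = propagate_phases(6, p=0.0, phi1=0.0, phi2=math.pi / 2)
# High-roughness initial condition discussed in the text.
all_sine = propagate_phases(6, p=1.0, phi1=0.0, phi2=0.0)
print(alternating)
print(all_sine)
```

Up to floating-point rounding, the first call yields phases alternating between 0 and π/2 (alternating sine and cosine terms), and the second yields all zeros (all sine terms), matching the two initial conditions described in the text.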
To test the assumption that these results generalize, comparisons between the p-factor and the resulting roughness (estimated with Leman's model) were performed. The comparison was based on a synthetic signal of ten harmonic partials, where the partial amplitudes $A_k$ were defined by

$A_k = 0.9^k$, (6)

where k is the partial number. The factor 0.9 was selected by subjective experimentation, allowing both low and high partials to be clearly perceived. The relatively low number of partials (10) was selected to reduce the influence of many partials falling within the same critical bands.

6.1 Model based evaluation

Simulations comparing the p-factor with roughness for these complex sounds were conducted at four fundamental frequencies from 25 Hz to 200 Hz. Mean roughness values were calculated from eight different values of $\phi_2$ (in steps of $\pi/4$) to control for any possible dependence on this parameter. The computed mean roughness trajectories are presented for the four frequencies in figure 6. The roughness axis is logarithmic.

Figure 6: Mean roughness, as calculated by Leman's model, for four fundamental frequencies (25, 50, 100 and 200 Hz) as a function of p. Sounds synthesized with 10 harmonic partials.

In figure 6, for the fundamental frequency of 100 Hz, the simulations are seen to result in a near-linear relationship between p-factor and roughness. A similar trajectory was found at the fundamental frequency of 200 Hz, but with an approximately tenfold decrease in roughness. For the fundamental of 50 Hz, a reduced correlation between p-factor and roughness is found. A relation between roughness and p-factor is found for p-factor values in the interval 0.1 to 1. However, the figure shows an abrupt rise in roughness for p-factor values close to zero. For a fundamental of 25 Hz, a correlation between p-factor and roughness is only found for p-factor values in the interval 0.8 to 1.
For values in the range 0.1 to 0.8, an unexpectedly high sensitivity to the $\phi_2$ parameter of the initial phase relationship was found. As was also the case at 50 Hz, there is an abrupt rise in roughness for p-factor values close to zero. A relative insensitivity to the $\phi_2$ parameter was found for all frequencies above 25 Hz. In summary, a correlation between p-factor values and roughness calculations by Leman's model is found for fundamental frequencies between 50 Hz and 200 Hz.
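The evaluation signals of this section can be approximated with a small additive synthesis sketch: ten harmonic partials whose phases follow the propagation formula. The amplitude law $A_k = 0.9^k$ is one reading of equation 6, and the function names, sample rate and duration are illustrative assumptions rather than the settings used for the reported simulations.

```python
import math

TWO_PI = 2.0 * math.pi


def propagated_phases(n_partials, p, phi1=0.0, phi2=0.0):
    """Phases from the propagation formula (assumed form, wrapped to [0, 2*pi))."""
    offset = math.asin(1.0 - 2.0 * p) + math.pi / 2.0
    phases = [phi1, phi2]
    for _ in range(2, n_partials):
        phases.append((2.0 * phases[-1] - phases[-2] + offset) % TWO_PI)
    return phases[:n_partials]


def synthesize(f0, p, n_partials=10, decay=0.9, duration=0.1,
               sample_rate=8000, phi2=0.0):
    """Sum of harmonic sinusoids with amplitudes decay**k (assumed amplitude law)."""
    phases = propagated_phases(n_partials, p, 0.0, phi2)
    n_samples = int(round(duration * sample_rate))
    signal = []
    for i in range(n_samples):
        t = i / sample_rate
        signal.append(sum(decay ** k * math.sin(TWO_PI * k * f0 * t + phases[k - 1])
                          for k in range(1, n_partials + 1)))
    return signal


def rms(signal):
    return math.sqrt(sum(x * x for x in signal) / len(signal))


smooth = synthesize(100.0, p=0.0)  # predicted low roughness
rough = synthesize(100.0, p=1.0)   # predicted high roughness
print(rms(smooth), rms(rough))
```

Changing p alters only the phase spectrum, so the two signals carry the same energy while their waveforms differ; this is the property that lets the parameter act on roughness without changing the amplitude spectrum.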

6.2 Informal listening test

In an attempt to reproduce the model findings of figure 6, an informal listening test was conducted. Five sounds of 3 seconds duration, including linear fade-in and fade-out segments of 0.1 seconds, with partial amplitudes defined by equation 6, were synthesized. The phase relationship of the partials was determined by the p-factor propagation formula, with p-factors of 0, 0.25, 0.5, 0.75 and 1. The initial phase angles $\phi_1$ and $\phi_2$ were set to zero.

Informal listening tests confirmed that the p-factor affects the perception of the tested sound class. However, the effect is on the one hand rather subtle, and on the other hand, although the phase change was always audible, listeners commented that it changed the perceived pitch, sharpness and warmth in addition to the roughness. Viewed from a more positive perspective, it is evident that the p-factor results in consistent changes in timbre. A p-factor of one generally sounds "rough", a p-factor near 0.5 more "warm" or "pleasant", whereas a p-factor of zero generally sounds "sharp".

7 Conclusions

The primary accomplishment of this work is the presented design proposal for a method to control auditory roughness in musical sounds of low fundamental. The design was initiated through a theoretical investigation of perceptual phenomena related to roughness. The synchronization index model (SIM) of roughness (Leman 2000) could account not only for many historical findings in dissonance research, but also for some important findings on the effect of phase on timbre. On this background the model was selected as a suitable simulation reference. Literature studies and simulations with the synchronization index model of roughness led to the insight that roughness is affected by the phase relationship of, for instance, harmonic partials. An efficient propagation formula for the calculation of phase as a function of a given roughness value is proposed.
The p-factor propagation formula shows promising results in model simulations. Its results are well correlated with the results from the SIM model, and informal listening tests have indicated some significance of the propagation formula. The propagation formula is being implemented for the generation of synthetic sounds, where the perceptually important features of phase can potentially be controlled via the p-factor parameter.

References

Andersen, T. H. and K. Jensen (2002). Importance of phase in sound modeling of acoustic instruments. In Proceedings of the Mosart Midterm Meeting. HCO Tryk.

Aures, W. (1985). Ein Berechnungsverfahren der Rauhigkeit. Acustica 58, 268-281.

Fishman, Y. I., I. O. Volkov, M. D. Noh, P. C. Garell, H. Bakken, J. C. Arezzo, M. A. Howard, and M. Steinschneider (2001, December). Consonance and dissonance of musical chords: Neural correlates in auditory cortex of monkeys and humans. Journal of Neurophysiology 86, 2761-2788.

Giguere, C. and P. C. Woodland (1994, January). A computational model of the auditory periphery for speech and hearing research. I. Ascending path. Journal of the Acoustical Society of America 95(1), 331-342.

Helmholtz, H. (1954). On the Sensations of Tone (2nd English ed.). New York: Dover.

Kameoka, A. and M. Kuriyagawa (1969). Consonance theory part I: Consonance of dyads. J. Acoust. Soc. Am. 45(6), 1451-1459.

Leman, M. (2000, December 7-9). Visualization and calculation of the roughness of acoustical musical signals using the synchronization index model (SIM). In Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-00), Verona, Italy. DAFX.

Marentakis, G. and K. Jensen (2002). The timbre engine: progress report. In Proceedings of the Mosart Midterm Meeting, Esbjerg, Denmark.

Patterson, R. D. (1987). A pulse ribbon model of monaural phase perception. J. Acoust. Soc. Am. 82(5), 1560-1586.

Plomp, R. and W. Levelt (1965). Tonal consonance and critical bandwidth. J. Acoust. Soc. Am. 38, 548-560.

Plomp, R. and H. J. M. Steeneken (1969). Effect of phase on the timbre of complex tones. J. Acoust. Soc. Am. 46(2 (part 2)), 409-421.

Pressnitzer, D. and S. McAdams (1999, May). Two phase effects in roughness perception. J. Acoust. Soc. Am. 105(5), 2773-2782.

Schroeder, M. R. (1959). New results concerning monaural phase sensitivity. J. Acoust. Soc. Am. 31, 1579.

Schroeder, M. R. (1970). Synthesis of low-peak-factor signals and binary sequences with low autocorrelation. IEEE Transactions on Information Theory 16, 85-89.

Sethares, W. A. (1993, September). Local consonance and the relationship between timbre and scale. J. Acoust. Soc. Am. 94(3), 1218-1228.

Skovenborg, E. and S. H. Nielsen (2002, September 26-28). Measuring sensory consonance by auditory modelling. In Proc. of the 5th Int. Conference on Digital Audio Effects (DAFX-02), Hamburg, pp. 251-256.

Tind, E. (2004, March). The consonizer effect: Methods for the reduction of roughness in harmonic sounds, based on results of auditory models. Master's thesis, Department of Computer Science, University of Copenhagen.

Zwicker, E., H. Fastl, and H. Frater (1999). Psychoacoustics: Facts and Models (Second ed.). Springer Verlag.