Proceedings ICMC|SMC|2014, 14-20 September 2014, Athens, Greece

Musical Timbre and Emotion: The Identification of Salient Timbral Features in Sustained Musical Instrument Tones Equalized in Attack Time and Spectral Centroid

Bin Wu¹, Andrew Horner¹, Chung Lee²
¹Department of Computer Science and Engineering, Hong Kong University of Science and Technology
²The Information Systems Technology and Design Pillar, Singapore University of Technology and Design
{bwuaa, horner}@cse.ust.hk

ABSTRACT

Timbre and emotion are two of the most important aspects of musical sounds. Both are complex, multidimensional, and strongly interrelated. Previous research has identified many different timbral attributes and shown that spectral centroid and attack time are the two most important dimensions of timbre. However, no consensus has emerged about the other dimensions. This study attempts to identify the most perceptually relevant timbral attributes after spectral centroid and attack time. To do this, we consider various sustained musical instrument tones in which spectral centroid and attack time have been equalized. While most previous timbre studies have used discrimination and dissimilarity tests to understand timbre, researchers have recently begun using emotion tests. Previous studies have shown that attack and spectral centroid play an essential role in emotion perception, and that their effect can be so strong that listeners barely notice other spectral features. Therefore, to isolate the third most important timbre feature, we designed a subjective listening test using emotion responses for tones equalized in attack, decay, and spectral centroid. The results showed that the even/odd harmonic ratio is the most salient timbral feature after attack time and spectral centroid.

1. INTRODUCTION

Timbre is one of the most important aspects of musical sounds, yet it is also the least understood.
It is often simply defined by what it is not: not pitch, not loudness, and not duration. For example, if a trumpet and a clarinet both played 440 Hz (A4) tones for 1 s at the same loudness level, timbre is what would distinguish the two sounds. Timbre is known to be multidimensional, with attributes such as attack time, decay time, spectral centroid (i.e., brightness), and spectral irregularity, to name a few.

Copyright: 2014 Bin Wu, Andrew Horner, Chung Lee et al. This is an open-access article distributed under the terms of a license which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Several previous timbre perception studies have shown spectral centroid and attack time to be highly correlated with the two principal perceptual dimensions of timbre. Spectral centroid has been shown to be strongly correlated with one of the most prominent dimensions of timbre as derived by multidimensional scaling (MDS) experiments [1, 2, 3, 4, 5, 6, 7, 8]. Grey and Gordon [1, 9] derived three dimensions corresponding to spectral energy distribution, temporal synchronicity in the rise and decay of upper harmonics, and spectral fluctuation in the signal envelope. Iverson and Krumhansl [4] found spectral centroid and critical dynamic cues throughout the sound duration to be the salient dimensions. Krimphoff [10] found three physical correlates of the timbre dimensions: (1) spectral centroid, (2) rise time, and (3) spectral flux, corresponding to the standard deviation of the time-averaged spectral envelopes. More recently, Caclin et al. [8] found attack time, spectral centroid, and spectrum fine structure to be the major determinants of timbre through dissimilarity rating experiments. Spectral flux was found to be a less salient timbral attribute in this case.
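As a rough illustration of two of the spectral measures named in this paper, the sketch below computes a spectral centroid and an even/odd harmonic ratio from a list of harmonic amplitudes. The formulas are common textbook formulations chosen here as assumptions (centroid as the amplitude-weighted mean harmonic number, even/odd ratio as a ratio of summed amplitudes); the source text does not state the exact definitions used in the study, and the function names and example amplitudes are hypothetical.

```python
# Minimal sketch, assuming time-averaged harmonic amplitudes are already
# available as a list where index 0 holds harmonic 1 (the fundamental).
# These definitions are illustrative assumptions, not the paper's own.


def spectral_centroid(amplitudes):
    """Return the amplitude-weighted mean harmonic number."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("amplitudes must not sum to zero")
    weighted = sum(k * a for k, a in enumerate(amplitudes, start=1))
    return weighted / total


def even_odd_ratio(amplitudes):
    """Return summed even-harmonic amplitude over summed odd-harmonic amplitude."""
    even = sum(amplitudes[1::2])  # harmonics 2, 4, 6, ...
    odd = sum(amplitudes[0::2])   # harmonics 1, 3, 5, ...
    if odd == 0:
        raise ValueError("odd-harmonic amplitudes must not sum to zero")
    return even / odd


# Hypothetical four-harmonic spectrum with amplitudes halving each harmonic.
example = [1.0, 0.5, 0.25, 0.125]
centroid = spectral_centroid(example)
ratio = even_odd_ratio(example)
```

Under these assumed definitions, two tones can share the same centroid while differing in their even/odd ratio, which is the kind of residual spectral difference that remains once centroid and attack time are equalized.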
While most researchers agree that spectral centroid and attack time are the two most important timbral dimensions, no consensus has emerged about the best physical correlate for a third dimension of timbre. Lakatos and Beauchamp [7, 11, 12] suggested that, if additional timbre dimensions exist, one strategy would be to first create stimuli with identical pitch, loudness, duration, spectral centroid, and rise time, but which are otherwise perceptually dissimilar. Multidimensional scaling of listener dissimilarity data could then potentially reveal additional perceptual dimensions with strong correlations to particular physical measures. Following up on this suggestion is the main focus of this paper.

While most previous timbre studies have used discrimination and dissimilarity tests to understand timbre, researchers have recently begun using emotion tests. Some previous studies have shown that emotion is closely related to timbre. Scherer and Oshinsky found that timbre is a salient factor in the rating of synthetic tones [13]. Peretz et al. showed that timbre speeds up discrimination of emotion categories [14]. Bigand et al. reported similar results in their study of emotion similarities between one-second musical excerpts [15]. Timbre has also been found to be essential to musical genre recognition and discrimination [16, 17, 18]. Eerola