Proceedings ICMC|SMC|2014, 14-20 September 2014, Athens, Greece

The Ghost in the MP3

Ryan Maguire
Virginia Center for Computer Music
[email protected]

ABSTRACT

The MPEG-1 or MPEG-2 Layer III standard, more commonly referred to as MP3, has become a nearly ubiquitous digital audio file format. First published in 1993 [1], this codec implements a lossy compression algorithm based on a perceptual model of human hearing. Listening tests, primarily designed by and for Western European men, and using the music they liked, were used to refine the encoder. These tests determined which sounds were perceptually important and which could be erased or altered, ostensibly without being noticed. What are these lost sounds? Are they sounds which human ears cannot hear in their original contexts due to our perceptual limitations, or are they simply encoding detritus? It is commonly accepted that MP3s create audible artifacts such as pre-echo [2], but what does the music which this codec deletes sound like? In the work presented here, techniques are considered and developed to recover these lost sounds, the ghosts in the MP3, and reformulate these sounds as art.

1. TECHNICAL BACKGROUND

The MP3 standard, designed in the early 1990s by the Moving Picture Experts Group, has become an interesting object of critique in contemporary technology studies [3]. That a standard which subtly reduces the audible quality of sound files has remained in place, despite massively increased bandwidth and storage capacity, is impressive and highlights the foresight (and fortune) of the format's creators. Due to a complex combination of market and social factors, the majority of music listeners today continue to prefer a standard which optimizes the download times and storage capacity of their audio devices [4].
These are often portable machines such as the iPod, on which much listening occurs in noisy environments (gyms, subways, city streets) through (often cheap) earbud headphones and inexpensive preamplifiers. The loss of fidelity from these external factors, along with the cleverness with which MP3s are coded, a socialization to the sound of MP3 files, and other factors have obviated the need for an upgrade to higher fidelity formats for most end users [5]. Regardless, the MP3 is not always the most appropriate format for a given task, and a critical evaluation of the technology and its limitations is warranted. Many listeners today listen exclusively to MP3 files, even in settings where the gains from a higher fidelity format would be clearly perceptible. This lossy compression codec has thus come to dominate unanticipated listening spaces.

Despite its heralded performance in listening tests, the MP3 compression codec does generate audible artifacts and remove perceptible sonic information. MP3 encoding relies primarily on masking curves, used to calculate frequency and temporal masking [10]. By adjusting masking thresholds, more or less information can be removed from the uncompressed audio depending on the desired target file size. At low bit rates, due to sample rate reductions and low pass filtering, frequencies from the extreme edges of the human hearing range are further attenuated. For example, white, pink, and brown noise, when compressed to the lowest possible MP3 bit rate [6], sound very different from the original random signal.

Figure 1. White, Pink, and Brown Noise - Uncompressed WAV.

Figure 2. White, Pink, and Brown Noise - 8 kbps MP3.

Copyright: © 2014 Ryan Maguire. This is an open-access article distributed under the terms of the Creative Commons Attribution License 3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
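The idea that adjusting a threshold changes how much information is removed can be illustrated with a toy example. The following is a minimal sketch and not the MP3 psychoacoustic model: it assumes a single flat threshold, measured in dB below the loudest frequency bin, as a stand-in for a masking curve. The names `two_tone_frame`, `split_by_threshold`, and `threshold_db` are hypothetical and chosen only for this illustration. The discarded part is computed as the difference between the input frame and the part that survives the threshold.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform, O(N^2); adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def two_tone_frame(n=64, loud_bin=4, quiet_bin=10, quiet_amp=0.01):
    """Hypothetical test frame: a loud sine plus a much quieter sine."""
    return [
        math.sin(2 * math.pi * loud_bin * t / n)
        + quiet_amp * math.sin(2 * math.pi * quiet_bin * t / n)
        for t in range(n)
    ]


def split_by_threshold(frame, threshold_db):
    """Split a frame into the part that is kept and the part that is discarded.

    A bin is kept when its magnitude lies within threshold_db of the loudest
    bin. This flat threshold is an assumption made for illustration; a real
    encoder derives frequency-dependent thresholds from masking curves.
    """
    spectrum = dft(frame)
    peak = max(abs(c) for c in spectrum)
    floor = peak * 10 ** (-threshold_db / 20)
    kept_spectrum = [c if abs(c) >= floor else 0 for c in spectrum]
    kept = idft(kept_spectrum)
    # The discarded signal is whatever the thresholding removed.
    discarded = [a - b for a, b in zip(frame, kept)]
    n_kept = sum(1 for c in kept_spectrum if c != 0)
    return kept, discarded, n_kept


if __name__ == "__main__":
    frame = two_tone_frame()
    for threshold_db in (20, 60):
        _, discarded, n_kept = split_by_threshold(frame, threshold_db)
        energy = sum(s * s for s in discarded)
        print(threshold_db, n_kept, energy)
```

In this sketch a narrower threshold keeps fewer bins and leaves more energy in the discarded signal, which is the part one would listen to when looking for what a lossy process removed.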