Loudness, loudness war: resources & bibliography
D. Ward and J.D. Reiss. "Loudness algorithms for automatic mixing." 2nd AES Workshop on Intelligent Music Production, London UK, 13 Sep. 2016.
|Accurate loudness measurement is imperative for intelligent music mixing systems, where one of the most fundamental tasks is to automate the fader balance. The goal of this short paper is to highlight state-of-the-art loudness algorithms to the automatic mixing community, and give insight into their differences when applied to multi-track audio.|
E. Deruty and F. Pachet. "Why is studio production interesting?" Part 2, "Why are limiters interesting?". Tutorial T3 at ISMIR 2016. Columbia University, NYC, USA, Aug. 6th, 2016.
Video on YouTube. Slides and audio example.
Part 2: “Why are limiters interesting?”
Starting at the beginning of the ‘80s and peaking in 2007, the loudness war has shaped the last 30 years of music.
The tutorial follows the “Why X is interesting” series (http://www.flow-machines.com/why-x-is...), which aims at bridging the gap between technology-oriented and music-related research. It suggests a number of reasons why production is important for MIR, seen through the eyes of an expert (the first author) and a MIR researcher (the second one).
M. Mauch, R. M. MacCallum, M. Levy, A.M. Leroi. "The evolution of popular music: USA 1960–2010." Royal Society Open Science, 2015. DOI: 10.1098/rsos.150081.
|In modern societies, cultural change seems ceaseless. The flux of fashion is especially obvious for popular music. While much has been written about the origin and evolution of pop, most claims about its history are anecdotal rather than scientific in nature. To rectify this, we investigate the US Billboard Hot 100 between 1960 and 2010. Using music information retrieval and text-mining tools, we analyse the musical properties of approximately 17 000 recordings that appeared in the charts and demonstrate quantitative trends in their harmonic and timbral properties. We then use these properties to produce an audio-based classification of musical styles and study the evolution of musical diversity and disparity, testing, and rejecting, several classical theories of cultural change. Finally, we investigate whether pop musical evolution has been gradual or punctuated. We show that, although pop music has evolved continuously, it did so with particular rapidity during three stylistic ‘revolutions’ around 1964, 1983 and 1991. We conclude by discussing how our study points the way to a quantitative science of cultural change.|
D. Ward, S. Enderby, C. Athwal, and J. D. Reiss. “Real-Time Excitation Based Binaural Loudness Meters.” 18th Int. Conf. DAFx, 2015.
|The measurement of perceived loudness is a difficult yet important task with a multitude of applications such as loudness alignment of complex stimuli and loudness restoration for the hearing impaired. Although computational hearing models exist, few are able to accurately predict the binaural loudness of everyday sounds. Such models demand excessive processing power making real-time loudness metering problematic. In this work, the dynamic auditory loudness models of Glasberg and Moore (J. Audio Eng. Soc., 2002) and Chen and Hu (IEEE ICASSP, 2012) are presented, extended and realised as binaural loudness meters. The performance bottlenecks are identified and alleviated by reducing the complexity of the excitation transformation stages. The effects of three parameters (hop size, spectral compression and filter spacing) on model predictions are analysed and discussed within the context of features used by scientists and engineers to quantify and monitor the perceived loudness of music and speech. Parameter values are presented and perceptual implications are described.|
E. Deruty and F. Pachet. "The MIR perspective on the evolution of dynamics in mainstream music." Proc. of the 16th Int. Soc. for Music Information Retrieval Conf., Málaga, Spain, Oct. 2015, pp. 722-727.
|Understanding the evolution of mainstream music is of high interest for the music production industry. In this context, we argue that a MIR perspective may be used to highlight, in particular, relations between dynamics and various properties of mainstream music. We illustrate this claim with two results obtained from a diachronic analysis performed on 7200 tracks released between 1967 and 2014. This analysis suggests that 1) the so-called “loudness war” has peaked in 2007, and 2) its influence has been important enough to override the impact of genre on dynamics. In other words, dynamics in mainstream music are primarily related to a track’s year of release, rather than to its genre.|
XLS sheet with the study's data. For the 7200 tracks, contains: year of release, AllMusic URL, AllMusic genre, AllMusic styles, and the values for the dynamic descriptors.
N. Granville-Fall. "The Mastering Loudness War: Can The Effects of Hyper-Compression and Increasing Loudness In Commercially Released and Broadcast Music Be Reduced?" Module CP6017-CASS Dissertation. BSc Music Technology Sound For Media, London Metropolitan University. Date of Submission: 15th April 2015. Supervisors: Lewis Jones and Allan Seago.
|This dissertation examines increasing loudness within music and broadcast by exploring its manipulation through technology and the use of excessive dynamic range compression, hyper-compression. Many consumers lack awareness of this issue and whilst some industry trends cannot be controlled, professionals have an obligation to protect the future enjoyment and preservation of music when its craftsmanship and quality are compromised. The beginning chapters focus on understanding loudness as a powerful sonic attribute, from perceived psychoacoustics to the desire for and function beyond music. Techniques and technology provide an understanding of how loudness is controlled, manipulated and distributed before the loudness war and hyper-compression are considered. Part 1 explores the history of our relationship to loudness and the key reasons for its incremental rise and development. Part 2 is a current analysis showing the problem, examples, academic debate and the commercial function and changing listening habits of consumers. The implications of hyper-compressed music for engineers and consumers are examined, covering technical and socioeconomic aspects before solutions are explored in reducing excessive loudness. A summary of the primary research is presented before ending with a conclusion.|
E. Skovenborg. "Measures of Microdynamics." AES 137th Convention, Oct. 2014.
|Overall loudness variations such as the distance between soft and loud scenes of a movie are known as macrodynamics and can be quantified with the Loudness Range measure. Microdynamics, in contrast, concern variations on a (much) finer time-scale. In this study six types of objective measures—some based on loudness level, some based on peak-to-average ratio—were evaluated against perceived microdynamics. A novel measure LDR, based on the maximum difference between a “fast” and a “slow” loudness level, had the strongest perceptual correlation. Peak-to-average ratio (or crest factor) type of measures had little or no correlation. The ratings of perceived microdynamics were obtained in a listening experiment, with stimuli consisting of music and speech of different dynamical properties.|
R.W. Taylor and W.L. Martens. "Hyper-Compression in Music Production: Listener Preferences on Dynamic Range Reduction." AES 136th Convention, Apr. 2014.
http://www.aes.org/e-lib/browse.cfm?elib=17169 - link to pdf
|Achieving “loud” recordings as a result of hyper-compression is a prevailing expectation within the creative system of music production, sustaining a myth that has been developing since the mid-twentieth century as a consequence of the “louder is better” paradigm. The study reported here investigated whether the amounts of hyper-compression typical of current audio practice produce results that listeners prefer. The experimental approach taken in this study was to conduct a subjective preference test requiring listeners to make a forced choice between seven levels of compression for each of five musical programs that differed in musical genre. The presented seven versions of each musical program were carefully matched in loudness as the versions were varied in compression level, and so differences in loudness per se cannot account for the differences in preferences choices observed between musical programs. In addition, it was found that subject factors such as age group, and speculatively the amount of exposure to different genres, were of considerable influence on listener preferences.|
H. Robjohns "The end of the loudness war?" Sound on Sound (Feb. 2014).
As the nails are being hammered firmly into the coffin of competitive loudness processing, we consider the implications for those who make, mix and master music.
In a surprising announcement made at last Autumn's AES convention in New York, the well-known American mastering engineer Bob Katz declared in a press release that "The loudness wars are over." That's quite a provocative statement — but while the reality is probably not quite as straightforward as Katz would have us believe (especially outside the USA), there are good grounds to think he may be proved right over the next few years. In essence, the idea is that if all music is played back at the same perceived volume, there's no longer an incentive for mix or mastering engineers to compete in these 'loudness wars'. Katz's declaration of victory is rooted in the recent adoption by the audio and broadcast industries of a new standard measure of loudness and, more recently still, the inclusion of automatic loudness-normalisation facilities in both broadcast and consumer playback systems.
In this article, I'll explain what the new standards entail, and explore what the practical implications of all this will be for the way artists, mixing and mastering engineers — from bedroom producers publishing their tracks online to full-time music-industry and broadcast professionals — create and shape music in the years to come. Some new technologies are involved and some new terminology too, so I'll also explore those elements, as well as suggesting ways of moving forward in the brave new world of loudness normalisation.
E. Deruty and D. Tardieu, "About Dynamic Processing in Mainstream Music." Journal of the Audio Engineering Society, Volume 62, Issue 1/2, pp. 42-55, Jan. 2014.
|In this article, we study the evolution of music level and music level variation between 1967 and 2011. To do so, we suggest a set of signal features, and examine the impact of limiters and compressors on a corpus of music tracks. As a result, we find that some of the dynamic processes used during music production may be retrieved from the signal. Then, we examine the evolution of studio practices in relation to dynamic processing during the past five decades, as well as the evolution of the relevant signal features. In particular, this results in a characterization of the loudness war. We find for instance that it may have peaked in 2004, resulted in reduction of peak salience, but did not result in any reduction of long-term musical dynamics.|
Additional info for this paper:
Global descriptors: values for each track of the corpus (RMS, EBU3341, HLSD, CF, EBU3342, PRRC).
Windowed descriptors: Matlab code.
Windowed crest factor: values for each track.
Windowed RMS: values for each track.
Windowed loudness: values for each track.
Windowed PRRC: values for each track.
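As a rough illustration of what two of the windowed descriptors listed above measure, here is a minimal Python sketch of a windowed RMS level and a windowed crest factor (peak-to-RMS ratio), both in dB. It is not the authors' Matlab code: the function name, the window and hop sizes, and the small floor used to avoid log(0) are all assumptions made for this example.

```python
import math

def windowed_rms_and_crest(samples, win=4, hop=4):
    """Return (rms_db, crest_db): one value per analysis window.

    `win` and `hop` are counted in samples and are illustrative defaults,
    not the values used in the paper.
    """
    eps = 1e-12  # floor to avoid log10(0) on silent windows (assumption)
    rms_db, crest_db = [], []
    for start in range(0, len(samples) - win + 1, hop):
        frame = samples[start:start + win]
        rms = math.sqrt(sum(x * x for x in frame) / win)
        peak = max(abs(x) for x in frame)
        rms_db.append(20 * math.log10(max(rms, eps)))
        # Crest factor: how far the peak sits above the RMS level.
        crest_db.append(20 * math.log10(max(peak, eps) / max(rms, eps)))
    return rms_db, crest_db
```

A square wave has a crest factor of 0 dB and a sine wave about 3 dB, so heavier limiting tends to push this value down.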
J. Hjortkjær, M. Walther-Hansen, "Perceptual Effects of Dynamic Range Compression in Popular Music Recordings." Journal of the Audio Engineering Society, Volume 62, Issue 1/2, pp. 37-41, Jan. 2014.
|The belief that the use of dynamic range compression in music mastering deteriorates sound quality needs to be formally tested. In this study normal hearing listeners were asked to evaluate popular music recordings in original versions and in remastered versions with higher levels of dynamic range compression. Surprisingly, the results failed to reveal any evidence of the effects of dynamic range compression on subjective preference or perceived depth cues. Perceptual data suggest that listeners are less sensitive than commonly believed to even high levels of compression. As measured in terms of differences in the peak-to-average ratio, compression has little perceptual effect other than increased loudness or clipping effects that only occur at high levels of compression. One explanation for the inconsistency between data and belief might result from the fact that compression is frequently accompanied by additional processing such as equalization and stereo enhancement.|
A. J. R. Simpson, M. J. Terrell and J. D. Reiss. "A Practical Step-by-Step Guide to the Time-Varying Loudness Model of Moore, Glasberg, and Baer (1997; 2002)." 134th Convention of the Audio Engineering Society, May 2013.
|In this tutorial article we provide a condensed, practical step-by-step guide to the excitation pattern loudness model of Moore, Glasberg, and Baer [J. Audio Eng. Soc., vol. 45, 224–240 (1997 Apr.); J. Audio Eng. Soc., vol. 50, 331–342 (2002 May)]. The various components of this model have been separately described in the well-known publications of Patterson et al. [J. Acoust. Soc. Am., vol. 72, 1788–1803 (1982)], Moore [Hearing, 161-205 (Academic Press 1995)], Moore et al. (1997), and Glasberg and Moore (2002). This paper provides a consolidated and concise introduction to the complete model for those who find the disparate and complex references intimidating and who wish to understand the function of each of the component parts. Furthermore, we provide a consolidated notation and integral forms. This introduction may be useful to the loudness theory beginner and to those who wish to adapt and apply the model for novel, practical purposes.|
D. Pestana, J. D. Reiss, and A. Barbosa, “Loudness Measurement of Multitrack Audio Content using Modifications of ITU-R BS.1770,” 134th AES Convention, 2013.
|The recent loudness measurement recommendations by the ITU and the EBU have gained widespread recognition in the broadcast community. The material it deals with is usually full-range mastered audio content, and its applicability to multitrack material is not yet clear. In the present work we investigate how well the evaluated perception of single track loudness agrees with the measured value as defined by ITU-R BS.1770. We analyze the underlying features that may be the cause for this disparity and propose some parameter alterations that might yield better results for multitrack material with minimal modification to their rating of broadcast content. The best parameter sets are then evaluated by a panel of experts in terms of how well they produce an equal-loudness multitrack mix, and are shown to be significantly more successful.|
Z. Chen and G. Hu, “A Revised Method of Calculating Auditory Excitation Patterns and Loudness for Time-Varying Sounds,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 2012.
|Previously we described a method of calculating auditory excitation patterns and loudness for steady sounds, based on a nonlinear filterbank. Here the method is extended to deal with time-varying sounds. Firstly, the input waveform is transformed to short-term spectrum by a structure with six FFTs, which uses longer signal segments for low frequencies and shorter segments for higher frequencies. Secondly, the excitation patterns are calculated from the short-term spectrum, and the summation of the excitation gives a value for the instantaneous loudness. Thirdly, the short-term loudness is calculated from the instantaneous loudness using an averaging mechanism similar to an automatic gain control system, with attack and release times. Finally the long-term loudness is calculated from the short-term loudness using a similar averaging mechanism, but with longer attack and release times. The method gives good predictions of loudness for both steady sounds and time-varying sounds.|
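The averaging step described in this abstract (an AGC-like smoother with separate attack and release times) can be sketched as a one-pole filter whose coefficient depends on whether the input is rising or falling. This is only a minimal illustration in Python, not the authors' implementation: the function name, the coefficient formula 1 - exp(-T/tau), and any time constants passed in are assumptions for the example.

```python
import math

def smooth_attack_release(values, frame_s, attack_s, release_s):
    """Smooth a sequence of instantaneous loudness values.

    frame_s   : time between successive values, in seconds
    attack_s  : time constant used while the input exceeds the output
    release_s : time constant used while the input is below the output
    All three are caller-supplied; no values from the paper are assumed.
    """
    a_att = 1.0 - math.exp(-frame_s / attack_s)
    a_rel = 1.0 - math.exp(-frame_s / release_s)
    out, prev = [], 0.0
    for v in values:
        # Rising input: follow it quickly (attack). Falling: decay slowly (release).
        a = a_att if v > prev else a_rel
        prev = a * v + (1.0 - a) * prev
        out.append(prev)
    return out
```

Under the same assumptions, a long-term value could be obtained by running the function a second time on its own output with longer attack and release constants, mirroring the two-stage structure the abstract describes.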
J. Serrà, Á. Corral, M. Boguñá, M. Haro & J. Ll. Arcos. "Measuring the Evolution of Contemporary Western Popular Music." Scientific Reports 2, Article number: 521 (2012). DOI: 10.1038/srep00521
|Popular music is a key cultural expression that has captured listeners' attention for ages. Many of the structural regularities underlying musical discourse are yet to be discovered and, accordingly, their historical evolution remains formally unknown. Here we unveil a number of patterns and metrics characterizing the generic usage of primary musical facets such as pitch, timbre, and loudness in contemporary western popular music. Many of these patterns and metrics have been consistently stable for a period of more than fifty years. However, we prove important changes or trends related to the restriction of pitch transitions, the homogenization of the timbral palette, and the growing loudness levels. This suggests that our perception of the new would be rooted on these changing characteristics. Hence, an old tune could perfectly sound novel and fashionable, provided that it consisted of common harmonic progressions, changed the instrumentation, and increased the average loudness.|
N. B. H. Croghan, K. H. Arehart, and J. M. Kates. "Quality and loudness judgments for music subjected to compression limiting." Journal of the Acoustical Society of America, vol. 132, no. 2, pp. 1177–1188, Aug. 2012.
|Dynamic-range compression (DRC) is used in the music industry to maximize loudness. The amount of compression applied to commercial recordings has increased over time due to a motivating perspective that louder music is always preferred. In contrast to this viewpoint, artists and consumers have argued that using large amounts of DRC negatively affects the quality of music. However, little research evidence has supported the claims of either position. The present study investigated how DRC affects the perceived loudness and sound quality of recorded music. Rock and classical music samples were peak-normalized and then processed using different amounts of DRC. Normal-hearing listeners rated the processed and unprocessed samples on overall loudness, dynamic range, pleasantness, and preference, using a scaled paired-comparison procedure in two conditions: un-equalized, in which the loudness of the music samples varied, and loudness-equalized, in which loudness differences were minimized. Results indicated that a small amount of compression was preferred in the un-equalized condition, but the highest levels of compression were generally detrimental to quality, whether loudness was equalized or varied. These findings are contrary to the “louder is better” mentality in the music industry and suggest that more conservative use of DRC may be preferred for commercial music.|
E. Deruty. "La Guerre du Volume: la Musique modifiée par ses canaux de diffusion." Conférence, Journée Science et Musique, Université Rennes 2, Oct. 2012.
Link to video. In French / en Français.
|For thirty years, television and radio channels have been waging a war over sound levels. At a time when this escalation is reputed to harm musical quality, Emmanuel Deruty offers an explanation of this "volume war", its causes and its consequences.|
D. Ward, J.D. Reiss, C. Athwal. "Multi-track mixing using a model of loudness and partial loudness." AES 133rd Convention, San Francisco, CA, 26-29 Oct. 2012.
|A method for generating a mix of multitrack recordings using an auditory model has been developed. The proposed method is based on the concept that a balanced mix is one in which the loudness of all instruments are equal. A sophisticated psychoacoustic loudness model is used to measure the loudness of each track both in quiet and when mixed with any combination of the remaining tracks. Such measures are used to control the track gains in a time-varying manner. Finally we demonstrate how model predictions of partial loudness can be used to counteract energetic masking for any track, allowing the user to achieve better channel intelligibility in complex music mixtures.|
J. Laler. "Perceived Sound Quality of Dynamic Range Reduced and Loudness Normalized Popular Music." Bachelor of Arts Audio Engineering Luleå University of Technology Department of Business, Administration, Technology and Social Sciences, Dec. 2012.
|Compression is used in the music industry to change the level of a signal in order to make it better suited for a specific situation. This might be to control the natural dynamic changes of an instrument at the tracking session in order to make it fit later in the mix. Compression can also be used to increase the loudness of a signal and the development of software dynamic processing tools has undermined the so-called “loudness war”. In broadcasting, the loudness war escalated when commercial interests discovered a loophole in the frequency modulation technique used for radio and television. QPPM meters (quasi-peak program meters) generally have an integration time of 10 ms, making them unresponsive to transients shorter than this. Therefore, headroom was necessary in order not to exceed the maximum level for broadcasting.|
International Telecommunication Union. "Recommendation ITU-R BS.1770-2 - Algorithms to measure audio programme loudness and true-peak audio level." Mar. 2011.
|This Recommendation specifies audio measurement algorithms for the purpose of determining subjective programme loudness, and true-peak signal level.|
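To give a feel for the kind of algorithm this Recommendation specifies, here is a simplified Python sketch of a gated programme-loudness calculation. It is a sketch under stated assumptions, not a compliant meter: the K-weighting pre-filter is omitted (the input is assumed to be already filtered), true-peak measurement is not covered, and the function name and argument layout are invented for the example. The 400 ms blocks with 75 % overlap, the -0.691 offset, the -70 LKFS absolute gate and the -10 LU relative gate follow the published scheme as understood here and should be checked against the Recommendation itself.

```python
import math

def gated_loudness(channels, rate, weights=None):
    """Gated loudness in LKFS of already K-weighted channels (assumption).

    channels : list of equal-length sample lists, one per channel
    rate     : sample rate in Hz
    weights  : per-channel weights; defaults to 1.0 each (assumption)
    """
    weights = weights or [1.0] * len(channels)
    block = int(0.4 * rate)   # 400 ms gating block
    hop = block // 4          # 75 % overlap between blocks
    n = len(channels[0])

    powers = []               # weighted sum of channel mean squares per block
    for start in range(0, n - block + 1, hop):
        p = 0.0
        for g, ch in zip(weights, channels):
            p += g * sum(x * x for x in ch[start:start + block]) / block
        powers.append(p)

    def to_lkfs(p):
        return -0.691 + 10.0 * math.log10(p) if p > 0 else float("-inf")

    # Absolute gate: drop blocks at or below -70 LKFS.
    kept = [p for p in powers if to_lkfs(p) > -70.0]
    if not kept:
        return float("-inf")
    # Relative gate: drop blocks more than 10 LU below the absolute-gated level.
    threshold = to_lkfs(sum(kept) / len(kept)) - 10.0
    kept = [p for p in kept if to_lkfs(p) > threshold]
    return to_lkfs(sum(kept) / len(kept))
```

The gating is what keeps long silences from dragging the programme value down: a tone followed by an equal length of silence measures close to the tone alone rather than about 3 dB lower.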
|The EBU has studied the needs of audio signal levels in production, distribution and transmission of broadcast programmes. It is of the opinion that an audio-levelling paradigm is needed based on loudness measurement. This is described in EBU Technical Recommendation R 128. In addition to the average loudness of a programme (‘Programme Loudness’) the EBU recommends that the measures ‘Loudness Range’ and ‘Maximum True Peak Level’ be used for the normalisation of audio signals and to comply with the technical limits of the complete signal chain as well as the aesthetic needs of each programme/station depending on the genre(s) and the target audience.
In this document the properties of a loudness meter in the so-called ‘EBU Mode’ will be introduced and explained in detail. A set of test signals providing minimum requirements for compliance complements the document.|
|The EBU has studied the needs of audio signal levels in production, distribution and transmission of broadcast programmes. It is of the opinion that an audio-levelling paradigm is needed based on loudness measurement. This is described in EBU Technical Recommendation R 128. In addition to the average loudness of a programme (‘Programme Loudness’) the EBU recommends that the measures ‘Loudness Range’ and ‘Maximum True Peak Level’ be used for the normalisation of audio signals and to comply with the technical limits of the complete signal chain as well as the aesthetic needs of each programme/station depending on the genre(s) and the target audience.
In this document the measure ‘Loudness Range’ and the algorithm for its computation will be introduced and explained in detail.
The algorithm was kindly provided by the company TC Electronic.|
|This document describes in practical detail one of the most fundamental changes in the history of audio in broadcasting; the change of the levelling paradigm from peak normalisation to loudness normalisation. It cannot be emphasized enough that loudness metering and loudness normalisation signify a true audio levelling revolution. This change is vital because of the problem which has become a major source of irritation for television and radio audiences around the world; that of the jump in audio levels at the breaks in programmes, between programmes and between channels (see footnote 2 for a definition of ‘programme’).
The loudness-levelling paradigm affects all stages of an audio broadcast signal, from production to distribution and transmission. Thus, the ultimate goal is to harmonise audio loudness levels to achieve an equal universal loudness level for the benefit of the listener.
It must be emphasised right away that this does not mean that the loudness level shall be all the time constant and uniform within a programme, on the contrary! Loudness normalisation shall ensure that the average loudness of the whole programme is the same for all programmes; within a programme the loudness level can of course vary according to artistic and technical needs. With a new (true) peak level and the (for most cases) lower average loudness level the possible dynamic range (or rather ‘Loudness Range’; see §2.2) is actually greater than with current peak normalisation and mixing practices in broadcasting.
The basis of the concept of loudness normalisation is a combination of EBU Technical Recommendation R 128 ‘Loudness normalisation and permitted maximum level of audio signals’ and Recommendation ITU-R BS.1770 ‘Algorithms to measure audio programme loudness and true-peak audio level’.|
E. Vickers. "The Loudness War: Do Louder, Hypercompressed Recordings Sell Better?" J. Aud. Eng. Soc., Vol. 59 Issue 5, pp. 346-351, May 2011.
|The term “loudness war” refers to the ongoing competitive increase in the loudness of commercially distributed music. While this increase has been facilitated by the use of dynamic range compression, limiting, and clipping, the underlying cause is the belief that louder recordings sell better. This paper briefly reviews some possible side effects of the loudness war and presents evidence questioning the assumption that loudness is significantly correlated to listener preference and sales ranking.|
E. Deruty. "'Dynamic Range' and the Loudness war." Sound on Sound (Sep. 2011).
|We all know music is getting louder. But is it less dynamic? Our ground-breaking research proves beyond any doubt that the answer is no — and that popular beliefs about the 'loudness war' need a radical rethink.|
Joël Girès. "La Loudness War, une évolution collective sans chef d'orchestre. Concurrence généralisée et transformation des chaînes de coopération dans le monde du disque." Mémoire SAE, 2011.
|Professional competition in the recording world takes a surprising form: music producers try to increase their competitiveness by raising the sound level of their musical productions. This loudness war reveals the professional logics at work in the record industry. It first highlights the fundamental uncertainty of musical employment, of which it is the product. It then shows the relations of domination in the sector, the technical standard to be imitated being the hegemonic production of the majors. It also indicates the nature of the chains of cooperation among members of the recording world, which are based on self-referential mechanisms. Finally, it makes visible the aesthetic and technical conventions and their effects on content, through the conflict between artists and support personnel. This work aims to contribute an original, sociological and interpretive reading of the phenomenon, which has so far been the subject only of quantitative measurements.|
J. Boley, M. Lester and C. Danner. "Measuring Dynamics: Comparing and Contrasting Algorithms for the Computation of Dynamic Range." AES 129th Conv. Nov. 2010.
|There is a consensus among many in the audio industry that recorded music has grown increasingly compressed over the past few decades. Some industry professionals are concerned that this compression often results in poor audio quality with little dynamic range. Although some algorithms have been proposed for calculating dynamic range, we have not been able to find any studies suggesting that any of these metrics accurately represent any perceptual dimension of the measured sound. In this paper, we review the various proposed algorithms and compare their results with the results of a listening test. We show that none of the tested metrics accurately predict the perceived dynamic range of a musical track, but we identify some potential directions for future work.|
E. Vickers. "The Loudness War: Background, Speculation and Recommendations." AES 129th Conv. Nov. 2010.
|There is growing concern that the quality of commercially distributed music is deteriorating as a result of mixing and mastering practices used in the so-called “loudness war.” Due to the belief that “louder is better,” dynamics compression is used to squeeze more and more loudness into the recordings. This paper reviews the history of the loudness war and explores some of its possible consequences, including aesthetic concerns and listening fatigue. Next, the loudness war is analyzed in terms of game theory. Evidence is presented to question the assumption that loudness is significantly correlated to listener preference and sales rankings. The paper concludes with practical recommendations for de-escalating the loudness war.|
E. Vickers. "Metrics for Quantifying Loudness and Dynamics." Extra material for the article "The Loudness War: Background, Speculation and Recommendations." AES 129th Conv. Nov. 2010.
|(This material was originally intended as part of the article “The Loudness War: Background, Speculation and Recommendations” but was removed for reasons of scope and to keep that article to a manageable length.) In this paper, I briefly review a variety of metrics for quantifying the loudness and dynamic spread of audio recordings. This review was motivated by the need for objective ways of measuring the effects of hypercompression as used in the loudness war.|
D. Viney. "The Obsession With Compression: A Research Project Dissertation." London College Of Music, Faculty Of Arts, Thames Valley University, Dec. 2008.
The researcher is a mature post-graduate with a first career in the IT industry – he has been involved in music since childhood, initially as a performer (choral & instrumental) and subsequently as a composer & producer/engineer – he has composed, performed, produced & engineered a variety of rock/pop songs and currently sings in a cover band which gigs in pubs & clubs in aid of charity.
During his MA course, the author has been involved in a variety of work experience in the music business and has fully engaged with industry organisations (MPG, APRS, AES etc) – after submission of this dissertation, he will be considering the options of working within the music business or conducting further research.
The original inspiration for this project came from the ‘Audio Production Industry’ module of the course taught by the researcher’s supervisor and, as per the proposal, was ‘An investigation into correlations between certain musical & technical aspects of contemporary ‘popular’ music and its commercial success in the UK’.
E. Skovenborg and T. Lund. "Loudness Descriptors to Characterize Programs and Music Tracks." AES 125th Conv. Oct. 2008.
|We present a set of key numbers to summarize loudness properties of an audio segment, broadcast program or music track: the loudness descriptors. The computation of these descriptors is based on a measurement of loudness level, such as specified by the ITU-R BS.1770. Two fundamental loudness descriptors are introduced: Center of Gravity and Consistency. These two descriptors were computed for a collection of audio segments from various sources, media and formats. This evaluation demonstrates that the descriptors can robustly characterize essential properties of the segments. We propose three different applications of the descriptors: for diagnosing potential loudness problems in ingest material; as a means for performing a quality check, after processing/editing; or for use in a delivery specification.|
S. Michaels. "Death Magnetic 'loudness war' rages on." The Guardian (Oct. 2008).
|Fans sign online petition to get Metallica album remastered. Lars Ulrich responds by shutting his eyes, sticking his fingers in his ears and going 'la-la-la I can't hear you'|
E. Smith. "Even Heavy-Metal Fans Complain That Today's Music Is Too Loud!!!" The Wall Street Journal (Sep. 2008).
|They Can't Hear the Details, Say Devotees of Metallica; Laying Blame on iPods.
Can a Metallica album be too loud? The very thought might seem heretical to fans of the legendary metal band, which has been splitting eardrums with unrivaled power since the early 1980s. But even though Metallica's ninth studio release, "Death Magnetic," is No. 1 on the album chart, with 827,000 copies sold in two weeks, some fans are bitterly disappointed: not by the songs or the performance, but the volume. It's so loud, they say, you can't hear the details of the music. "Death Magnetic" is a flashpoint in a long-running music-industry fight. Over the years, rock and pop artists have increasingly sought to make their recordings sound louder to stand out on the radio, jukeboxes and, especially, iPods.
K. Masterson. "Loudness war stirs quiet revolution." Chicago Tribune (Jan. 2008).
|Bands have turned up volume to get noticed, audio engineers lead battle to crank it down|
A. von Ruschkowski. "Loudness War." In "Systematic and Comparative Musicology: Concepts, Methods, Findings." Schneider, Albrecht (ed.), 2008.
|The terms “Loudness War”, “Level War” and “Loudness Race” describe the phenomenon of a constantly growing loudness of CDs containing popular music in the last two decades. The expression “war” indicates that these terms not only describe the phenomenon itself, but also the negative side effects of increasing loudness. The terms have their origin in web forums, professional journals and books concerning mastering and audio technology, where the topic is heavily discussed since approximately 1999.|
S. Sreedhar. "The Future of Music." IEEE Spectrum, Aug. 2007.
You're listening to your favorite Pink Floyd CD on your home stereo when you accidentally hit the “change CD” button on the control panel. All goes quiet for a bit as your CD player urgently shifts to play whatever is in the next tray. With dread, you desperately reach for the volume knob, but it's too late--your speakers blast the latest Green Day album. Reacting like you were just pricked by a pin, your hand jolts to the volume knob and turns it down. You breathe a sigh of relief. But that's not the end of it. Ten minutes later you feel that something isn't right. Even though you love this album, you can't listen to it anymore. You shut it off, tired, puzzled, and confused. This always seems to happen when you switch from a classic album to a modern one. What you've just experienced is something called overcompression of the dynamic range. Welcome to the loudness war.
The loudness war, what many audiophiles refer to as an assault on music (and ears), has been an open secret of the recording industry for nearly the past two decades and has garnered more attention in recent years as CDs have pushed the limits of loudness thanks to advances in digital technology. The “war” refers to the competition among record companies to make louder and louder albums. But the loudness war could be doing more than simply pumping up the volume and angering aficionados--it could be responsible for halting technological advances in sound quality for years to come.
R. Levine. "The death of high fidelity." Rolling Stone (Dec. 2007).
|David Bendeth, a producer who works with rock bands like Hawthorne Heights and Paramore, knows that the albums he makes are often played through tiny computer speakers by fans who are busy surfing the Internet. So he's not surprised when record labels ask the mastering engineers who work on his CDs to crank up the sound levels so high that even the soft parts sound loud.
Over the past decade and a half, a revolution in recording technology has changed the way albums are produced, mixed and mastered — almost always for the worse. "They make it loud to get [listeners'] attention," Bendeth says. Engineers do that by applying dynamic range compression, which reduces the difference between the loudest and softest sounds in a song. Like many of his peers, Bendeth believes that relying too much on this effect can obscure sonic detail, rob music of its emotional power and leave listeners with what engineers call ear fatigue. "I think most everything is mastered a little too loud," Bendeth says. "The industry decided that it's a volume contest."
M. Zemack. "Implementing Methods for Equal Loudness in Radio Broadcasting." Master of Science Thesis KTH - Skolan för Datavetenskap och kommunikation (CSC), 2007.
|Sound levels are perceived as a growing problem in radio and TV. Quite often, great variations in perceived sound level exist inside a single program or between adjacent programs. Today the broadcaster uses a plethora of media platforms, all with different listener groups. They all have one thing in common; they all want even perceived sound levels. How can a broadcasting company accomplish this? This objective can be achieved by intentional work in consecutive steps. The first step is to assimilate the latest research in this area. The second step is to choose the best measurement method. The third step is to implement this single measurement method in all steps of production. Training and support for all programme producing staff is a must. The fourth step is to implement an automatic gain measurement and correction feature in the metadata of the play out system. The fifth step is that the broadcast company must itself try to control as much as possible of the final dynamic processing. In this paper, the above steps are examined, and some recommendations, large and small, are proposed for Swedish Radio about how their broadcast chain may be improved so that better perceived sound levels are achieved. The methods of measurement that are tested in this report are both Leq(R2LB) and Replay gain. I have also compared the final dynamic processing systems at Swedish radio. Both of these measure methods and the final processing systems, Factum Cadenza together with Orban 8200 work very well. With the use of these tools, Swedish Radio can achieve more even perceived sound levels, which is important to keep and obtain new listeners.|
B. R. Glasberg and B. C. J. Moore. "Development and Evaluation of a Model for Predicting the Audibility of Time-Varying Sounds in the Presence of Background Sounds." J. Audio Eng. Soc., vol. 53, pp. 906-918, Oct. 2005.
|A model for predicting the audibility of time-varying signals in background sounds is described. The model requires the calculation of time-varying excitation patterns for the signal and background, using the methods described elsewhere. A quantity called instantaneous partial loudness (IPL) is calculated from the excitation patterns. The estimates of IPL, which are updated every 1 ms, are used to calculate the short-term partial loudness (STPL) using a form of running average similar to an automatic gain control system. It is assumed that the audibility of the signal is monotonically related to the average value of the STPL over the duration of the signal. In experiment 1 thresholds were measured for detecting a 1-kHz sinusoid in four different samples each of white and pink “frozen” noise. The results were used to determine the average value of the STPL required for threshold. In experiment 2 the model was evaluated by measuring detection thresholds for nine signal types in six backgrounds (54 combinations), using a two-alternative forced-choice task. The backgrounds were chosen to be relatively steady (such as traffic noise). The correlation between the measured masked thresholds and those predicted by the model was 0.94. The root-mean-square difference between the thresholds obtained and those predicted was 3 dB. In experiment 3 psychometric functions were measured for the detection of five signals in five backgrounds (five pairs), using a two-alternative forced-choice task. Experiment 4 used the same signals and backgrounds, but psychometric functions were measured using a single-interval yes–no task. The results of experiments 3 and 4 were used to construct functions relating signal detectability d′ to the average value of the STPL.|
E. Skovenborg and S. H. Nielsen. "Evaluation of Different Loudness Models with Music and Speech Material." 117th AES Convention, San Francisco, CA, USA, October 28–31, 2004.
|The evaluation of twelve models of loudness perception is presented. One of the loudness models is based on a novel algorithm, and another is based on a combination of two known measurement techniques. The remaining models are all implementations of common or standardized loudness algorithms. The ability of each model to predict or measure the subjective loudness of speech and music segments is evaluated. The reference loudness is derived from two listening experiments using the speech and music segments as stimuli. Different statistical measures are employed in the evaluation of the models, so that both the absolute performance of the models and the performance relative to the between-listener disagreement are measured.|
ISO. "Normal equal-loudness-level contours." International Standard ISO 226, International Organization for Standardization, 2003.
|This International Standard specifies combinations of sound pressure levels and frequencies of pure continuous tones which are perceived as equally loud by human listeners. The specifications are based on the following conditions: the sound field in the absence of the listener consists of a free progressive plane wave; the source of sound is directly in front of the listener; the sound signals are pure tones; the sound pressure level is measured at the position where the centre of the listener's head would be, but in the absence of the listener; listening is binaural; the listeners are otologically normal persons in the age range from 18 years to 25 years inclusive.|
B. C. J. Moore, B. R. Glasberg and M. A. Stone. "Why are Commercials so Loud? - Perception and Modeling of the Loudness of Amplitude-Compressed Speech." J. Audio Eng. Soc., Dec. 2003.
|The level of broadcast sound is usually limited to prevent overmodulation of the transmitted signal. To increase the loudness of broadcast sounds, especially commercials, fast-acting amplitude compression is often applied. This allows the root-mean-square (rms) level of the sounds to be increased without exceeding the maximum permissible peak level. In addition, even for a fixed rms level, compression may have an effect on loudness. To assess whether this was the case, we obtained loudness matches between uncompressed speech (short phrases) and speech that was subjected to varying degrees of four-band compression. All rms levels were calculated off line. We found that the compressed speech had a lower rms level than the uncompressed speech (by up to 3 dB) at the point of equal loudness, which implies that, at equal rms level, compressed speech sounds louder than uncompressed speech. The effect increased as the rms level was increased from 50 to 65 to 80 dB SPL. For the largest amount of compression used here, the compression would allow about a 58% increase in loudness for a fixed peak level (equivalent to a change in level of about 6 dB). With a slight modification, the model of loudness described by Glasberg and Moore was able to account accurately for the results.|
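As a rough illustration of the point above that compression lets the rms level rise while the peak level stays fixed, the sketch below compares an amplitude-modulated tone with a compressed copy normalised to the same peak. Everything in it is an assumption made for illustration: the test signal, the helper names, and the static tanh waveshaper, which merely stands in for the fast-acting four-band compressor used in the study. It measures signal levels only and says nothing about perceived loudness.

```python
import math


def rms(samples):
    """Root-mean-square value of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def level_db(value):
    """Level in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(value)


def normalize_to_peak(samples, target_peak=1.0):
    """Scale the samples so that their peak equals target_peak."""
    gain = target_peak / peak(samples)
    return [gain * s for s in samples]


# Hypothetical test signal: a tone whose envelope swells and fades,
# so that most samples sit well below the peak.
n = 4800
am_tone = [
    (0.1 + 0.9 * math.sin(math.pi * k / n) ** 2)
    * math.sin(2 * math.pi * 100 * k / n)
    for k in range(n)
]

original = normalize_to_peak(am_tone)
# Assumed stand-in for a compressor: a static tanh waveshaper that
# boosts low-level samples more than high-level ones.
squashed = normalize_to_peak([math.tanh(4.0 * s) for s in am_tone])

print("peak (dBFS): %.2f vs %.2f" % (level_db(peak(original)), level_db(peak(squashed))))
print("rms  (dBFS): %.2f vs %.2f" % (level_db(rms(original)), level_db(rms(squashed))))
```

Both versions share the same peak, but the compressed one has the higher rms level, which is the headroom that the abstract describes.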
B. R. Glasberg and B. C. J. Moore. "A Model of Loudness Applicable to Time-Varying Sounds." J. Audio Eng. Soc., vol. 50, pp. 331-342, May 2002.
|Previously we described a model for calculating the loudness of steady sounds from their spectrum. Here a new version of the model is presented, which uses a waveform as its input. The stages of the model are as follows. (a) A finite impulse response filter representing transfer through the outer and middle ear. (b) Calculation of the short-term spectrum using the fast Fourier transform (FFT). To give adequate spectral resolution at low frequencies, combined with adequate temporal resolution at high frequencies, six FFTs are calculated in parallel, using longer signal segments for low frequencies and shorter segments for higher frequencies. (c) Calculation of an excitation pattern from the physical spectrum. (d) Transformation of the excitation pattern to a specific loudness pattern. (e) Determination of the area under the specific loudness pattern. This gives a value for the "instantaneous" loudness. The short-term perceived loudness is calculated from the instantaneous loudness using an averaging mechanism similar to an automatic gain control system, with attack and release times. Finally the overall loudness impression is calculated from the short-term loudness using a similar averaging mechanism, but with longer attack and release times. The new model gives very similar predictions to our earlier model for steady sounds. In addition, it can predict the loudness of brief sounds as a function of duration and the overall loudness of sounds that are amplitude modulated at various rates.|
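The two averaging stages described above can be sketched as a one-pole smoother whose coefficient depends on whether the input is rising or falling. This is a minimal sketch, not the authors' implementation: the function name, the frame-based input, and all coefficient values are placeholders assumed for illustration.

```python
def smooth_attack_release(values, attack, release):
    """Running average with separate attack and release coefficients.

    Each coefficient lies in (0, 1]; a larger coefficient means a
    shorter time constant (the output follows the input more quickly).
    """
    smoothed = []
    previous = 0.0
    for value in values:
        # Input above the current estimate: use the attack coefficient.
        # Input at or below it: use the release coefficient.
        coeff = attack if value > previous else release
        previous = coeff * value + (1.0 - coeff) * previous
        smoothed.append(previous)
    return smoothed


# Hypothetical "instantaneous loudness": a 50-frame burst, then silence.
instantaneous = [1.0] * 50 + [0.0] * 50

# Stage 1: short-term estimate (assumed placeholder coefficients).
short_term = smooth_attack_release(instantaneous, attack=0.045, release=0.02)

# Stage 2: long-term estimate, fed by stage 1, with smaller coefficients
# (that is, longer attack and release times). Values are again assumed.
long_term = smooth_attack_release(short_term, attack=0.01, release=0.0005)

print("short-term at end of burst: %.3f" % short_term[49])
print("long-term maximum:          %.3f" % max(long_term))
```

With these placeholder values the short-term estimate builds up during the burst and decays more slowly afterwards, while the long-term estimate lags behind it and never reaches the same maximum.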
E. Vickers. “Automatic Long-term Loudness and Dynamics Matching.” AES 111th convention, NYC, USA, Nov. 2001.
|Traditional audio level control devices, such as automatic gain controls (AGCs) and compressors, generally have little or no advance knowledge of the dynamic characteristics of the remainder of the current audio program. If such advance knowledge is available (i.e., if audio files can be pre-analyzed), it becomes possible to match desired values of overall loudness and dynamics. We introduce two new measures, “long-term loudness matching level” and “dynamic spread,” and present new methods for long-term loudness and dynamics matching.|
E. Zwicker and H. Fastl. Psychoacoustics: Facts and Models. 2nd updated edition. Berlin/Heidelberg: Springer-Verlag, 1999.
|Psychoacoustics – Facts and Models offers a unique, comprehensive summary of information describing the processing of sound by the human hearing system. It includes quantitative relations between sound stimuli and auditory perception in terms of hearing sensations, for which quantitative models are given, as well as an unequalled collection of data on the human hearing system as a receiver of acoustic information. In addition, many examples of the practical application of the results of basic research in fields such as noise control, audiology, or sound quality engineering are detailed. The third edition includes an additional chapter on audio-visual interactions and applications, plus more on applications throughout. Reviews of previous editions have characterized it as "an essential source of psychoacoustic knowledge," "a major landmark," and a book that "without doubt will have a long-lasting effect on the standing and future evolution of this scientific domain."|
B. C. J. Moore, B. R. Glasberg, and T. Baer. "A Model for the Prediction of Thresholds, Loudness, and Partial Loudness." J. Audio Eng. Soc., vol. 45, pp. 224-240, Apr. 1997.
|A loudness model for steady sounds is described having the following stages: 1) a fixed filter representing transfer through the outer ear; 2) a fixed filter representing transfer through the middle ear; 3) calculation of an excitation pattern from the physical spectrum; 4) transformation of the excitation pattern to a specific loudness pattern; 5) determination of the area under the specific loudness pattern, which gives overall loudness for a given ear; and 6) summation of loudness across ears. The model differs from earlier models in the following areas: 1) the assumed transfer function for the outer and middle ear; 2) the way that excitation patterns are calculated; 3) the way that specific loudness is related to excitation for sounds in quiet and in noise; and 4) the way that binaural loudness is calculated from monaural loudness. The model is based on the assumption that sounds at absolute threshold have a small but finite loudness. This loudness is constant regardless of frequency and spectral content. It is also assumed that a sound at masked threshold has the same loudness as a sound at absolute threshold. The model accounts well for recent measures of equal-loudness contours, which differ from earlier measures because of improved control over bias effects. The model correctly predicts the relation between monaural and binaural threshold and loudness. It also correctly accounts for the threshold and loudness of complex sounds as a function of bandwidth.|
J. M. Geringer. "Continuous Loudness Judgments of Dynamics in Recorded Music Excerpts." Journal of Research in Music Education, Vol. 43, No. 1 (Spring, 1995), pp. 22-35.
|This study was designed to investigate loudness judgments of musician and nonmusician listeners in response to performed dynamic changes within a musical context. Ten previously recorded music excerpts selected from diverse examples of music served as stimuli. Subjects listened individually and responded continuously during music examples using the Continuous Response Digital Interface (CRDI) to indicate perceived loudness levels. A three-way analysis of variance revealed that musician subjects indicated a significantly smaller magnitude of dynamic change than did nonmusician subjects. Crescendos were judged as having a significantly greater magnitude of change than decrescendos. There were also differences between the individual excerpts. The obtained relationships between the subjective magnitude of loudness change and the physical magnitude of intensity change were compared to those found in the psychoacoustical literature. Music stimuli in context were perceived somewhat differently than were the pure tone and noise-band stimuli of previous research.|
E. Zwicker. "Procedure for Calculating Loudness of Temporally Variable Sounds." J. Acoust. Soc. Am., vol. 62, no. 3, pp. 675-682, 1977.
|Data on loudness comparisons are complemented by additional measurements. Thus, temporal effects in loudness with respect to four parameters can be summarized: (a) phase effects of complex tones and the influence of physiological noise, both in the low‐frequency range, (b) effects of amplitude modulation, (c) frequency modulation, and (d) bandwidth in the high‐frequency range. Additionally, transient masking patterns and the corresponding specific loudness patterns produced by strongly time‐varying sounds are discussed. A model for the loudness development of such sounds is designed and realized electronically as a loudness measuring device. The usefulness of this equipment is demonstrated for measuring loudness of the following six types of sounds that vary both temporally and spectrally: (a) single tone bursts as a function of duration, (b) sinusoidally amplitude modulated tones as a function of repetition rate, (c) bandpass noise with constant rms value as a function of bandwidth within the critical band, (d) strongly frequency modulated tones as a function of modulation frequency, (e) temporally partially masked tone bursts, and (f) continuous speech.|
A. M. Richards. "Monaural loudness functions under masking." Journal of the Acoustical Society of America, 44(2):599–605, Mar. 1968.
|Monaural sone functions are obtained for a no noise condition and under five levels of masking noise using the method of fractionation. This method precludes the use of both ears in obtaining such functions, as has been the case with dichotic loudness balance and other related procedures. The obtained curves are found to parallel previously found masked functions in one case and in another to show a more rapid acceleration at low levels but identical slopes above one sone. When the power function exponent of a 1000 Hz tone is plotted against overall SPL of a masking noise, a power transformation which parallels that found for speech in noise is obtained. Although no numerical calculations are presented, it appears that above 60 dB of noise the exponent grows as approximately the 0.16 power of the noise.|
D. W. Robinson and R. S. Dadson. "A re-determination of the equal-loudness relations for pure tones." British Journal of Applied Physics, Volume 7, Number 5, p. 166, 1956.
The paper describes a new determination of the equal-loudness relations for pure tones in free-field conditions which has been carried out at the National Physical Laboratory as a result of requests from organizations interested in various aspects of the acoustics of hearing. The equal-loudness contours are of considerable importance in this field, being fundamental to a proper understanding of aural judgments of the loudness of sounds of all kinds. They are also concerned in numerous practical applications in the study of noise.
The first set of contours for free-field conditions was given by Fletcher and Munson in 1933, and a second determination was carried out by Churcher and King in 1937, but these two investigations showed considerable discrepancies over parts of the auditory diagram. The present work has been carried out on a more extensive scale, using a large team of otologically normal persons, and new techniques have been introduced enabling reliable measurements to be made over a wider range of intensity than has hitherto been possible.
The new results cover a range of frequency of from 25 to 15 000 c/s and of sound pressure level up to about 130 dB relative to 0.0002 dyn/cm². The data show a greater degree of regularity than the former results, and allow the equivalent loudness of a pure tone of any frequency to be expressed by formulae quadratic in the sound pressure level, the coefficients varying smoothly with frequency. The results include a new determination of the normal threshold of hearing in free field, which is highly consistent with the equal-loudness contours. At frequencies above 1000 c/s account needs to be taken of variations due to the age of the observers, which become of particular importance at the upper end of the frequency range.
The new results have indicated the causes of some of the discordant features in the earlier determinations and it is hoped that the work will facilitate agreement on a standard set of equal-loudness contours. On account of its relevance to noise measurement, some extension to equal-loudness relations for bands of noise is being undertaken.
H. Fletcher and W. A. Munson. "Loudness, Its Definition, Measurement and Calculation." Bell System Technical Journal, Volume 12, Issue 4, pages 377-430, Oct. 1933.
|An empirical formula for calculating the loudness of any steady sound from an analysis of the intensity and frequency of its components is developed in this article. The development is based on fundamental properties of the hearing mechanism in such a way that a scale of loudness values results. In order to determine the form of the function representing this loudness scale and of the other factors entering into the loudness formula, measurements were made of the loudness levels of many sounds, both of pure tones and of complex wave forms. These tests are described and the method of measuring loudness levels is discussed in detail. Definitions are given endeavoring to clarify the terms used and the measurement of the physical quantities which determine the characteristics of a sound wave stimulating the auditory mechanism.|
Something you want to add? mailto e at emmanuelderuty.com