I recently found a small neon sign on eBay made by RCA in the early 1930s. It guaranteed “Sound Satisfaction” for any moviegoer because RCA equipment was in the house. That was a time when sound really mattered. I lost the auction, but it did get me thinking about what sound quality means today and what’s happened since that sign was made.

It’s hard to fully grasp the impact of recorded sound on human civilization. Little more than a hundred years ago, you had to be there to hear music. Sound was made and consumed instantly. It left no trace except as tradition and written scores. The technology to capture and reproduce sound seemed like magic, as all great technological innovations do, as Arthur C. Clarke famously noted.

The man who really started the revolution in sound reproduction was Thomas A. Edison. His first device, a cylinder phonograph, was not very good, but later he moved into disc recording, the ancestor of what we now know as records, vinyl, LPs, etc. To convince people to buy his new technology, Edison hit upon the brilliant idea of staging thousands of live demonstrations across the US, pitting a performer singing live against a machine playing back the same performer on disc. This was quite successful. Many people could not tell the difference, which you would find impossible to believe if you heard the comparison today. How did he manage to fool people?

One way involved “double blind” testing, meaning the listeners were blindfolded with wide pieces of cloth that covered not just their eyes but also their ears. Another, much more devious method was to have the singers listen to their own records, which were pretty poor in quality, and then sing in imitation of that poor quality (I really can’t fathom how they did that, but I’m no performer). Edison also chose the music carefully, with nothing challenging in terms of dynamics, detail or complexity. Edison himself was hearing impaired, almost deaf. Even more shrewdly, as Sean Olive (seanolive.blogspot.com) notes in his piece on the history of live vs. recorded, Edison thought that:

“People will hear what you tell them to hear”. [1]

“The expectations and perceptions of his listeners were manipulated before the test to produce a more predictable outcome. Audience members were given a concert program before his Tone Tests that clearly told them exactly what they would hear, how amazing it will sound, and what an appropriate response would be.”

Plus ça change…

Another interesting facet of the live vs. recorded demonstrations is that the performer would usually be faking (lip-syncing) over the recorded section of the presentation, which brings the McGurk effect into play:

https://www.youtube.com/watch?v=G-lN8vWm3m0

Your visual sense effectively overrides what your ears are telling your brain when it comes to sound. Humans have incredibly good hearing, as recent important scientific studies have shown (see my previous blog entry, Second Time Around), especially in the most important areas of our frequency range, like the human voice or an infant crying. So you are probably wondering how we can be fooled so easily into thinking we’re hearing something we’re not, or that what we’re hearing is any good. The study of how humans hear, as opposed to how a microphone hears (a very important distinction, as a microphone does not have a brain), is called psychoacoustics, and it answers a lot of these questions.

At the beginning of recorded sound, the scientific emphasis was on both understanding human hearing and making the best possible sound. There was a lot of money involved: think of the telephone system, radio and talking motion pictures, i.e. movies. For the phone system, there were minimum thresholds for reproduction, intelligibility for example. It’s funny that today I often have a hard time understanding what someone is saying on my iPhone 6, because the carriers do everything they can to reduce bandwidth and increase profit. So sound quality remains the same despite technological progress: barely good enough.

For the last four decades or so, a lot of psychoacoustic inquiry has focused on the areas of our hearing that allow us to be fooled. Our hearing is incredibly sophisticated. It’s adaptive, and actually changes depending on the situation you’re in. For example, in a crowded room, like a cocktail party, if you want to hear what someone is saying at a distance, you are able to filter out the surrounding chatter, and your sense of loudness changes as well (this is actually known as the “cocktail party effect”).

Psychoacoustic researchers made many discoveries about how the ear and brain perceive loudness and the arrival of sounds of different intensities, in particular a phenomenon known as masking. We hear sounds actively, meaning our brain does not hear a loud sound or transient the same way in a quiet room as it does in a very loud or noisy environment, and if a quiet sound is followed by a loud one, the loud one will mask the earlier sound (and vice versa). Understanding masking led to ways to “fool” the brain into thinking that a sound was present when it was not. Without knowledge of masking, the MP3 would be impossible. Other research concentrated on the limitations of human hearing, showing that we could not hear past 20 kHz, for example, and it’s true, we can’t, at least not consciously. This was very important for the development of early digital sound technology and the CD, which has a brick wall filter at about 20 kHz that effectively shuts out everything higher in frequency. The MP3 came hot on its heels when a group of German academics figured out how to get rid of most of the music in music. The MP3 algorithm typically works at a ratio of about 12:1: only one part in twelve of an original music file is retained, and the rest is thrown away as being “redundant.” This was all verified by tests using snippets of music. The problem is that no one listens to snippets of music.
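To put rough numbers on that one-in-twelve figure, here is a back-of-the-envelope sketch in Python. It assumes standard CD audio (44.1 kHz sampling, 16 bits per sample, two channels) and treats 128 kbps as a typical MP3 bitrate. Those figures are my assumptions for illustration, not something taken from the research above, and the function names are made up for this example.

```python
# Back-of-the-envelope arithmetic for CD audio vs. a compressed file.
# Assumed values (not from the studies discussed above):
SAMPLE_RATE_HZ = 44_100   # CD sampling rate
BITS_PER_SAMPLE = 16      # CD bit depth
CHANNELS = 2              # stereo


def cd_bitrate_bps():
    """Raw data rate of CD audio in bits per second."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def bitrate_after_compression(ratio):
    """Bitrate left over if only one part in `ratio` is kept."""
    return cd_bitrate_bps() / ratio


def compression_ratio(target_bps):
    """How many times smaller a stream at `target_bps` is than CD audio."""
    return cd_bitrate_bps() / target_bps


def highest_representable_frequency_hz():
    """Half the sampling rate: the ceiling for any frequency a CD can store."""
    return SAMPLE_RATE_HZ / 2


if __name__ == "__main__":
    print("CD bitrate (bps):", cd_bitrate_bps())
    print("Kept at 12:1 (bps):", bitrate_after_compression(12))
    print("Ratio at 128 kbps:", compression_ratio(128_000))
    print("Frequency ceiling (Hz):", highest_representable_frequency_hz())
```

Under those assumptions, a CD streams about 1.4 million bits per second, and keeping one part in twelve leaves a little under 120 kbps, which is in the neighborhood of the 128 kbps files most of us grew up with. The frequency ceiling of 22.05 kHz also shows why the filter has to sit around 20 kHz: there is very little room above it.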

All this research was based on the underlying question: “How bad can we make sound before anyone will notice?” The emphasis in our culture for the last 50 years has been on making sound and music cheaper, more convenient and ubiquitous, and, like fast food, this has created an epidemic of obese hard drives and overall malnutrition in sound quality.

We’ve gotten to the point where any sound better than the earbuds that Apple gives you or the 2” drivers in your TV or computer speaker is now considered high fidelity. When RCA coined that term in the 1930s, it meant twin 15” field coil woofers on a 7’ wide horn for bass, and a pair of field coil compression drivers firing into an 18-cell horn that weighed about 200 lbs. Times have changed…

[1] Andre Millard, “Edison’s Tone Tests and the Ideal of Perfect Sound Reproduction,” from Lost and Found Sounds, NPR. Program for Edison Demonstration.

I Can’t Get No…