Mixing With Your Brain

Psychoacoustics and the Mix Process

Our brains are adept at ignoring enormous swaths of visual and auditory information as we process the world. We are so used to this pre-processing that it rarely occurs to us that we are missing anything. As a result, we are great at ignoring many acoustic problems: reflections, comb filters, level differences and the like are all smoothed over by our brains.

That meat computer in our heads does a tremendous amount of subconscious work to help us understand our world. Most people are familiar with a number of optical illusions that illustrate how what we see is our brains’ carefully curated version of the world around us. The same is true for audio, but we are less aware of how the brain polishes up the sounds that we perceive.

While many individuals in pro audio started behind an instrument on stage, I was exposed to the industry from the technical side from the beginning. As a result, I had to build my vocabulary for activity behind the mixing board from a decidedly non-artistic place. I know musicians and technicians alike who have a totally intuitive approach to the sonic canvas, but I did not have that luxury. Over time, I found that learning about the various psychoacoustic mechanisms helped me think more clearly when it came time to mix, and this month I’d like to share some of that collected wisdom with the FRONT of HOUSE readership. In this article, I will briefly introduce each psychoacoustic concept and then discuss how it influences pushing faders.

Judging Levels

Mixing is stacking up the sonic footprints of various sources in a way that fits the event and the artistry. The most basic layer of this cake concerns the levels of the various sources. One would think the question of “which source is louder” would be simple to answer, but human hearing throws a monkey wrench into this determination. The psychoacoustic wrinkle is that our brains respond to average level, not peak level. This places the mixer in the somewhat strange position of improving the audibility of a source by reducing the peak levels of the signal before raising its average volume. This is the core principle behind the success of compression in mixing. Shaping the volume of the loudest peaks allows the overall average level to be increased, which improves the audibility of overtones and other quieter details. If pushing the fader isn’t giving a source the “cut” you desire, add some compression to damp down the peaks, and then try adjusting the source volume in the mix.
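To put numbers on the peak-versus-average distinction, here is a minimal sketch in Python with NumPy (my own illustration; the article prescribes no particular tool, and the threshold and ratio values are arbitrary choices for demonstration). It measures peak and RMS level, tames the peaks with a simple static compressor, then applies make-up gain so the processed signal peaks at the original level:

```python
import numpy as np

def peak_db(x):
    """Peak level in dBFS."""
    return 20 * np.log10(np.max(np.abs(x)) + 1e-12)

def rms_db(x):
    """Average (RMS) level in dBFS -- closer to perceived loudness."""
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)

def compress(x, threshold_db=-12.0, ratio=4.0):
    """Toy static compressor: samples above threshold are reduced at ratio:1."""
    level_db = 20 * np.log10(np.abs(x) + 1e-12)
    over = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over * (1.0 - 1.0 / ratio)
    return x * 10 ** (gain_db / 20)

fs = 48_000
t = np.arange(fs) / fs
x = 0.1 * np.sin(2 * np.pi * 220 * t)  # quiet sustained tone...
x[::4800] = 0.9                        # ...with sparse loud transients

y = compress(x)
y *= 10 ** ((peak_db(x) - peak_db(y)) / 20)  # make-up gain to the same peak

print(f"dry : peak {peak_db(x):6.1f} dB, RMS {rms_db(x):6.1f} dB")
print(f"comp: peak {peak_db(y):6.1f} dB, RMS {rms_db(y):6.1f} dB")
# Same peak level, but the compressed version carries a much higher
# average (RMS) level -- and the ear judges loudness by the average.
```

The same principle holds for any real compressor: shave the peaks first, and the fader can then bring up the average level the ear actually hears.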

Audibility Threshold

A second wrinkle to the perception of level is that our ears are sensitive not only to the intensity of a sound but also to its duration, at least for transient signals (e.g., music). This means that the ear responds to the overall energy of a signal, not merely its instantaneous intensity. When considering the threshold of audibility for a signal, this effect is known as temporal summation. The concept dovetails nicely with the ear’s perception of volume based on average level, as a longer signal with a high average level will contain more energy than a short transient with a high peak amplitude but low overall energy content. A practical way to bring this effect into play is to use narrow equalization filters at high frequencies. These filters ring in the time domain, lengthening the decay of the signal and helping listeners perceive its “air.”
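To see that ringing in action, the sketch below (Python again; it uses the well-known Robert Bristow-Johnson “cookbook” peaking-EQ formulas, my choice since the article names no specific filter) measures how long a 6 dB boost at 10 kHz keeps ringing as the filter is made narrower:

```python
import numpy as np
from scipy.signal import lfilter

def peaking_eq(f0, gain_db, q, fs):
    """RBJ cookbook peaking-EQ biquad coefficients (b, a)."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a_lin, -2 * np.cos(w0), 1 - alpha * a_lin])
    a = np.array([1 + alpha / a_lin, -2 * np.cos(w0), 1 - alpha / a_lin])
    return b / a[0], a / a[0]

def ring_time_ms(b, a, fs, floor_db=-60.0):
    """How long the filter's impulse response stays above floor_db."""
    impulse = np.zeros(fs)
    impulse[0] = 1.0
    h_db = 20 * np.log10(np.abs(lfilter(b, a, impulse)) + 1e-12)
    return 1000.0 * np.nonzero(h_db > floor_db)[0][-1] / fs

fs = 48_000
for q in (0.7, 4.0, 16.0):
    b, a = peaking_eq(10_000, 6.0, q, fs)
    print(f"Q = {q:4.1f}: rings for ~{ring_time_ms(b, a, fs):5.2f} ms")
# The narrower (higher-Q) the boost, the longer its impulse response
# rings, stretching the decay of whatever passes through it.
```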

Add that Fundamental

Another curiosity of the human auditory system is that it will perceive the presence of a fundamental frequency when only the overtones are present. There are plug-ins (e.g., Waves MaxxBass) and standalone units that actively exploit this behavior to give the impression of more low end in a signal than is actually present. While I have found this a tricky one to implement with channel strip equalization, the guiding principle of using the overtones to bolster the fundamental’s presence, rather than simply boosting the fundamental, is a useful mixing technique.
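The effect is easy to demonstrate by synthesis. This short sketch (a hypothetical illustration, not taken from any of the products mentioned) builds a tone from overtones two through five of a 100 Hz fundamental, then confirms via FFT that the signal contains no energy at 100 Hz; played back, most listeners will nonetheless hear a pitch at 100 Hz:

```python
import numpy as np

fs = 48_000
t = np.arange(fs) / fs  # exactly one second of signal
f0 = 100.0              # the fundamental we will *not* generate

# Sum only the overtones: 200, 300, 400 and 500 Hz.
tone = sum(np.sin(2 * np.pi * k * f0 * t) for k in range(2, 6))
tone /= np.max(np.abs(tone))

# With a 1-second signal, FFT bins land exactly 1 Hz apart.
spectrum = np.abs(np.fft.rfft(tone))
print(f"energy at 100 Hz: {spectrum[100]:.1f}")  # ~0: nothing there
print(f"energy at 200 Hz: {spectrum[200]:.1f}")  # large: first overtone
# The brain reconstructs the missing 100 Hz fundamental from the
# 100 Hz spacing of the overtones.
```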

Stereo Image Tricks

The brain uses a mixture of timing cues and level cues to determine where a sound arrives from. At lower frequencies the timing cues matter most, and at higher frequencies the level cues matter most. The level cues rely in part on the fact that our outer ears (pinnae) shape the frequency response of sounds arriving from the front differently than those arriving from the rear. Reverbs, doublers and all sorts of other effects make use of these behaviors to add width and depth to mixes. When the P.A. coverage and/or spacing doesn’t lend itself to stereo panning via volume level, a useful trick is to instead pull a source to the left or right side of the stereo image using delay. That way, the source is present at full level for all areas of the audience, but will seem panned in those areas where both sides of the P.A. provide coverage. Delay times under 20ms are usually plenty to make this trick work.
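A minimal sketch of the delay-panning trick follows (the 10 ms figure is just one reasonable choice inside the under-20 ms window noted above). The source is pulled to one side by delaying the opposite channel, with both channels left at full level:

```python
import numpy as np

def delay_pan(mono, fs, delay_ms=10.0, toward="left"):
    """Pull a mono source toward one side by delaying the *other*
    channel (the precedence effect); levels stay untouched."""
    d = int(round(fs * delay_ms / 1000))
    lead = np.concatenate([mono, np.zeros(d)])
    lag = np.concatenate([np.zeros(d), mono])
    if toward == "left":
        return np.stack([lead, lag], axis=1)  # left leads, right lags
    return np.stack([lag, lead], axis=1)      # right leads, left lags

fs = 48_000
t = np.arange(fs) / fs
source = 0.5 * np.sin(2 * np.pi * 440 * t)

stereo = delay_pan(source, fs, delay_ms=10.0, toward="left")
# Both channels carry the full-level signal, so listeners covered by
# only one side of the P.A. lose nothing -- but listeners who hear
# both sides localize the source to the earlier (left) channel.
```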

Masking — The Big One

The bits above are merely appetizers for the discussion of auditory masking, a topic that could easily be its own article. Masking refers to a process by which one sound renders another less audible or even inaudible. It can occur when two sounds occupy similar frequencies (simultaneous masking) or arrive at similar points in time (temporal masking); a later-arriving sound can even mask an earlier one. Masking behavior is fairly well understood, and is used extensively in the creation of lossy audio files (e.g., AAC or MP3). These formats use the known behavior of the human auditory system to decide how accurately each element of the sound must be stored, based on how audible the storage artifacts will be. The music acts as a masking agent for the storage artifacts.
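To get a feel for how such codecs reason about masking, here is a toy estimate of simultaneous masking built on the Bark frequency scale and the classic Schroeder spreading function. Real codec models are far more elaborate, and the fixed 10 dB offset below is purely an illustrative assumption:

```python
import numpy as np

def bark(f):
    """Zwicker's approximation mapping frequency (Hz) to Bark."""
    return 13 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500) ** 2)

def spreading_db(masker_hz, probe_hz):
    """Schroeder spreading function: how far the masker's influence
    has fallen off at the probe frequency (0 dB at the masker)."""
    dz = bark(probe_hz) - bark(masker_hz)
    return 15.81 + 7.5 * (dz + 0.474) - 17.5 * np.sqrt(1 + (dz + 0.474) ** 2)

def is_masked(masker_hz, masker_db, probe_hz, probe_db, offset_db=10.0):
    """Toy check: the probe is inaudible if it sits below the masker's
    spread level minus a fixed offset (real models refine this)."""
    threshold = masker_db + spreading_db(masker_hz, probe_hz) - offset_db
    return probe_db < threshold

# A loud 80 dB tone at 1 kHz vs. a 50 dB probe moving away in frequency:
for probe_hz in (1100, 1500, 3000):
    print(probe_hz, "Hz masked:", is_masked(1000, 80, probe_hz, 50))
# Nearby probes disappear under the masker; the distant one survives.
```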

Much of the practice of mixing is developing intuition for how the sources will mask each other in the context of the overall mix. Mixing stops being about merely presenting the individual sources with no concessions to the whole, and instead becomes an exercise in managing how those sources interact. The intuitive mix engineers I know grasped masking effects almost immediately, even if they don’t use engineering language to describe what they experience. But for more technically minded people (like myself), understanding masking can be a revelation.

Masking explains why a collection of sources that individually sound rich and full are a complete mess when all the faders are up. Masking explains why the bass player and guitar player keep turning up their respective amps at practice. Masking can also explain why when someone says “more vocals” in the monitor, they often mean “less piano.” Masking explains why arrangement is so important to musical composition, and why dense, wideband instruments (e.g., electric guitar) are often panned to the edges of a mix.

Some Key Issues to Consider

What are some practical takeaways for mixing that result from this all-important masking?

•   Shaping source tone in isolation is often a fool’s errand. It is easy to spend a lot of time getting a tone that sounds interesting soloed, only to have it wash out completely in the mix. The solo button is for line check and emergencies, not for regular mixing.

•   It is okay to have instruments sound “weird” by themselves, with elements either overly prominent or restrained to make them balance in the overall picture.

•   Mixing is about removing elements that don’t contribute to the central motif of a given source. Reduce the palette of one source for the benefit of the whole mix.

•   Not all sources need to be prominent at all times.

•   Masking can be your friend, exposing or hiding things with a small move of the fader. This is especially useful on sources that accentuate the overall mix, but are not consistently front and center.

•   Learning to hear for mixing is as much about identifying the tones of each source as it is about figuring out how they step on each other.

•   Instrument tone cannot be a purely static entity independent of the dynamics of the music. As the intensity ebbs and flows, different sources will crowd each other out in different ways.

•   There is nothing wrong with aggressive, high-touch mixing if that is what the sonics in question require to ensure sources are not tripping over each other.

Grasping that mixing is not just about hearing the individual sources, but also about how they step on each other’s toes, was a personal revelation. It opened doors to do things that felt “wrong” to my technical bent, and explained why those things were effective.

Closing Thoughts

One additional important psychoacoustic effect not given prominence here is that the perception of tone changes with volume. This, of course, complicates everything above regarding masking: the frequencies involved and the influence of masking will change depending on the playback level. Sources that separate clearly at one volume may no longer be distinct as the dynamics of the performance and the mix ebb and flow.

As long as we are dealing with the human hearing process — which, even with the best ears, exhibits something far from a “flat” response — masking will be with us. As mix engineers and audio pros, it behooves us to be aware of the effects of sound masking in every form. These range from the simplest examples, such as the obnoxious buzz from that cheap turned-up-to-11 guitar stompbox that is painful to hear while the stage is quiet, yet is magically masked when the full band kicks in, to infinitely more complex cases involving room-to-ear interactions, cancellation/summation of various frequencies in different locations, and the ear-to-brain connection and how we ignore or choose to home in on different sounds.

We could spend the next year discussing such concepts in detail and still barely scratch the surface. However, the key point here is to be aware of the effects of masking and occasionally look at mixing from a slightly different viewpoint, where — like cooking — sometimes it’s what you don’t put in the recipe that creates the magic.

 

The now-classic Fletcher-Munson curves (shown here) demonstrated the non-linearity of human hearing.

Fletcher-Munson Loudness Curves: The Other Side of How We Hear

Some 82 years ago, Harvey C. Fletcher and Wilden A. Munson — two Bell Labs engineers studying various aspects of subjective loudness — forever changed the way in which the world understands the hearing process. Their research asked a large number of subjects to compare the loudness of test tones against a standard 1 kHz tone at a set level. Averaging the results collected from the group, Fletcher and Munson mapped the sensitivity of human hearing at various frequencies.

In a landmark paper published in the October 1933 edition of the Journal of the Acoustical Society of America, Fletcher and Munson showed that hearing is frequency selective; more specifically, hearing is most sensitive to pure tones in the 3,000 to 4,000 Hz range and less so above and below that range. To perceive a 100 Hz signal as equal in loudness to a 3,000 Hz tone requires an actual SPL for the 100 Hz tone that is much higher than that of the 3 kHz tone, particularly at low volumes.

The phenomenon became known as “equal-loudness contours,” and although the original research was later updated and refined (most notably by D. W. Robinson and R. S. Dadson in 1956), Fletcher and Munson’s pioneering work laid the groundwork for industry-standard measurement curves, from the classic A/B/C/D-weighting filters to the current ISO 226:2003 standard.
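For the technically curious, the A-weighting curve mentioned above, itself descended from 40-phon equal-loudness data, can be computed directly from the standard formula (this sketch follows the IEC 61672 definition):

```python
import math

def a_weighting_db(f):
    """A-weighting response in dB at frequency f (Hz), per IEC 61672."""
    ra = (12194.0 ** 2 * f ** 4) / (
        (f ** 2 + 20.6 ** 2)
        * math.sqrt((f ** 2 + 107.7 ** 2) * (f ** 2 + 737.9 ** 2))
        * (f ** 2 + 12194.0 ** 2)
    )
    return 20 * math.log10(ra) + 2.00  # normalized to 0 dB at 1 kHz

for f in (100, 1000, 3000):
    print(f"{f:5d} Hz: {a_weighting_db(f):+6.1f} dB")
# 100 Hz reads roughly -19 dB while 1 kHz reads 0 dB, mirroring the
# ear's reduced low-frequency sensitivity that Fletcher and Munson mapped.
```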

On a side note, Wilden Munson continued his acoustical research at Bell Labs until retiring in 1962. Harvey Fletcher worked on a number of projects at Bell Labs, including the development of a vacuum tube-based hearing aid, and helped found the Acoustical Society of America, serving as its first president. Even today, an entire industry owes a debt of gratitude to these audio pioneers, whose names will live on.

—George Petersen