Hello everyone. This is the first post in a three-post series about basic stereo recording techniques. The series is made up of a few chapters, which have been split across the three posts. Some will be longer than others, but it should be a nice way to present the topic, as you can bookmark specific chapters if you like.
This post describes how the stereo image is created and how it relates to our hearing system. Post 2 highlights a number of sonic aspects of the stereo image which should be considered when recording and also when mixing. Post 3 outlines the different categories of stereo recording arrays, with some examples.
Overall, this guide aims to take readers through the creation, manipulation and recording of the stereo image across a variety of musical genres. It should be treated as a primer. Recommended reading will be suggested throughout the series. If you have any questions or you spot any errors, do not hesitate to get in touch with me at firstname.lastname@example.org.
More often than not, audio productions will be created for stereo playback. The stereo image is created by the speakers in a playback system such as your television, car radio, headphones or a high-end audio production setup. The ideal stereo listening arrangement is shown in Figure 1‑1, where the distance between the two speakers is the same as the distance from each speaker to the listener.
where A = B = C
Figure 1‑1 – Ideal Stereo Listening Position
To place a sound in the centre of the image, the sound is played equally from each of the two speakers. To make the sound come from the left, the amplitude of the sound is increased in the left speaker and attenuated in the right. A sound which is fully panned to the left will not come out of the right speaker at all. This concept of amplitude based panning is shown in Figure 1‑2 where three sources are panned centre, mostly left and totally right. The perceived position of the panned sounds is represented by the coloured dots.
Figure 1‑2 – Sound Source Placement when Panning
It is important to note the existence of pan-pot laws. When a sound is played in the full left position, 100% of its signal is played out of the left loudspeaker. If panning to the centre simply sent the full signal to both loudspeakers, 100% of the signal would be played out of each loudspeaker. The acoustical summing would result in up to a 6 dB rise in SPL at the listening position. This would mean that a sound panned from full left to full right would sound much louder in the middle.
Pan-pot laws deal with this by attenuating the signal in the analogue circuitry of the mixing console and in their DAW equivalents. As the pan-pot is adjusted from full left to centre, the circuitry will attenuate the signal more and more until it reaches the centre. As the pan-pot is turned from centre to full right, the attenuation is lessened until there is none at the full right position. The pan-pot law allows for fader levels and perceived listening levels to remain the same, regardless of the pan-pot position.
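To make the idea concrete, here is a minimal sketch of one common pan-pot law, the constant-power (sine/cosine) law, which attenuates each channel by about 3 dB at the centre. This is an illustration only: consoles and DAWs differ in how much centre attenuation they apply, and the function names and the -1.0 to +1.0 position range are my own assumptions rather than any particular product's behaviour.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position.

    position is a hypothetical control range: -1.0 is full left,
    0.0 is centre and +1.0 is full right.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto a quarter circle:
    # 0 radians is full left, pi/2 radians is full right.
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def gain_to_db(gain):
    """Convert a linear gain to decibels.

    Gains that are effectively zero are reported as minus infinity,
    which also hides floating-point residue at the extreme positions.
    """
    if gain < 1e-12:
        return float("-inf")
    return 20.0 * math.log10(gain)


for name, position in [("full left", -1.0), ("centre", 0.0), ("full right", 1.0)]:
    left, right = constant_power_pan(position)
    print(f"{name:>10}: L = {gain_to_db(left):6.1f} dB, R = {gain_to_db(right):6.1f} dB")
```

With this law the squared gains of the two channels always add up to one, so the total power sent to the loudspeakers stays constant as the pan-pot is turned. Other laws use a different amount of centre attenuation, up to the full 6 dB discussed above.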
The above describes how multi-tracked audio is panned. When recording with a stereo recording technique, however, a variety of factors automatically pan and place the sounds anywhere from the left through to the right. The selection and implementation of a stereo recording technique is exceptionally important, as the automatic nature of the panning means mistakes can rarely, if ever, be fixed in the mix.
The human hearing system uses two concepts to process a sound's directional cues, which are pieces of information in a sound that our brains use to work out where the sound is coming from. This information can be used to fool listeners of a recording into thinking that there are sound sources in front of them, coming from a variety of directions.
As an example, consider an emailing system. When you send an email, the text is converted into digital data, which would generally look like gibberish if you were to read it. When the email reaches the recipient, their email client will convert the gibberish back into text so it can be read exactly as the writer intended.
By recording the information sent to the microphones by the musicians, the information can be stored and replayed over loudspeakers at a later date, in the expectation that it sounds just as it would have if you had been present for the recording in the first place.
The key to this ‘musical email’ is the sonic information which is recorded by the two microphones. Where a single microphone can be seen as ‘just’ recording the instrument and the space which the instrument is in, two microphones record the relationship between the sound sources and each of the microphones, just like our ears do. This information is then replayed over loudspeakers, which gives us a similar impression of the sound as if we were in the room.
The Interaural Time Difference (ITD) is the difference in time between a sound reaching one ear and reaching the other. All sound takes a length of time to travel from A to B. In the case of humans, A and B are the two ears, which are 17 to 18 cm apart on average, and the sound must also travel the extra distance around the listener's head to reach the far ear. The brain will process the timing differences to derive directional cues.
Identical arrival times for both ears will be processed such that the sound is either in front of or behind the listener. Arrival times which are only slightly different will place the sound only slightly left or right of centre, and so on until sounds are hard left or right. It should be noted that the directional processing includes analysing the phase relationship of the sound between the two ears, which means the ITD works up until approximately 700 Hz.
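To give a feel for the size of these timing differences, here is a rough sketch using the Woodworth approximation, a simplified spherical-head model which accounts for the extra path around the head. The head radius (half of an assumed 17.5 cm ear spacing), the speed of sound and the function name are my own assumptions for illustration, not measured values.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second, assumed for air at about 20 degrees C
HEAD_RADIUS = 0.0875    # metres, half of an assumed 17.5 cm ear spacing


def itd_seconds(azimuth_degrees):
    """Estimate the interaural time difference for a distant source.

    Uses the Woodworth spherical-head approximation. The azimuth runs
    from 0 (directly in front) to 90 (directly to one side).
    """
    if not 0.0 <= azimuth_degrees <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_degrees)
    # The sin term is the straight-line part of the extra path,
    # the theta term is the part that wraps around the head.
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


for azimuth in (0, 30, 60, 90):
    print(f"{azimuth:2d} degrees: {itd_seconds(azimuth) * 1e6:6.1f} microseconds")
```

Under these assumptions, a source directly in front gives no time difference at all, while a source fully to one side gives roughly 0.66 milliseconds, which shows how small the differences are that the brain works with.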
The Interaural Intensity Difference (IID) is the difference in amplitude of a sound reaching each ear. Simply speaking, the very fact that there is a head in between the ears means that sound is attenuated and manipulated by the head itself, by as much as 20 dB. The brain uses this sound intensity difference as a way to determine a sound's direction by analysing where the sound is loudest.
Identical amplitude levels will be processed such that the sound is either in front of or behind the listener. Amplitude levels which are only slightly different will place the sound only slightly left or right of centre, and so on until sounds are hard left or right.
It should be noted that in order to significantly attenuate a frequency with an object, the object must be approximately two-thirds the length of the wave being attenuated. As the human head is around 17 to 18 centimetres across, the IID will lose effectiveness below approximately 600 Hz. It can now be seen that the two Interaural differences cross over each other but mostly operate in separate ranges of the audio spectrum.
For a sound directly in front of or behind a listener, the time and intensity differences would not provide useful information to determine direction. The brain uses an additional method to help determine direction.
Sound which comes from in front of us will be manipulated in a very specific way compared to sound coming from behind. The shape and size of our ears will cause filtering which our brain becomes very familiar with over the course of childhood and a lifetime. How the sound reacts with our face will also sound different to how it reacts with our hair, or lack of it. Even our upper bodies can have a significant influence on how sound is perceived.
A simple and natural way of dealing with front and rear sounds is to tilt or move your head in some way to force the Interaural differences to become more dominant, which is much like what animals do when presented with an intriguing sound. A dog's ability to move its ears is also a very powerful way of analysing sounds without taking its eyes off a target or danger.
With the Interaural differences coupled together, the human hearing system is a remarkably powerful microphone system which supplies signals to our brains, which then turn those signals into something we can perceive in a variety of ways. One way to show how sensitive and powerful our hearing system is would be to consider that 24 photos per second are enough to trick our eyes into seeing movement on a television screen, whereas our ears require at least 40,000 ‘frames’ per second, or samples, in order to perceive digital sound properly.
Stereo recording techniques make use of one or both of the Interaural concepts, which means that engineers deal with the core listening principles of a very powerful hearing system. Depending on which concept a recording technique primarily uses, the type of sound it produces will differ, sometimes dramatically. It is these differences that engineers use to choose and justify one technique over another for a given recording session.
1.2 The Next Post
The next post will deal with what the stereo image is and outline a number of criteria or image characteristics which musicians and engineers should appreciate when creating a stereophonic production.
If the hearing concepts outlined in this post interested you, I recommend the ‘Acoustics and Psychoacoustics’ textbook by Professors Jamie Angus and David Howard.