Multi-microphone recording arrays use two or more microphones for the recording of acoustic musical sources for stereo and surround sound productions. Many configurations are available to the recording engineer. The basic layout of a Multi Microphone Array is to use one microphone per speaker in the desired playback system.
A two microphone stereo array would be implemented in a similar way to the orange and green segments of Figure 1.1 where the left and right microphone signals are routed directly to the left and right playback channels respectively.
Some stereo arrays make use of a third microphone placed in the centre to improve the phantom centre imaging and stereo to mono compatibility (Eargle, 2005, p. 173). This implementation is denoted in Figure 1.1 by the dotted blue line where the centre microphone signal is routed in equal amounts to the left and right playback channels. In 5.1 surround sound production, the centre microphone would instead feed the dedicated centre loudspeaker which is denoted by the solid blue line. This centre signal is also used to contribute to centre image stability (ITU, BS.775-3, 2012, p. 7).
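The stereo routing of the centre microphone described above can be sketched in a few lines. This is a minimal illustration only: the function name and the `centre_gain` value are hypothetical and are not taken from the cited sources.

```python
def route_centre_to_stereo(left, right, centre, centre_gain=0.5):
    """Mix one centre-microphone sample equally into the left and right channels.

    centre_gain is a hypothetical mix level chosen for illustration; in practice
    the engineer would set it by ear.
    """
    left_out = left + centre_gain * centre
    right_out = right + centre_gain * centre
    return left_out, right_out


# The same amount of centre signal is added to both playback channels.
print(route_centre_to_stereo(1.0, 0.0, 1.0))
```

Because the centre contribution is identical in both channels, it reinforces the phantom centre without shifting the image to either side.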
Because they use established microphone technology, Multi Microphone Arrays have been perfected for use in a variety of situations, and recording engineers often have a preferred array for a given situation based on their sonic aims for the recording project (Moylan, 2007, pp. 261 – 274).
The Soundfield microphone allows users to record the 360° sound scene which surrounds it using a single contained unit (Craven & Gerzon, 1977). The microphone produces four signals, which are combined by a decoder to create the surround sound image. The processing required to create the component signals of a stereo or surround sound production can take place within the engineer’s digital audio workstation at any point after the recording takes place (Harpex, 2014; Soundfield Ltd., 2014).
This allows the recording engineer to revisit the recorded material and process it in a different way to achieve the best sonic results for a production in return for minimal setup effort. This is in contrast to the engineer in a Multi Microphone Array context, who must ensure that the array has been set up perfectly with respect to the angles and spacing between the individual microphones, as errors cannot be fixed in the mix.
The Soundfield MKV system (Soundfield Ltd., 2013) costs in the region of £5750 incl. VAT at the time of writing (HHB Communications Ltd, 2014) while a set of professional quality microphones such as the AKG C414 XLII can cost around £4795 incl. VAT for a set of five required for surround sound recording (DV247, 2014). Although many may find the price difference to be significant, recording engineers may also experience practical and value for money differences between the techniques which could impact on their ultimate choice when purchasing a system.
The set of five AKG C414 microphones, or similar, can be redeployed across multiple instruments in a variety of studio and live recording scenarios, such as close microphone techniques on individual instruments. The Soundfield system would not be capable of this as it is a single contained unit. The versatility of the multi microphone approach would greatly benefit recording engineers in terms of available recording options and value for money unless there were clear sonic advantages of using the Soundfield system with respect to audio production quality.
This project sets out to compare the two recording options in order to establish whether the Soundfield system is capable of producing the clear sonic advantages required for it to be recommended to engineers. The remainder of this section will describe the background of microphone operation in order to make an initial comparison between the recording methods. A literature review in Section 2 will establish the requirement for research into this area and highlight the elements required to construct a robust production and testing methodology to investigate these initial comparisons. Section 3 will describe the production of the test material and Section 4 will describe the testing phase of the project. The methods of statistical analysis will be discussed in Section 5 with the results, discussion, conclusions and avenues for further work discussed in Sections 6, 7, 8 and 9 respectively.
Polar patterns are the basis of microphone sound pickup. Some of these are detailed in Figure 1.2. Each pattern has its own pickup characteristics when used in music recording. An omnidirectional microphone will pick up sound equally from all directions, meaning that the direct sound from a source will be picked up along with the reflections it creates in a space. Cardioid microphones will pick up the direct sound from the source while rejecting a proportion of the reflected sound, which results in better intelligibility of the direct sound. If the cardioid microphone is not picking up enough ambient sound for the production, a figure-of-8 pattern can provide a better ratio of direct to ambient sound (Eargle, 2005, pp. 7 – 21).
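The three patterns named above can be compared numerically with the standard first-order microphone equation, gain = a + (1 − a)·cos(θ). The sketch below assumes ideal textbook capsules; the weighting values and function name are illustrative and are not taken from Figure 1.2.

```python
import math

# Omnidirectional weighting "a" for each ideal first-order pattern
# (assumed textbook values, not measurements of a real microphone).
PATTERNS = {"omni": 1.0, "cardioid": 0.5, "figure-of-8": 0.0}


def polar_gain(pattern, angle_deg):
    """Relative sensitivity of an ideal first-order microphone at angle_deg off axis."""
    a = PATTERNS[pattern]
    return a + (1.0 - a) * math.cos(math.radians(angle_deg))


# Compare front (0), side (90) and rear (180) pickup for each pattern.
for name in PATTERNS:
    print(name, [round(polar_gain(name, angle), 2) for angle in (0, 90, 180)])
```

A negative gain for the figure-of-8 pattern indicates that the rear lobe is picked up with inverted polarity, which is the property the Mid/Side matrix relies on.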
Advances in technology have allowed for improvements in microphone characteristics such as signal to noise ratio and frequency response (Eargle, 2005, pp. 1 – 6). Despite these advances, the basic concept of operation has remained the same, meaning that great care is required to ensure the recording array has been set up perfectly with respect to the geometry and interplay between microphones, as there will not be any avenue for fixing mistakes in these areas during editing or mixing.
Recording arrays for classical music are designed to make use of the human hearing system’s ability to locate a sound source’s direction; in other words, how the brain determines a sound source’s directional cues. Figure 1.3 shows the paths of direct sound between a sound source and the ears of a listener. By processing the difference in the arrival time of the sound at each ear, known as the interaural time difference, the brain can determine the source’s directional cues (Howard & Angus, 2009, pp. 107 – 111).
Additionally, the signal coming into the right ear is being attenuated by the physical presence of the listener’s head. This creates a sound intensity difference between what the left and right ears receive which can be processed by the brain to determine the sound source’s directional cues. This is called the interaural intensity difference (Howard & Angus, 2009, pp. 107 – 113).
Each interaural difference operates in a different area of the human hearing spectrum. Interaural time differences operate at frequencies below 700 Hz and interaural intensity differences operate at frequencies above 2.8 kHz, while both are at work in the crossover region from 700 Hz to 2.8 kHz (Howard & Angus, 2009, pp. 112 – 113).
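The frequency ranges quoted above can be expressed as a small lookup. The function name and return labels are hypothetical; the thresholds are the 700 Hz and 2.8 kHz figures given in the text.

```python
def dominant_cue(frequency_hz):
    """Return which interaural cue is at work at a given frequency.

    Thresholds follow the ranges quoted in the text: time differences below
    700 Hz, intensity differences above 2.8 kHz, and both in between.
    """
    if frequency_hz < 700:
        return "time difference"
    if frequency_hz > 2800:
        return "intensity difference"
    return "both"


for f in (200, 1000, 5000):
    print(f, "Hz:", dominant_cue(f))
```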
All recording arrays have been designed to use one or both of these interaural differences as a basis of how they capture the sound image as they have an impact on the characteristics of the image when played back on a standard stereo or surround sound system (Eargle, 2005, pp. 168, 174 – 175). These characteristics are considered by recording engineers in their array choice to meet the aims of a given production.
Multi microphone techniques employ a number of discrete microphones to capture stereo and surround sound images. Sections 1.2.1 and 1.2.2 outline examples of a stereo and surround sound Multi Microphone Array respectively for initial comparison with the Soundfield microphone which is detailed in Section 1.3.
The ORTF stereo technique, detailed in Figure 1.4, was developed by the Office de Radiodiffusion Télévision Française and uses two cardioid microphones set at a specified distance and angle from each other (Eargle, 2005, pp. 179 – 181). The microphone spacing used in this array is similar to the distance between human ears, which is generally around 18 centimetres (Howard & Angus, 2009, p. 107).
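To illustrate how a spaced, angled pair produces both time and level differences, the sketch below models an ORTF-style pair for a distant source. It assumes the commonly quoted ORTF geometry of 17 cm spacing and a 110° included angle, ideal cardioid capsules, a plane-wave source and a speed of sound of 343 m/s. None of these values are taken from Figure 1.4, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value
SPACING = 0.17          # m between capsules, assumed ORTF geometry
HALF_ANGLE = 55.0       # degrees each capsule is turned outwards (110 in total), assumed


def cardioid_gain(off_axis_deg):
    """Sensitivity of an ideal cardioid at a given angle off its axis."""
    return 0.5 * (1.0 + math.cos(math.radians(off_axis_deg)))


def ortf_differences(source_deg):
    """Return (time difference in seconds, level difference in dB) for a distant source.

    source_deg is measured from the centre line, positive towards the right, and
    is intended for sources in front of the array (roughly -90 to +90 degrees).
    Positive results mean the right microphone is earlier and louder.
    """
    delay = SPACING * math.sin(math.radians(source_deg)) / SPEED_OF_SOUND
    gain_left = cardioid_gain(source_deg + HALF_ANGLE)
    gain_right = cardioid_gain(source_deg - HALF_ANGLE)
    level_db = 20.0 * math.log10(gain_right / gain_left)
    return delay, level_db


delay, level = ortf_differences(30.0)
print(f"30 degrees right: {delay * 1000:.3f} ms, {level:.1f} dB")
```

A source on the centre line produces no difference in either cue, while an off-centre source produces both at once, which is the behaviour that distinguishes this array from a purely coincident pair.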
The Sound Performance Lab array uses five microphones to capture the surround field. The left, centre, right, left surround and right surround signals are routed to the left, centre, right, left surround and right surround speakers of a 5.1 loudspeaker setup respectively (ITU, BS.775-3, 2012, p. 2). The array, shown in Figure 1.5, can be seen as having two distinct sections with the front three microphones being a significant distance ahead of the rear surround microphones. In terms of recording, this is perhaps the single biggest difference which can be observed between Multi Microphone Arrays and the Soundfield microphone, although front to rear distances can vary depending on the Multi Microphone Array chosen.
Alan Blumlein developed the concept now known as the Mid/Side recording technique, whereby a stereo image is recorded using a figure-of-8 microphone in combination with a cardioid microphone (Eargle, 2005, pp. 173 – 174). This technique splits the stereo image into three pieces: a middle component and the left and right side components.
The figure-of-8 microphone is placed in parallel to the width axis of the performers. This means that the front and rear polar pattern lobes of the microphone can feed the left and right signals in the mix when routed through a matrix. The cardioid microphone is placed on axis to the centre point of the ensemble. These signals can then be manipulated with respect to level at the mixing stage to influence the characteristics of the stereo image with the aim of improving it and/or solving issues in the recording which could not have been otherwise addressed (Blumlein, 1931, p. 91).
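The matrix referred to above is conventionally a sum and difference of the two microphone signals. The sketch below is a minimal illustration; the function name and the `side_gain` parameter are hypothetical, and it assumes the positive lobe of the figure-of-8 microphone faces left.

```python
def ms_decode(mid, side, side_gain=1.0):
    """Convert one Mid/Side sample pair to left/right.

    side_gain is a hypothetical mix control: raising it widens the image and
    setting it to zero collapses the output to mono (mid only).
    Assumes the positive lobe of the figure-of-8 microphone faces left.
    """
    left = mid + side_gain * side
    right = mid - side_gain * side
    return left, right


print(ms_decode(1.0, 0.5))                 # normal width
print(ms_decode(1.0, 0.5, side_gain=0.0))  # mono: both channels carry mid only
```

Adjusting the level of the side signal relative to the mid signal is what allows the image width to be changed after the recording has been made.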
The setup and routing of the Mid/Side technique is shown in Figure 1.6. This technique can be implemented using existing technology with no need for specialist equipment and has become a popular method of stereo recording.
The Soundfield concept builds on the Mid/Side technique. The Soundfield system uses four unidirectional microphone capsules, each mounted parallel to one face of a tetrahedron; these are labelled 12A, 12B, 12C and 12D in Figure 1.7. The four signals from the capsules are collectively known as the A-format.
The A-format is then processed to produce the B-format signals using a Soundfield decoder (Soundfield Ltd., 2013). This process uses signals from each capsule and passes them through mathematical and frequency equalisation processes. The output of these processes is called the B-format. Equation 1.1 details the A-format to B-format conversion (Gerzon, 1975).
Where A, B, C and D are the tetrahedral signals of the A-format.
After the mathematical operation outlined in Equation 1.1 is complete, the signals are sent through an equalisation process (Craven & Gerzon, 1977, pp. 4 – 5).
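The sum-and-difference stage that precedes equalisation can be sketched as follows. This is not a reproduction of Equation 1.1: the assignment of the letters A to D to capsule directions (left-front-up, right-front-down, left-back-down, right-back-up) and the 0.5 scaling are assumptions based on the commonly published form of the conversion, and may differ from the labelling in Figure 1.7. The equalisation stage is omitted.

```python
def a_to_b_unequalised(a, b, c, d):
    """Combine four tetrahedral capsule samples into four pre-equalisation signals.

    ASSUMED capsule directions (not confirmed by the text):
      a = left-front-up, b = right-front-down,
      c = left-back-down, d = right-back-up.
    Returns (omni, front_back, left_right, up_down). The 0.5 scaling is a
    common convention and is also an assumption.
    """
    omni = 0.5 * (a + b + c + d)
    front_back = 0.5 * (a + b - c - d)
    left_right = 0.5 * (a - b + c - d)
    up_down = 0.5 * (a - b - c + d)
    return omni, front_back, left_right, up_down


# Equal signal in all capsules leaves only the omnidirectional component.
print(a_to_b_unequalised(1.0, 1.0, 1.0, 1.0))
# Signal in the two front capsules only appears in the omni and front-back outputs.
print(a_to_b_unequalised(1.0, 1.0, 0.0, 0.0))
```

Each directional output is the difference between the pairs of capsules facing opposite ways along one axis, which is the same principle as the side signal of the Mid/Side technique extended to three dimensions.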
The B-format is made up of an omnidirectional component called the W signal and three figure-of-8 components known as the X, Y and Z signals which correspond to front to rear, left to right and height axes respectively, as seen in Figure 1.8. W, X, Y and Z are the post-equalisation signals of E, F, G and H respectively (Gerzon, 1975, pp. 4, 5). With the B-format signals recorded and by using a suitable decoder, a set of polar patterns can then be created to conform to the desired output standard.
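One way a decoder can create a polar pattern from the B-format is to blend the omnidirectional W signal with the figure-of-8 signals to form a virtual microphone. The sketch below is limited to the horizontal plane and makes several assumptions that are not stated in the text: W is recorded 3 dB below the figure-of-8 signals (the traditional B-format convention), azimuth is measured anticlockwise from the front so that positive Y points left, and the function name and `pattern` parameter are hypothetical.

```python
import math


def virtual_microphone(w, x, y, azimuth_deg, pattern=0.5):
    """Derive one virtual first-order microphone sample from horizontal B-format.

    pattern: 1.0 = omnidirectional, 0.5 = cardioid, 0.0 = figure-of-8.
    ASSUMES W carries a 1/sqrt(2) gain relative to X and Y, and that azimuth
    is measured anticlockwise from the front (positive Y to the left).
    """
    az = math.radians(azimuth_deg)
    directional = x * math.cos(az) + y * math.sin(az)
    return pattern * math.sqrt(2.0) * w + (1.0 - pattern) * directional


# B-format for a unit-amplitude source directly in front, under the same assumptions.
w, x, y = 1.0 / math.sqrt(2.0), 1.0, 0.0
print(round(virtual_microphone(w, x, y, 0.0), 3))    # cardioid facing the source
print(round(virtual_microphone(w, x, y, 180.0), 3))  # cardioid facing away
```

Because the pattern and direction are parameters of the decode rather than of the hardware, the same recording can be decoded again with different virtual microphones.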
This section will highlight the main differences which can be observed between Multi Microphone Arrays and Soundfield recording techniques in preparation for the Literature Review in Section 2 which will outline a method of investigating the significance of these differences.
Figure 1.9 shows a set of Soundfield capsules. There is a very small physical distance between them. When a recording array uses small distances between the microphone capsules, the array is referred to as being coincident. As the arrival time of a sound would be close to identical in terms of directional pickup, a coincident array will utilise the interaural intensity differences of a sound source (Eargle, 2005, p. 168).
A spaced array, such as the ORTF array outlined in Section 1.2.1, uses both the interaural time differences and interaural intensity differences for the determination of directional cues (Eargle, 2005, pp. 174 – 175).
Both coincident and spaced stereo techniques have inherent stereo image characteristics. For example, coincident techniques generally exhibit strong localisation and image sharpness where spaced techniques give a softer and less defined image (Eargle, 2005, p. 175).
The most distinct difference between the Soundfield and multi microphone techniques is the spacing between the front and rear sound pickup. The significance of this can be understood by looking at the concept of the critical distance, shown in Figure 1.10. This distance is defined by the point at which the intensity level of a direct source is equal to that of the reverberant field it creates (Pohlmann & Everest, 2001, p. 37).
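As a rough guide to the distances involved, the critical distance can be estimated from room volume and reverberation time. The sketch below assumes a diffuse reverberant field and Sabine's equation, neither of which is stated in the text; the function names, the example room and the directivity factor are illustrative only.

```python
import math


def critical_distance(volume_m3, rt60_s, directivity=1.0):
    """Estimate the critical distance in metres.

    ASSUMES a diffuse field and Sabine's equation for the equivalent absorption
    area; directivity is the source directivity factor (1.0 = omnidirectional).
    """
    absorption = 0.161 * volume_m3 / rt60_s  # equivalent absorption area, m^2
    return math.sqrt(directivity * absorption / (16.0 * math.pi))


def direct_to_reverberant_db(distance_m, critical_distance_m):
    """Direct-to-reverberant level in dB at a given distance from the source.

    Direct level falls by the inverse square law while the reverberant level is
    taken as constant, so the ratio is 0 dB at the critical distance.
    """
    return 20.0 * math.log10(critical_distance_m / distance_m)


dc = critical_distance(1000.0, 2.0)  # hypothetical 1000 m^3 hall, RT60 of 2 s
print(f"critical distance: {dc:.2f} m")
print(f"at twice that distance: {direct_to_reverberant_db(2 * dc, dc):.1f} dB")
```

Under these assumptions, a microphone placed beyond the critical distance receives more reverberant than direct energy, which is why the front-to-rear spacing of an array has such an influence on the ambience captured by the rear microphones.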
The distance of the front microphones from the musical sources affects both the width of the front image and the balance of direct to ambient sound, as seen in Figure 1.11. If the placement is too close, similar to the blue array in Figure 1.11, an unnaturally wide stereo image with too much direct sound can result (Eargle, 2005, p. 245). If the placement is too far away, similar to the red array, the image can take on a squashed or narrow character while admitting a large proportion of ambient sound.
The ideal placement will place musical sources across the image in a balanced and pleasing way (Moylan, 2007, p. 294). For example, the orange microphone placement in Figure 1.11 may provide the most pleasing image with a trio of musicians; however, if there are more than three or four sound sources, the image may become too full, resulting in poor clarity and localisation, which would require a closer placement and the possible use of ambient accent microphones. These are important considerations when placing a stereo array, or the front section of a surround sound array, of any type.
The purpose of the rear speakers of a surround sound system is to convey a sense of ambience rather than to reproduce the musical instruments, which is the role of the front speakers (Howard & Angus, 2009, pp. 367 – 368). With the front array placement chosen as described earlier in this section, the rear microphones in a multi microphone context will invariably be placed a distance away from the front microphones and further from the sound sources to allow for the pickup of sufficient reverberation from the space for use in the mix (Eargle, 2005, p. 245; Wuttke, 2005, p. 6). Because the Soundfield system captures its front and rear signals from a single point, it cannot provide this separation, so a consequence of using it may be a problematic quality of ambient sound in the rear channels of a production.
By spacing the left and right microphones of a spaced stereo recording array too far apart, the correlation between the signals they record will drop. In other words, the proportion of direct sound which they both pick up will drop to such a point that a distinct gap between sound sources can be perceived between the left and right loudspeakers (Eargle, 2005, p. 176).
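The correlation between two recorded channels can be quantified with a normalised correlation coefficient, where 1 indicates identical signals, 0 indicates unrelated signals and −1 indicates identical signals of opposite polarity. The sketch below uses the Pearson coefficient over short lists of samples; the function name and test signals are illustrative, and this is not presented as the measure used in the cited sources.

```python
import math


def correlation(left, right):
    """Pearson correlation coefficient between two equal-length sample sequences."""
    if len(left) != len(right) or len(left) < 2:
        raise ValueError("need two equal-length sequences of at least two samples")
    mean_l = sum(left) / len(left)
    mean_r = sum(right) / len(right)
    cov = sum((l - mean_l) * (r - mean_r) for l, r in zip(left, right))
    var_l = sum((l - mean_l) ** 2 for l in left)
    var_r = sum((r - mean_r) ** 2 for r in right)
    if var_l == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return cov / math.sqrt(var_l * var_r)


shared = [1.0, -1.0, 1.0, -1.0]      # signal picked up by both microphones
unrelated = [1.0, 1.0, -1.0, -1.0]   # signal with nothing in common with it
print(correlation(shared, shared))
print(correlation(shared, unrelated))
```

In this picture, widening the spacing of a pair reduces the proportion of shared signal in each channel and so moves the coefficient from 1 towards 0.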
In the front image, the drop in correlation has a similar effect on the image width as placing the microphones too close to the sound sources and is not desirable, as there would be no cohesive or balanced front image. If the rear image is highly correlated with the front image in classical music production, the listener may perceive musical sources which were in front of the recording array in the rear of the playback image, which is also not desirable (Howard & Angus, 2009, p. 368; Eargle, 2005, p. 245). The fully coincident nature of the Soundfield system may result in problems with front-to-rear correlation upon playback.
A relative advantage of the Soundfield system over a Multi Microphone Array technique is that only a single stand and set of cabling are required for a full surround recording. This reduces setup time as there is no need to carefully measure the distance and angle relationships which would otherwise be required for a Multi Microphone Array technique. It can also make the recording array less obtrusive to audience members if the recording is taking place live, and easier to reposition if the array is disturbed.
Audio production for television, cinema and music relies on the principle of hyper real sound (Holman, 2010, p. xviii; Fazenda, 2012). Under this concept, sounds are created or manipulated with respect to listener enjoyment rather than realism (Moylan, 2007, p. 263). Examples of this concept can be seen in music production, where the recording of instruments in an acoustically dry room is supplemented with artificial reverb during the mixing stage.
Similarly, the recording of instruments in a reverberant space can be supplemented by using dedicated room reverberation microphone signals in the mix, which are adjusted in level until suitable (Eargle, 2005, pp. 194 – 195). Neither of these production methods stays true to the perception of a listener sat in the performance space; however, these methods aim to improve listener enjoyment.
The operational principles of Multi Microphone Arrays and the Soundfield system have been outlined. The differences between these methods have been highlighted with respect to their influence on recorded material. With these aspects considered, it is the aim of this study to compare a Multi Microphone Array with the Soundfield recording system in the recording of classical music in an effort to highlight if and how the differences translate into the preference of listeners. The results of this can then be used by engineers when considering their options of recording arrays for similar situations.