3 – Audio Production

3.1           Recording

In association with the Music Department and The Sonic Arts Research Centre of the Queen’s University Belfast, the Ulster String Quartet was recorded in the Harty Room of the Music Department on October the 10th, 2013 (The Ulster String Quartet, 2014; Queen’s University Belfast, 2014; The Sonic Arts Research Centre, 2014).

During the three-hour recording session, a number of classical pieces were recorded using two arrays: the Fukada Tree Multi Microphone Array, which was adapted for the session, and the Soundfield array microphone. With the microphone arrays set up in the Harty Room, the array signals were routed to the Harrison Studio, which is equipped with Pro Tools HD. An image of the studio is shown in Figure 3.1. Monitoring was provided by five Genelec 1032A monitors (Genelec, 2014). Recordings were made at a sample rate of 48 kHz and a bit depth of 24 bits. These settings were retained throughout the testing process.


Figure 3.1 – The Queen’s University Harrison Studio

3.2           Harty Room Specification

The Harty Room is part of the School of Music at Queen’s University Belfast. It is primarily used for classes, concerts and recording sessions by students of the Sonic Arts Research Centre. The room has an estimated volume of 1150 m³ and a reverberation time of 1.4 seconds (Kuster, 2008, p. 983). As the time available in the Harty Room was limited, the critical distance of the room could not be measured. It was instead approximated as 1.64 metres, calculated using the formula detailed in Equation 3.1 (Sengspiel, 2014).


$$d_c \approx 0.057 \sqrt{\frac{V}{RT_{60}}}$$

where $V$ is the room volume in m³

and $RT_{60}$ is the reverberation time of the room in seconds

Equation 3.1 – Sabine Critical Distance Approximation
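The calculation behind Equation 3.1 can be sketched in a few lines of Python. This is a minimal sketch, assuming the commonly quoted form of the approximation, in which the critical distance is 0.057 times the square root of the room volume divided by the reverberation time. The coefficient value and the function name are assumptions rather than details given in this section. With that coefficient the result is roughly 1.63 m, within about a centimetre of the 1.64 m figure quoted above; the small difference may reflect rounding of the coefficient.

```python
import math


def critical_distance(volume_m3, rt60_s, coefficient=0.057):
    """Approximate critical distance in metres.

    The coefficient of 0.057 is an assumed value for the Sabine-based
    approximation; it is not stated explicitly in the text.
    """
    return coefficient * math.sqrt(volume_m3 / rt60_s)


# Harty Room figures quoted in Section 3.2.
harty_dc = critical_distance(volume_m3=1150.0, rt60_s=1.4)
print(f"Approximate critical distance: {harty_dc:.2f} m")
```

Because the approximation depends only on the ratio of volume to reverberation time, doubling both leaves the estimate unchanged.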

The room is in the shape of a cross, which can be seen in Figure 3.2. With the stage at one end of the room (image left), the rear microphones of a surround array have been known to pick up early reflections from the alcoves at the centre of the cross. A more sonically pleasing reverberation sound can be captured by moving the rear section of a surround array slightly further back. This issue was observed during the quartet setup and the required adjustments were made for this project.


Figure 3.2 – Aerial shot of the Harty Room (Google, 2014)

3.3           Choice of Recording Arrays

Based on its performance in past research and considering the performance environment, a variant of the Fukada Tree was selected for the Multi Microphone Array recording. The Fukada Tree replaces the Neumann M50 microphones outlined in the original Decca Tree design with cardioid microphones. The M50, shown in Figure 3.3, is an omni-directional microphone; however, it becomes increasingly directional at higher frequencies (Eargle, 2005, p. 183). The use of a cardioid-based array allows more directionality and control of ambience during the production.


Figure 3.3 – Neumann M50 microphone internals (Recording Hacks, 2014)

With reference to Figure 3.4, the left and right microphones of the Fukada Tree are angled slightly away from the stage to reach a recording angle of ~108°. Fukada also states that additional microphones to the left and right of the array, separate from the L, C and R microphones, should be used to improve low frequency response and to better capture a large ensemble’s width (Fukada, 2001).


Figure 3.4 – Basic Fukada Tree Layout

Although the array could have been set up exactly to the guidelines, any array should be adjusted to suit the recording environment and situation to ensure the recorded sound is at its best (Moylan, 2007, p. 294). A consideration when working in the Harty Room is that the building’s central heating system causes issues with the very low frequency elements of recordings. In addition, a string quartet does not require the low frequency reinforcement that the additional left and right microphones provide. These microphones were therefore omitted from the recording array.

The recording angle was also adjusted to better suit the width of the stage and the musicians. The exact angle was determined after the quartet had set up on stage and is detailed in Section 3.4.

3.4           Microphone Setup Details


Figure 3.5 – Microphone Setup

With reference to Figure 3.5, L, C, R, LS and RS are the left, centre, right, left surround and right surround microphones of the Multi Microphone Array respectively. S is the Soundfield microphone. The front to rear distance of 220 cm is not to scale in this figure.

Microphone C of the Multi Microphone Array was placed 175 cm from the front of the performers. Microphone S was placed 225 cm from the front of the performers, which is 50 cm behind Microphone C. These placement decisions were made after listening to the quartet warm up, during which a balanced sound was observed with respect to direct sound and ambience. Microphone C is therefore 11 cm beyond the critical distance of the space, which was calculated in Section 3.2.

Microphones L, C and R were 190 cm high, taking into account the stage height. Microphone S was 210 cm high to allow it to clear Microphone C. Microphones LS and RS were both 200 cm high.

The ensemble was 213 cm wide and 152 cm deep. The centre point of the ensemble’s width was placed on the centre line of the Harty Room. This line then denoted the position of the centre and Soundfield microphones. The Soundfield microphone was placed between the left and right Multi Microphone Array microphones and directly behind the centre microphone.

3.5           Editing and Processing

The editing and processing of the recordings were completed using Steinberg’s Cubase 6.5 in studios A and D of the University of Salford Newton building. Cubase was installed on a Dell E6530 and interfaced with the studios’ monitoring systems via a Focusrite Saffire Pro 24 audio interface. Studios A and D feature Genelec and Blue Sky surround sound monitoring systems respectively.

3.5.1       B-Format Processing

The Soundfield B-format decoders were not available for this project. Instead, the Harpex-B VST plugin was used to decode the B-format signal to a 5.1 surround sound signal (Harpex, 2014). This plugin accepts the W, X, Y and Z signals of the B-format signal and processes them to provide outputs such as binaural, stereo, surround and ambisonic. Listening tests show that the Harpex method provides high quality surround sound decoding of B-format signals (Berge & Barrett, 2010b).

After the recording session, the options provided in the Harpex-B plugin were used to create a surround sound signal conforming to the ITU-R BS.775-3 speaker angles of a 5.1 loudspeaker setup (ITU, BS.775-3, 2012, p. 4). This configuration was used as it is the default output supplied to an engineer during a production.

A selling point of the Soundfield array is the ability to manipulate the recorded signals after the recording has taken place, so a second derivation was created. By adjusting the rear angles of the Harpex-B plugin to 150°, a more pleasing reverberation signal was obtained.

For use in this project, the 5.1 surround function was used. Harpex-B uses an omni-directional signal as the default source for the LFE (Harpex, 2011, p. 9). For the same reasons that the low frequency reinforcement microphones of the Fukada Tree were omitted from the Multi Microphone Array, the LFE signal was not used.

The plugin automatically applies phase shifts to simulate a microphone spacing of 17 cm, similar to the approximate distance between the human ears. This default setting was retained for use in the tests (Harpex, 2011, p. 9).

Basis of Operation

Harpex-B uses a technique called parametric decoding to process the Soundfield B-format signals. This technique uses a filter bank to split the incoming signal into frequency bins. Analysis of the contents of each filter bin can be used to determine directional information. Harpex-B also uses signal phase relationships for this purpose. Without this, sounds which occupy the same set of frequencies in a given filter bin would be determined as coming from the same direction as the louder signal (Harpex, 2014; Berge & Barrett, 2010a; Berge & Barrett, 2010b).
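For contrast with the parametric approach, a plain linear decode of B-format can be sketched as follows. This is explicitly not the Harpex-B algorithm: it simply points one virtual cardioid microphone at each loudspeaker direction, with no filter bank or phase analysis. The sketch assumes traditional B-format scaling (W attenuated by 1/√2), ignores the Z signal because the layout is horizontal, and assumes nominal surround angles of ±110° for the ITU layout and ±150° for the adjusted layout. All function and variable names are hypothetical.

```python
import math

# Loudspeaker azimuths in degrees, counter-clockwise positive (left = +).
# The 110 degree surround angle is an assumed nominal value.
ITU_LAYOUT = {"L": 30, "C": 0, "R": -30, "LS": 110, "RS": -110}
ADJUSTED_LAYOUT = {**ITU_LAYOUT, "LS": 150, "RS": -150}


def encode_b_format(sample, azimuth_deg):
    """Encode a mono sample as horizontal B-format (W, X, Y)."""
    a = math.radians(azimuth_deg)
    return sample / math.sqrt(2), sample * math.cos(a), sample * math.sin(a)


def virtual_cardioid(w, x, y, azimuth_deg):
    """Output of a virtual cardioid microphone aimed at azimuth_deg."""
    a = math.radians(azimuth_deg)
    return 0.5 * (math.sqrt(2) * w + x * math.cos(a) + y * math.sin(a))


def decode(w, x, y, layout):
    """Linear decode: one virtual cardioid per loudspeaker direction."""
    return {name: virtual_cardioid(w, x, y, az) for name, az in layout.items()}


# A test source placed at the left loudspeaker position.
w, x, y = encode_b_format(1.0, 30)
itu_feeds = decode(w, x, y, ITU_LAYOUT)
adjusted_feeds = decode(w, x, y, ADJUSTED_LAYOUT)
for name in ITU_LAYOUT:
    print(f"{name}: ITU {itu_feeds[name]:.3f}, adjusted {adjusted_feeds[name]:.3f}")
```

In this linear model a source spreads across all loudspeakers with a broad cardioid weighting, which is the limited directional resolution that parametric methods aim to improve on.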

3.5.2       Recording Extract Processing

Pro Tools HD was used for the recording as it was the digital audio workstation installed in the Harrison Studio. These recordings were then transferred to Cubase 6.5. The processing included:

  • Examining recorded material,
  • Selection and editing of suitable extracts,
  • Setting markers for exporting of the extracts,
  • Exporting of each extract with respect to desired output format,
  • Importing of the sections into a finalisation project,
  • Fades applied to beginning and end of extracts,
  • Amplitude examination and levelling,
  • Final export.

3.5.3       Amplitude Levelling

All recording extracts were individually run through the Steinberg SLM 128 plugin (Tischmeyer, 2012), which provides metering that meets the EBU R128 loudness recommendations (European Broadcasting Union, 2011). The key figure assessed was the integrated LUFS value, which measures loudness from the beginning to the end of the programme material. By playing each individual recording extract in its entirety, its overall loudness could be objectively measured. After resetting the meter, the process was repeated for each remaining extract so that the extracts could be matched.

Adjustments were then made to pre-export fader values with reference to the SLM 128 to ensure that the integrated LUFS readings within each set of recordings were the same within a tolerance of ±1 LUFS. This means that amplitude-related preference on the part of the test participants would not be a factor, as the playback material would be the same loudness within each set (Katz, 2007, p. 168). Table 3.1 shows the LUFS readings for the arrays in each set of recordings.

| Array | Set 1 | Set 2 | Set 3 | Set 4 |
|---|---|---|---|---|
| Multi Microphone Array | -27.8 | -18.2 | -18.1 | -18.9 |
| Soundfield ITU | -27.7 | -18.0 | -18.1 | -18.8 |
| Soundfield Adjusted | -27.7 | -18.3 | -18.1 | -18.8 |

Table 3.1 – LUFS figures for arrays in each musical piece or set.
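The matching check on the figures in Table 3.1 can be sketched as below. This is a minimal sketch that interprets the ±1 LUFS tolerance as a maximum spread of 1 loudness unit between the highest and lowest reading in a set; that interpretation, the function names and the idea of aligning to the quietest extract are assumptions for illustration, not the procedure used in Cubase.

```python
# Integrated loudness readings from Table 3.1 (LUFS).
READINGS = {
    "Set 1": {"Multi Microphone Array": -27.8, "Soundfield ITU": -27.7, "Soundfield Adjusted": -27.7},
    "Set 2": {"Multi Microphone Array": -18.2, "Soundfield ITU": -18.0, "Soundfield Adjusted": -18.3},
    "Set 3": {"Multi Microphone Array": -18.1, "Soundfield ITU": -18.1, "Soundfield Adjusted": -18.1},
    "Set 4": {"Multi Microphone Array": -18.9, "Soundfield ITU": -18.8, "Soundfield Adjusted": -18.8},
}
TOLERANCE_LU = 1.0  # assumed reading of the stated tolerance


def spread(set_readings):
    """Difference between the loudest and quietest extract in a set."""
    values = list(set_readings.values())
    return max(values) - min(values)


def fader_offsets(set_readings):
    """Hypothetical gain change (dB) to bring each extract to the quietest one."""
    target = min(set_readings.values())
    return {name: round(target - value, 1) for name, value in set_readings.items()}


for set_name, set_readings in READINGS.items():
    matched = spread(set_readings) <= TOLERANCE_LU
    print(f"{set_name}: spread {spread(set_readings):.1f} LU, matched: {matched}")
```

A change of 1 dB in fader gain shifts the integrated loudness reading by 1 LU, which is why the offsets can be read directly from the differences between readings.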

3.5.4       Extract Length

The ITU-R BS.1116-1 standard states that test material should be between ten and twenty-five seconds in length (ITU, BS.1116-1, 1997, p. 7). Because a professional ensemble was used, examination of the recorded material confirmed that composite takes were not required, which eliminated the risk of noticeable edit points during testing. All test recording extracts were twenty seconds in length.
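The extract length and the fades listed in Section 3.5.2 can be related to sample counts with a short sketch. The 48 kHz sample rate and twenty-second length come from the text. The 10 ms fade length and the linear fade shape are hypothetical, as the fade settings used in Cubase are not stated.

```python
SAMPLE_RATE_HZ = 48_000   # recording sample rate from Section 3.1
EXTRACT_SECONDS = 20      # extract length stated above
FADE_SECONDS = 0.01       # hypothetical fade length, not given in the text


def within_recommended_length(seconds, shortest=10, longest=25):
    """Check a duration against the ten to twenty-five second range."""
    return shortest <= seconds <= longest


def sample_count(seconds, rate_hz=SAMPLE_RATE_HZ):
    """Number of samples per channel for a given duration."""
    return int(round(seconds * rate_hz))


def linear_fade_gain(index, total_samples, fade_samples):
    """Gain between 0 and 1 for a linear fade-in and fade-out."""
    fade_in = index / fade_samples
    fade_out = (total_samples - 1 - index) / fade_samples
    return max(0.0, min(1.0, fade_in, fade_out))


total = sample_count(EXTRACT_SECONDS)
fade = sample_count(FADE_SECONDS)
print(f"{total} samples per channel, {fade} samples per fade")
print(f"Length within range: {within_recommended_length(EXTRACT_SECONDS)}")
```

Each channel of a twenty-second extract at 48 kHz therefore holds 960,000 samples, and the gain function returns 0 at the first and last sample and 1 throughout the body of the extract.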

3.6          Summary

A classical string quartet was recorded simultaneously with a Fukada Tree derivative and the Soundfield MKV system. Recordings were edited to produce the listening test stimuli which were taken from two musical pieces of differing style and performance characteristics. In addition to the Fukada Tree test material, two derivations of the Soundfield system were produced. The first uses the default settings derived from the ITU-R BS.775 surround sound standard while the second was adjusted to produce the most sonically pleasing rear image.
