Recently, I decided to undertake a small project to turn a spare room into a mixing space. Acoustically, it is definitely a problem space; it is almost the classic worst-case scenario of a box room. I decided to use what was at hand to help treat the room, so duvets covered with sheets (for aesthetics) were used in the corner where, for various reasons, the office desk would have to be placed.
Dare I say it, everything looks to be very close to finished in terms of the testing interface and, crucially, the method of collecting and sifting through the results! This post is going to take you through the testing procedure and the processing of results. Hopefully everything will make sense; if not, then thankfully I can catch it before the actual testing starts!
The first thing participants will see, after the section for inputting details such as age, will be a practice section. Here users can familiarise themselves with the interface. A totally separate surround sound recording will be playable from this section. Once they are comfortable, they can move on to the test proper.
The picture above is what participants will see at the end of the test. There are 12 sections, so there is no need to post them all. The important thing is the test reference number, which is the main point of this blog post. Previously, I talked about my “Randomisatron”. This messy, complex and head-melting monster of a subpatch allows me to shuffle the standard order of the test. The randomisation allows my test to meet the ITU BS1116 standard for the playback of test material, where test samples must be presented in a random order for each test participant.
Since I know the default playback order for each section (A and B, C and D, E and F, etc.), I can shuffle the playback order with a code using the Randomisatron (this name may stick, but maybe not for the write-up). With the participant's answers filled out, I should be able to (un)shuffle the answers back into the original order with reference to that randomising code. This calls for a trip to Excel!
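For anyone following along outside of Max, the idea behind the Randomisatron can be sketched in a few lines of Python. This is my own illustration, not anything lifted from the patch, and the function name is made up:

```python
import random

def make_randomising_code(n_sections=12, seed=None):
    """One shuffled queue slot per section: code[i] is the (1-based)
    position in the playback queue at which section i+1 is played.
    A fixed seed reproduces the same code, which is handy for checking."""
    code = list(range(1, n_sections + 1))
    random.Random(seed).shuffle(code)
    return code
```

Each participant gets a fresh code, and that code is exactly the reference number that travels with their answer sheet into Excel.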
Processing the Results
My friends have always slagged me off because of how much I like using Excel. That is not to say that I spend my evenings pondering Excel magic. The reason I like Excel is that at the end of this test I am going to have 5 answers for each of the 12 sections from at least 20 participants, and that is not something I want to process on paper. Excel appeals to my lazy side, and if I can get rid of a lot of legwork with one swift click of the mouse, I will gladly take it!
Bear in mind that the Excel file is made so that I understand it, whereas the test interface has to be much sleeker. Hopefully I can make it make sense for you! The two columns on the left are the default order around which the whole test interface has been built. This info also pops up on the far right as a reference to make things a bit easier. What I have highlighted in yellow is the randomising part.
When I get a set of results from a test participant, I will fill in all the answers for questions 1 to 5 all the way down the 12 sections. These answers can only be A or B (1 or 2); the numbers I have in at the moment were just from when I was testing the thing out. With the answers filled in, I will then take the test reference number and type it into the left-hand yellow column. The right-hand yellow column is a copy of the default, 1 through to 12.
Here is that first Excel image again. Imagine there are 12 CDs with the recordings on them. Take a look at the first CD which is “SF1, T1”. The randomising code is telling the Max patch to play CD1 2nd in the queue. It is telling the second CD to play 12th in the queue and the third CD to play 1st.
The first step is to sort the New Question Order column by the Rand. column. This turns the Rand. column into 1 to 12 while also moving the New Question Order numbers with it. What this does is match up the numbers in the New Question Order with the answers I filled in at the start. Remember earlier I said that the third CD was being told to play 1st? Take a look at the first number in the orange column. Remember that the first CD is to be played 2nd in the queue? You can see that from the second spot in the orange column too. Finally, you can also see that the 2nd CD was played in the 12th position. Notice how the CD1 and CD2 answers are A, B, C and D in the image. We know that these belong to the first two sets, recordings 1 to 4, so they ideally should be at the very top, lined up with their “SF1, T1” counterparts at the far left of the spreadsheet.
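The same un-shuffling can be sketched in Python. This is a hypothetical helper of my own, not part of the Excel file, using the convention from the CD example above: the code lists the queue position of each original section.

```python
def restore_original_order(answers, code):
    """answers[k] is the answer sheet entry for the (k+1)-th clip played,
    i.e. answers are in presentation order. code[i] is the queue position
    at which original section i+1 was played (CD1 played 2nd -> code[0] == 2).
    Returns the answers re-ordered back into the default section order."""
    return [answers[pos - 1] for pos in code]
```

So with a code beginning [2, 12, 1, …], the answer for original section 1 is pulled from the 2nd slot of the answer sheet, section 2 from the 12th slot, and section 3 from the 1st, which is exactly what the Excel sort achieves.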
I do these steps for each participant, and when I have all 20+ finished I can sort all the Question/Answer columns by the Original Order column. Now all the answers for each set of recordings are grouped together, ready for a whole other round of work to get statistics and results processed, something I had best read up on soon.
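Once everything is un-shuffled, the grouping step amounts to counting choices per section. Here is a rough Python sketch of where I might start (my own invention; the statistics proper come later):

```python
from collections import Counter

def tally_per_section(all_restored):
    """all_restored: one answer list per participant, each already put
    back into the original section order. Returns, for each section,
    a Counter of how many participants chose each option (A vs B)."""
    n_sections = len(all_restored[0])
    counts = [Counter() for _ in range(n_sections)]
    for participant in all_restored:
        for section, answer in enumerate(participant):
            counts[section][answer] += 1
    return counts
```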
Thanks again for reading! It looks like I have made a small trilogy of posts about how to administer a subjective test: playing sounds to meet the ITU standard, collecting the answers to the questions, and then processing them for ease of use later on. Take a look at the top of this post for links to the other posts. I hope someone finds it helpful!
With the project moving at a nice pace I wanted to share some information with you about the test design.
In the subjective testing stage of my project, I am asking listening test participants to compare two recording extracts and answer some questions on them. I am testing three arrays: a traditional array and two variants of a Soundfield recording. All recordings were made simultaneously. The ITU BS1116 standard for “the subjective assessment of small impairments in audio systems including multichannel systems” is the most suitable fit for my project. There are certain key requirements set out by the standard, which include:
– Test participants must be able to switch between recordings as they wish, with no loss of place or jump in the sound (e.g. skipping a few sections)
– To avoid bias, test material should be randomised (see post 2)
– The transition between each sound extract must be 80ms in length.
– other things (too much to list!)
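To give a flavour of the switching and 80 ms transition requirements, here is a minimal numpy sketch. This is my own simplification and has nothing to do with the Max patch internals: both recordings run in parallel, and a short linear crossfade swaps between them without losing place.

```python
import numpy as np

def switch_with_crossfade(current, other, switch_sample, sr=44100, fade_ms=80):
    """current and other are time-aligned mono signals of equal length.
    Swap playback from current to other at switch_sample, using an
    80 ms linear crossfade so there is no click or jump in position."""
    n_fade = int(sr * fade_ms / 1000)
    ramp = np.linspace(0.0, 1.0, n_fade)
    out = current.copy()
    end = switch_sample + n_fade
    # Blend the two signals across the fade region...
    out[switch_sample:end] = (current[switch_sample:end] * (1 - ramp)
                              + other[switch_sample:end] * ramp)
    # ...then continue with the other recording from the same position.
    out[end:] = other[end:]
    return out
```

The key point is that both files keep playing the whole time; only which one is audible changes, so the listener never skips a section.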
My problem was that I did not know of any way to do this without being in the test room with the participant, and even then human error would be inevitable, with the timing of each switch being different, plus a whole host of other problems to say the least! The bottom line is that this could be off-putting to participants. Added to that, I needed a way to collect information and answers from the participants, so having one thing for that and another to play the sound tracks and switch between them would probably be messy. I needed something, so instead of blindly researching a way to do it, I made my own!
Super Max! (Supermacs if you’re Irish)
Max/MSP was the program that I used to solve my problem (now known as Max 6, despite my constant use of “Max/MSP” in these blogs). It is a program I was introduced to back at Queen's University. Coding and programming are not my thing, but I always found this program fun and logical. When I encountered Reaktor at Salford Uni I was less than impressed after coming from Max/MSP, so when I had the idea of using object-based programming to sort my problem out, Max was my first port of call. I have a feeling Reaktor could have done it, but not as well as I knew Max could; I have used both programs to complete the same objective in the past, so that is what I am basing those remarks on.
What I have designed here is very likely an inefficient way of doing things, so if any Max-savvy people are reading then don't say I didn't warn you! While I get more research done into what is required in subjective testing for audio, the layout and things like that will probably change, but under the hood I have the exact program I need to meet the fundamental aims of BS1116.
I wanted to play two surround sound files and allow the user to switch between them instantly without losing place, like I mentioned earlier. I also wanted it to record the answers to the test questions. By combining all these things into one interface, I can save each participant's answers under a unique file name and load up a new blank patch for the next person.
The GUI features of Max are fantastic. Maybe my colour scheme needs looking at (probably the spelling too, but it's a late-night draft!). The patch lets me play two surround sound files at the same time, select between them in real time and select my answer.
This is the magic behind it. There are three main sections. The upper right quarter of the image is the playback system, which is routed into the audio outputs in the lower middle. The upper left quarter to middle is where users select which recording they are listening to, along with the display that tells each participant this for their own reference. The bottom right quarter is where the participants' answers are filled in and checked to make sure that neither two options nor no option has been selected by accident.
What the participant will see is what was in the previous pictures; this under-the-hood view is how it is all tied together. Max is an object-based programming language. Each visual object you see has some traditional code behind it, but the user does not need to worry about it. It gives the less code-savvy, like myself, the ability to get some seriously powerful custom-made projects done without worrying about syntax errors and languages. In fact, the only bit of traditional code that I have come across that you could need is if statements. That said, the help files are brilliantly done, so help is always at hand.
Take the play and stop buttons for example. The button has some code under the hood that says that when pressed, it sends a signal, or a “bang” as it is called in Max. The output of that play button eventually goes into the object called “sfplay” by following the blue path. Sfplay plays sound files, and there are two of them because I have two surround recordings that need to play at the exact same time. Note that the outputs (the bottom of the boxes) have 6 green/grey lines. Each of those corresponds to an audio channel on my soundcard; since 5.1 needs 6 channels, I asked for an sfplay with 6 outputs.
Once sfplay gets the message “1” it will start playing, and when it gets a “0” it stops! You can see how the blue line from play goes through a grey box with “1” in it. So, what happens here is that the bang comes out of the button, bangs into a 1 message, which is then sent into the audio player, which starts playing the music! All this took was creating a few objects and tying them together. If I were to do this in a traditional coding language (I very much couldn't), there would be a lot of code to mess around with. Some people love that, but I am not one of them, so I love this method.
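For the code-minded, that bang → message → player flow might look a bit like this in Python. A toy analogue I made up to show the idea, not how Max works internally:

```python
class SFPlay:
    """Toy stand-in for Max's sfplay: the message 1 means play, 0 means stop."""
    def __init__(self):
        self.playing = False

    def message(self, value):
        self.playing = (value == 1)

# Two players, one per surround recording, driven by the same buttons.
players = [SFPlay(), SFPlay()]

def bang(value):
    # The play (or stop) button sends the same message to both players,
    # so the two recordings always start and stop in sync.
    for p in players:
        p.message(value)
```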
Anyway, I just wanted to share this short post inspired by my newly rekindled love affair with Max/MSP. A severe amount of fun was had designing this; thanks to my friend Michael McLoughlin for some assistance. What you saw here was very much a draft. I am definitely not a master of the program and I am sure there is a better way to do what I want done, but it does the job, which is the main thing. If all goes well and it is accepted by my project supervisor, I will release the source code for anyone else doing surround sound subjective work. Hopefully someone reading this may realise that Max/MSP could help them in some way, so I hope this post gets you into it!
Keep on rocking!!
Hello, long time no blog!
Today, I want to talk in brief about my final project for my masters, which is designed to be a piece of research in the general area of audio production. This is an informal post, meant to gloss over certain points in an effort to keep things short, to the point and as accessible as possible. Once the project is finished, I will be able to make new posts about each part of the project in more detail! If you have any questions, just get in touch through here. The final dissertation can be read here.
Recently, I recorded a choir for surround sound playback. What this means is that when listening back to the recording in a properly equipped room, you will feel as though you are inside the room where the music was performed. The sound will envelop you from all around, simulating what it would have been like to be in the concert hall, which is a very cool experience. In surround sound you have five speakers: three at the front for the left, centre and right, with two at the rear. Check out the image below from Sound On Sound Magazine showing the placement of these speakers.
Many ways of recording surround sound have been developed over the years. The most common are what I call traditional arrays, a catch-all term for multi-microphone recording arrays. If you take note of the image above, there would be one microphone for each speaker. In general, the left, centre and right point at the respective areas of the stage, and the rears point into the rear corners of the space. What the microphones “listen” to then gets recorded and played through their respective speakers, as outlined in the picture.
The collection of these microphones is called an array, and it can usually be subdivided into the front array and the rear array, as there can be a large spacing between the front and rear. If you are at a concert and stand at the very front, you will get a great, clean sound; if you stand at the back, you get a more reverberant sound. An aim of these arrays is to capture both as well as possible for use in the production. A pleasing recording for the listener can be achieved by placing the front array fairly close to the performers to get a clear sound, and then placing the rear array back into something known as the reverberant field.
Here is a photo of a traditional array called INA5 from www.sanken-mic.com. You can see the distances and angles involved, especially between the front and rear.
For the recording engineer, traditional arrays have around five microphones, which means a lot of cables (sometimes quite long), a lot of stands (usually heavy and/or wobbly), a lot of measurements and angles which can be cumbersome to get correct, and then possible headaches to worry about once everything is set up, for example someone walking into or moving the stands.
There is a relatively new microphone called the Soundfield microphone. Roughly speaking, this is 4 microphones in one. It can be placed in a recording environment just like the front section of a traditional array. To keep things simple, the Soundfield microphone's pickup can be seen as mainly based on the figure-of-8 microphone pattern. A microphone's capsule is shaped like a big coin and listens in certain directions around it. Some microphones listen to just what is in front of and on either side of them; others listen all around. The figure-of-8 listens to the front and back while ignoring the sides. Imagine two tennis balls placed on either side of one of those big chocolate coins; this gives an impression of the directions the microphone is picking up from. Here is a photo of a microphone capsule from recordinghacks.com
Here is a diagram of a figure-of-8 microphone from the Recording Review Forum; think of those tennis balls.
Basically, these types of microphones listen in that shape in front of and behind the microphone while ignoring what is going on at the sides (including the top and bottom). For the sake of example, imagine there are two of you, and your duplicate says the exact same thing in the exact same way as you. If the two of you are placed on either side of the microphone and you both speak, anyone listening to the microphone's output will hear nothing. That is because one side of the microphone listens in a positive way and the other in a negative way, which doesn't mean one side is happy and the other angry; what it means is that everything can be boiled down to numbers.
What this means is that what one of you says gets turned into the number 1 and what the other says gets turned into -1. When the microphone combines these, you get 0, or nothing being heard. With that admittedly odd example out of the way, the Soundfield works in a similar enough way. Depending on how you add and subtract the signals that the microphone creates, you can hear what is happening in any direction. Think of it as a 360-degree security camera. If you are watching something on the left and then want to look to the right, you use a control to point the camera in that direction. The Soundfield is similar, but for sound, and instead of the camera moving around, the mathematics is changed to adjust the direction the microphone is listening in.
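That cancellation is easy to show with actual numbers. In this little Python sketch, the cos() weighting is the textbook figure-of-8 pattern, nothing Soundfield-specific:

```python
import math

def fig8_gain(angle_deg):
    """A figure-of-8 capsule weights an arrival by cos(angle):
    +1 from the front, -1 from the rear, 0 from the sides."""
    return math.cos(math.radians(angle_deg))

phrase = 1.0                      # the identical thing both of you say
front = phrase * fig8_gain(0)     # heard as +1.0
back = phrase * fig8_gain(180)    # heard as -1.0
heard = front + back              # 0.0: the two copies cancel exactly
```

And a speaker standing at 90 degrees, directly to the side, would be weighted by cos(90°) = 0, which is the "ignoring the sides" part.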
Why mention all this?
Well, it is easier to set up than a traditional array: one stand, one main cable and fewer headaches. More importantly, after the recording is done and you are mixing, you can change where the microphone is pointing as you wish. This is because you are combining what the microphone recorded, rather than having to fix its direction on the day of the recording itself, as you would with a traditional array. On top of that, you can derive five different directions at the same time, which are the signals you need for the 5 speakers in a surround sound setup. With modern technology and the ease of having a powerful computer for audio production, this can be automated, which means no more messing with angles and protractors at 6 feet in the air! With a traditional array, things need to be set up exactly, and if you make a mistake or something happens to the array which you didn't know about, you cannot fix the problem in the mix!
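That steer-in-the-mix trick can be sketched with the standard first-order B-format formula. This is my own illustration, not the Soundfield processor itself: W, X and Y are the horizontal B-format signals, a virtual cardioid is just one choice of pattern, and the speaker angles are the usual ITU-style 5.0 layout.

```python
import math

def virtual_cardioid(W, X, Y, azimuth_deg):
    """Point a virtual cardioid microphone at azimuth_deg using the
    first-order B-format signals (Z omitted for a flat 5.0 layout)."""
    th = math.radians(azimuth_deg)
    return 0.5 * (math.sqrt(2) * W + X * math.cos(th) + Y * math.sin(th))

# ITU-style 5.0 speaker azimuths in degrees: L, C, R, Ls, Rs
speaker_angles = [30, 0, -30, 110, -110]

# A test source of amplitude 1.0 arriving from 30 degrees, encoded to B-format
src = math.radians(30)
W, X, Y = 1.0 / math.sqrt(2), math.cos(src), math.sin(src)

# All five speaker feeds derived from the same one recording, simultaneously
feeds = [virtual_cardioid(W, X, Y, a) for a in speaker_angles]
```

The left speaker, which points straight at the source, gets the full signal, and the level falls away smoothly towards the rears, all computed after the fact from a single microphone's output.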
The Soundfield is also fairly expensive compared to the five standard microphones you need for a traditional array. Expensive not only in terms of money but also in versatility: if you are an engineer who records bands and does close miking, the five standard mics can be much more useful to you than the single Soundfield mic.
Here is a photo of one of my recordings with the traditional array of five microphones in red and the single Soundfield in blue. (Very well drawn, eh?)
What my project is about.
The Soundfield may be much easier to set up, and it may be versatile after the recording in terms of changing its settings to fix issues, but does it sound as good as the traditional setup? Remember when I said that the rear microphones of a traditional array are placed further back, into the reverberant field? You can't do that with the Soundfield. The Soundfield is one microphone in the sense that it is a single enclosed unit; if you move it further back to get more reverb, you make the front more reverberant too and lose the clarity. Compromise! Additionally, the two concepts simply sound different to each other, not necessarily one worse than the other, but finding out what a sample of listeners thinks could help in deciding what to use or what to buy.
What I want to do is find out which is “better” by asking a sample of listeners. I intend to record a choir and a set of classical musicians with a Soundfield microphone and a traditional array simultaneously. Then I want to play sections of the recordings in a subjective listening test, where expert and not-so-expert listeners can sit in a surround sound listening room and decide which is their favourite, without knowing which is which. The result of this paired-comparison test will hopefully highlight which recording method is preferred. That said, they could be equally preferred, showing parity, and that would not be a bad result; all it would mean is that the engineer is faced with a choice of what type of sound they want, rather than one which could have an impact on listener enjoyment.
Thanks for reading, this was intended to be an informal and accessible look into the background of the project. More detailed and precise information can be found in the final dissertation here. If you are new to the concepts I outlined here and are interested, do let me know and I can guide you to more formal information. =)
I am left a bit inspired and overjoyed after watching the Sound City documentary, which was written and directed by Dave Grohl.
The documentary follows the Los Angeles based Sound City studios from its birth in the late 60s through to the present day and recent closure. It talks about the various musicians, producers and engineers who have worked at and been touched by the studios. The list of albums coming from this studio is amazing. The Kyuss album “Welcome to Sky Valley” and the Queens of the Stone Age album “Rated R” take special places in my heart, and since I found out that their sound comes from the Sound City attitude and ethic of doing things, I am left motivated in my own decisions regarding music production workflow. I do not want to give too much away, but there is one point I want to talk about in this blog: that attitude and ethic I mentioned.
The most thought provoking point it raises for me is the development and modern use of digital audio equipment compared to the Sound City days. The studio itself revolved around a very special Neve mixing console.
Having no real knowledge of the Sound City story before this documentary, I sighed and rolled my eyes when Pro Tools was mentioned. Happily, it went on to criticise the DAW. I am not going to start slagging Pro Tools off, because that is not what I mean; this line was one of the first indications of the theme of the documentary. That theme was that live, emotional and powerful music is what the aim of production should be, and the digital revolution can have, and has had, some very negative impacts on this through misuse. It outlined what digital audio workstations as a whole have allowed anyone to do, and to do cheaply. This could be a factor in production issues that we have had to deal with over the years, such as the loudness wars and overly produced, polished and edited music.
This is something I have believed for quite a while. It has never been easier to do some tracking and then edit or generally manipulate the material until it is deemed perfect. In essence, we are opening the floodgates to pushing this concept to an extreme.
Newbies, through no fault of their own, can be sucked into the “fix it in the mix” mindset. Word processing and laptops do this too: I can now say to people that I am a writer. I am a blogger; I write important opinion pieces and post them on a fantastic server. Well, I am not that great. I don't have perfect grammar, and I am sure the spell check has made some confusing corrections for me in this regard. And I am not for one moment going to tell anyone I am an authority on anything! Where I think my blog is decent, at the other extreme you can have utterly horrendous blogs which are just a means for someone to massage their ego, or which simply write about things badly. With fancy themes, they can be made to look great.
To bring what I am saying back into an audio context: it has never been easier for people to write what they feel and post it to an audience of possibly millions. Digital has also revolutionised the audio industry, allowing anyone to become a music producer. This dilutes the first principles which many people have worked extremely hard to develop and work with. This dilution is in some ways something every single one of us does; we all have phases of not knowing what we are doing, so we will always produce something which is simply not done the way it should have been.
In the audio world, vocals can be heavily auto-tuned, and bass and drum takes can be edited to the point of technical perfection, but all these examples result in musically deficient music. What this Sound City documentary told me was not to let the possibilities of digital pull me into a place where I mix and produce without the music in mind. Record live, overdub only what you really need to, and keep the energy intact.
A lot of people say click tracks kill the musicality of a piece. I would argue that recording music track by track does this, and as it happens, clicks are used a lot in that approach and get the blame. Something I have done a lot is use the smallest number of tracks I can. What I would love to do in the ideal recording setup is live, live and live, with minimal and only necessary overdubs. I'd keep the editing to a reasonable amount, learn when enough is enough, and tell the musicians we need to record it better. Okay, maybe that is just so my job is easier, but I can't help thinking that a song which could sound amazing spread across 24 tracks would sound much better than a 40-track traffic jam.
Does it have to be Analogue?
The documentary talks about how the Neve mixing board contributes to the music. Where I sometimes felt I was being told that the mixer was the reason, I ended up appreciating that it is the analogue way of recording that stitches musicians together. It is not because it was recorded on tape, and it is not because that particular mixing board was used as such. For me, it is because the limitations of a fully analogue system meant bands were recorded live, with overdubs only when really needed, and all I can say is that the music we hear being recorded this way in the documentary is just fantastic. It may not float everyone's boat, but as an engineer I can feel the feel. I can sense the energy, the spontaneity, the music. What needs to be realised is that this production is not about DAW bashing, and it is not an anti-technology group of people whining about the digital domain. It is an attempt to get the audience to appreciate the methodologies that the tape medium allowed, or probably forced upon us. Even though digital has made things extremely easy, we should not let these methodologies go.
So, for anyone reading, please don't get too caught up in the possibilities of digital. It is only going to offer us an ever-increasing number of track counts and plug-in instances. This documentary shows us all how it used to be done, and I think what we have to face is that maybe we reached the peak, musically, of how to record and mix just as the digital revolution began. Maybe all the digital revolution has done is allow us to build on and streamline aspects of the old way of doing things, but at the expense of letting more misuse and technique abuse in.
The title of this blog post is “back to basics” which, I hope you can appreciate, are not really basics at all. They are extremely complex recording and mixing techniques driven by experience and genius, techniques developed over years of work which demanded certain technological improvements that digital has delivered on.
Who are we to ask for unlimited track counts and millions of plug-in instances with surgical editing capabilities when we are clueless about recording and mixing in a fashion which has produced some of the best-sounding music ever? Who knows, maybe fighting the noise floor was a much more significant and positive development than any increase in track count or auto-tuner could ever hope to be.
Where does this change our development focus? Well, readers, where do you think?
Thanks for reading!