ABX Testing (and a new audio interface)
This is probably about as close to a political post as I am likely to write. I think that listening “tests” that are not conducted as double blind side-by-side comparisons are just wishful thinking. We wish that human hearing were not so totally dominated by the vagaries of our brain/mind, but it is. We wish that we could retain accurate mental images for more than a few seconds, but we can’t. We think we can discount the impact of small volume differences, but we can’t, and the smaller the difference the more likely we are to describe it as anything but a volume difference. We think we can trust our ears but all the evidence gathered from controlled experiments tells us plainly that we should not.
Since my interest in recording began only a few years ago, I’ve always had the internet as a resource for learning about the subject. I researched in every forum and magazine site I could find, and I now firmly believe that most of what I learned there was incorrect.
I should have been on my guard, because years ago in my pursuit of the playback side of audio I learned that uncontrolled listening tests are simply delusion at work, and that people routinely hear remarkable differences where there are none at all. But when I started trying to learn to record I was persuaded that different preamps and different a/d converters would make a night-and-day difference in my recordings. So I upgraded, then I upgraded again. When I started doing careful comparative listening, I realized that I wasn’t hearing these predicted major differences. In fact, I wasn’t hearing any difference at all.
Controlled Listening Tests
Since then I’ve tried to set up carefully controlled tests to compare gear. It’s not easy, at least for me. I seem to often miss some important detail in the setup, creating differences that shouldn’t be there. When I tried to compare three mic preamps I had the high pass filter (a low cut switch, in other words) active on one preamp. And when I tried to compare several field recorders, one recorder was set to record mp3s instead of waves, and once again the high pass filter was on. But I keep trying, and I’m getting a little better, I think.
ABX and foobar2000
ABX testing is a well established method for comparing two audio files (or other sources). A proper ABX test has only two items under test. The listener can take as long as they want, listen to either clip as many times as they want, go back and forth from the unknown X to the known A or B as often as they want. Then they state whether X is A or B. Not which they prefer, but simply which is which. Then the test is repeated for enough trials to achieve statistical validity.
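How many trials are "enough" can be put in numbers. The sketch below is my own illustration, not part of any ABX program: it assumes each trial is an independent 50/50 guess when the listener hears no difference, and the function names abx_p_value and min_correct are hypothetical. It computes the probability of scoring at least a given number correct by guessing alone, and the smallest score that gets that probability under a chosen threshold.

```python
from math import comb  # requires Python 3.8 or later


def abx_p_value(correct, trials):
    """Probability of getting at least `correct` right out of `trials`
    by pure guessing (one-sided binomial test, p = 0.5 per trial)."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    favorable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favorable / 2 ** trials


def min_correct(trials, alpha=0.05):
    """Smallest score whose guessing probability is at or below `alpha`.
    Returns None when even a perfect score is not enough."""
    for score in range(trials + 1):
        if abx_p_value(score, trials) <= alpha:
            return score
    return None


if __name__ == "__main__":
    # Example: 13 correct out of 16 trials.
    print(f"13/16 by guessing: {abx_p_value(13, 16):.4f}")
    print(f"Minimum score for 16 trials at 5%: {min_correct(16)}")
```

The 5% threshold is only a common convention. A short run of 4 trials can never reach it, since even a perfect score has a 1 in 16 chance of happening by luck, which is why a longer series of trials is needed.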
ABX was originally hardware based, complicated, and expensive. But if we limit our testing to existing audio files we can do ABX testing in software. Various programs that implement ABX testing of digital audio files have been around for a number of years. The original PCABX.COM site has been allowed to lapse, but some of the introductory material is still available here.
I found a nifty program that makes the ABX process technically very easy. foobar2000 is a terrific freeware audio player that includes an ABX utility.
A New Audio Interface
I’ve been happy with my LynxTwo-C audio interface for a number of years. It has worked reliably, and Lynx Studio has kept the drivers up to date and solid. But I’ve done a couple of sessions lately that could have used more inputs and more mic pres. The Lynx card offers some high powered expansion options, but I was also looking for a system that would integrate my monitor and headphone outs. I’ve been using a system that can only be called a kludge, although a successful one.
Meanwhile, the word on the M-Audio Profire 2626 has been good. I found a B-stock unit on eBay and bought it. I began by installing the Profire on a nearby computer, leaving the Lynx card in my audio system. And with both systems installed, it was clearly time to try to do some carefully controlled listening tests.
Dynamic Mic, Two Preamps, Two A/Ds
Small variations in volume can apparently be recognized, but the listener hears a quality difference rather than a volume difference. Richard Clark has conducted hundreds of blind tests of amplifiers and says that he adjusts volume to 0.01 dB accuracy, although most people can’t detect differences of 0.1 dB.
In my first test of two preamps into two different converters, in an effort to create files of equal volume, I started each file with a test tone, generated in Adobe Audition and played through the LynxTwo output. I adjusted the John Hardy M-1 and the Profire input gain to create a signal at -18 dB, measured by eye on each system’s software mixer. Then I left that gain setting for the musical recording. I had planned to make the final precise adjustment to the gain in Adobe Audition, but to my surprise the software was precise only to 0.1 dB. So in spite of my efforts, the samples are at slightly different levels.
In this test I used a dynamic mic, an Electrovoice RE15 connected through a Coleman Audio LS3, basically just a y-connector, to the two preamps. I recorded my solo acoustic guitar about 2 feet (0.6 meters) from the mic. This resulted in a very low signal and a tough test for the Profire preamp.
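For anyone who wants to match levels more finely than an editor’s 0.1 dB steps allow, the arithmetic is simple enough to do by hand. This is only a sketch of the idea, not what I did for these clips: it works on plain lists of samples scaled to the range -1.0 to 1.0, it treats the -18 dB tone as a peak level (an assumption, since meters differ), and all of the function names are made up for the illustration.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def level_db(samples):
    """RMS level in dB relative to a full-scale amplitude of 1.0."""
    return 20.0 * math.log10(rms(samples))


def sine_tone(peak_db, freq_hz=1000.0, rate=44100, seconds=1.0):
    """Generate a sine test tone whose peak sits at `peak_db` dBFS."""
    peak = 10.0 ** (peak_db / 20.0)
    count = int(rate * seconds)
    return [peak * math.sin(2.0 * math.pi * freq_hz * i / rate)
            for i in range(count)]


def matching_gain_db(reference, other):
    """Gain in dB to apply to `other` so its RMS level equals `reference`."""
    return level_db(reference) - level_db(other)


def apply_gain(samples, gain_db):
    """Scale every sample by the linear equivalent of `gain_db`."""
    factor = 10.0 ** (gain_db / 20.0)
    return [s * factor for s in samples]


if __name__ == "__main__":
    tone_a = sine_tone(-18.0)
    tone_b = sine_tone(-18.07)  # hypothetical mismatch smaller than 0.1 dB
    gain = matching_gain_db(tone_a, tone_b)
    matched = apply_gain(tone_b, gain)
    print(f"Gain needed: {gain:.3f} dB")
    print(f"Residual difference: {matching_gain_db(tone_a, matched):.6f} dB")
```

In practice the gain would be measured on the test tone at the head of each file and then applied to the whole recording, so that the music itself never influences the match.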
Here are a couple of clips that are easy to tell apart. I had hoped that the preamps on the M-Audio Profire 2626 would replace my faithful John Hardy M-1, but if you listen to the end of these clips you’ll hear a lot more noise in one clip – that’s the Profire.
download 090420Test1A.wav
download 090420Test1B.wav
But what if we trim off the end of the clip? Can you still tell the two recording chains apart?
download 090420Test2A.wav
download 090420Test2B.wav
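If you would rather trim the noisy tail yourself, any audio editor will do it, and so will a few lines of Python. The sketch below is not how I prepared the clips above; it uses Python’s standard wave module, which handles ordinary PCM WAV files, and the function name trim_tail and the 5 second figure in the usage comment are only placeholders.

```python
import wave


def trim_tail(src_path, dst_path, seconds_to_drop):
    """Copy a PCM WAV file, leaving off the last `seconds_to_drop` seconds.
    Returns the number of frames kept."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        drop = int(seconds_to_drop * src.getframerate())
        keep = max(0, src.getnframes() - drop)
        frames = src.readframes(keep)
    with wave.open(dst_path, "wb") as dst:
        # Same channels, sample width and rate as the source; the frame
        # count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return keep


# Hypothetical usage, with an arbitrary amount trimmed:
# trim_tail("090420Test1A.wav", "Test1A_trimmed.wav", 5.0)
# trim_tail("090420Test1B.wav", "Test1B_trimmed.wav", 5.0)
```

Trim both clips by the same amount so that the only remaining difference between them is the recording chain.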
Condenser Mic, One Preamp, Two A/Ds
I’ve seen quite a few debates about the audibility of a/d converters. Many people posting on the internet state as fact that prosumer level converters can’t compare with high end devices. And many prefer the sound of recordings made at high sample rates, insisting that they sound better even after conversion to the CD standard 44.1/16 format.
This time I used a Rode NT2a into the John Hardy M-1, then the Coleman LS3 to split the signal to the line inputs of the LynxTwo and Profire 2626. Even with both units at nominal line level (+4 dBu) there were small volume differences. Surprisingly, the unit with the longer cable run was louder. So once again we have slightly different volume levels that may make our ABX testing less valid.
download 090420Test3A.wav
download 090420Test3B.wav
How To ABX
Start by downloading the clips above. Save them in a place you can find, like your music folder or your desktop. Maybe create a folder for this project.
* * Edit – March 17, 2012
When you download foobar2000 you get the basic package here: http://www.foobar2000.org/download. In order to include the ABX comparator in your installation, go to the bottom of the page to Browse official components and follow the link. The ABX comparator utility is the first item offered. At the bottom of this page you’ll find How to install a component? which will take you to a page of instructions for that purpose.
Start foobar2000 and open a pair of the test clips. The clips are named Test1A and Test1B, etc. Select both clips, right click, choose the Utils menu item, and there you’ll find ABX. Here’s a video that demonstrates the use of foobar2000 and its ABX comparator:
I hope some of you will download these samples and foobar2000 and conduct your own test. I’d be interested to hear the results of any ABX tests you conduct. Please contact me through the comments section with your results. I’ll post the keys to the samples in a future update. Let’s say, 2 weeks after this entry. (Mean, huh?)
I also hope you’ll make your own controlled comparisons and do your own ABX testing of preamps, converters, DAWs, cables, and other odds and ends of audio gear. Perhaps we can all learn something.
This entry was posted on Wednesday, April 22nd, 2009 at 7:36 pm and is filed under Audio, Comparisons, Tutorials.
Vladimir said in post # 1,
on April 29th, 2009 at 11:45 pm
Dynamic Mic, Two Preamps, Two A/Ds
2A and 2B: samples are quite similar, but still abx’d 100% times correctly
3rd test is of course incorrect, because of volume difference, perhaps it is better to normalize samples in audio editor. Other than volume maybe there is subtle difference in attack or strings details
Fran Guidry said in post # 2,
on April 30th, 2009 at 11:06 am
Vladimir, thanks so much for listening. Can you help me hear the difference between 2a and 2b? I have been unable to ABX them as different. Was there a particular section of the music you listened to?
I checked the 3a and 3b clips and couldn’t hear a level difference. I used Adobe Audition to normalize the files to some approximation of equal loudness. Can you confirm that you heard significant level differences between these two clips?
Thanks for stopping by,
Fran
Homebrewed Music − Question and Answers said in post # 3,
on May 14th, 2009 at 5:04 pm
[…] post will reveal the identity of the comparison clips in the post comparing the M-Audio Profire and the Lynx and John Hardy recording chain. But before providing the answers, I’d like to pose a […]
Doug Young said in post # 4,
on May 18th, 2009 at 8:24 pm
hey Fran, finally got around to trying this. On the Dynamic Mic, Two Preamps, Two A/Ds, I scored a 10% probability, so not incredible, but I do hear a difference, tho to me it’s so close I have to be really focused to hear it. To me B, is a little fuller, and warmer. Whether that’s because it’s the better channel and less harsh, or the worse channel with lesser frequency response is hard to say. Now I’ll go on and look at the answers!
Homebrewed Music − Mic Comparison – a Tutorial said in post # 5,
on June 25th, 2009 at 1:57 pm
[…] foobar2000 audio player offers one solution, with the ABX testing utility built-in, as described in this blog post. This is a powerful tool, because it not only offers a way to test clips double blind, it helps us […]
Howard Barnum said in post # 6,
on December 10th, 2010 at 3:19 pm
Hello—
I’ve been enjoying your site, finding some of the clips useful.
I’ve installed foobar, but the only option I’m given after rightclicking a pair of selected files is “save as playlist”. If I save them as a playlist, load the playlist, and select the files and rightclick…same thing.
Can you help? I’d really like to ABX these (and other things!).
Thanks!
Howard Barnum said in post # 7,
on December 10th, 2010 at 3:46 pm
Hi again—
I resolved the foobar issue. Although I installed with “full” (and reinstalled after the problem to be sure), foobar_abx was not installed. If you google “foobar abx” you get a link to a place on the website where you can download it. When you unzip it you get a file foobar_abx.dll; move this into the “components” directory in the foobar2000 directory in the Program Files directory…. then open foobar and it works as Fran has described.
I got 5 correct guesses out of 5 comparing two files I downloaded from this site… but now I will have to track down what those were, as they aren’t the above .wav files! Will give the above files a try anon.
Ben said in post # 8,
on March 10th, 2011 at 1:56 am
Hi
These results are interesting, and indeed I could pick the hardy due to the noise floor of the m-audio being at prosumer levels. Thanks for doing these comparisons, it is interesting. The guitar playing is great too btw! its nice to have a nice sounding guitar and some good playing instead of the usual comparison a/b clips which don’t often feature good musicianship!
I’ve owned both the Lynx and the 2626. I record more than just acoustic guitars at once and found that as soon as you start layering tracks with the m-audio, the results start to become a lot clearer about what piece of equipment is professional and which is not.
I also think that, like you say elsewhere, the biggest difference anyone can make to their recording chain is in room treatment. However, I think it is important to note that going back in the logical order in the chain to microphone, then preamp, then converter all make differences. When added up, the weakest link in this chain is always apparent in some way. If, for example, you recorded an acoustic guitar track longer and with dynamic changes than the sample you provided with the profire, we would be stuck with the noise. Sure, you can gate some of it off, but you would be eating into the level of detail in the recording. I’m sure if you didn’t close mic the two recordings there would also be a more significant difference as the noise floor and details are often in the subtleties of the room.
If you listen with a high quality pair of headphones (say senn hd600s or similar) directly to audio (say a well recorded track such as Dire Straits ‘Money for Nothing’ or similar) played back through the profire, and a/b it with audio played back through the lynx, then you will hear a big difference in terms of conversion quality. I did. If I still had the profire I’d be interested in doing the blind test and report back. The Lynx D/A is not by any means great as a d/a but it is acceptable. The profire in comparison sounds like it has higher distortion (it measurably does), more noise (it measurably does), more jitter (it measurably does) and it simply sounds more veiled and congested. There are actually measurable and objective ways of describing this that are well established – THD +N specifications, Dynamic Range specifications, Signal to noise ratios. All of these things make a difference and the better the specs, usually the higher the price (no surprise).
Regards
Fran Guidry said in post # 9,
on March 17th, 2012 at 11:46 am
I should have addressed this a long time ago, but let me pick this up anyway.
First, stating that you hear a difference is not the same as demonstrating that you hear a difference. If you ABXed the difference successfully in 13 out of 16 tries, I’ll agree that you can tell them apart. Otherwise, I politely decline to accept your statement as fact.
Your suggestion that all differences in specs result in audible differences simply isn’t borne out by research into human hearing. There are known limits to our ability to recognize changes in distortion, frequency response, and noise. Once those limits are passed, human beings can’t recognize any further change.
And in fact the color of the box or the label on the front or your neighbor’s opinion or a thread on Gearslutz will have a lot more impact on your perception of the sound of a device than a change from .001 THD to .00001 THD. Thus the need for double blind level matched same performance comparisons. Anything else is just opinion.
Fran