By Noah Mintz, Senior Mastering Engineer, Lacquer Channel Mastering & Railtown Mastering
I attended the AES Immersive conference and here are my takeaways:
Immersive content cannot be mastered. At least not yet, and not in the way we traditionally think of mastering. This is more than evident if you listen to, for example, the ATMOS mix of Ed Sheeran - Overpass Graffiti from his new album. The ATMOS version sounds completely different from the stereo version. Not just the mix but the sounds of the instrumentation. It’s not necessarily worse but, like Justin Bieber and The Kid LAROI’s Stay, the ATMOS version lacks the sonic impact of the stereo version.
If there ever was a case for the importance of mastering, immersive audio for music is making it. We’ve been listening to professionally mastered music for the past 60 years. There is a reason why 90% of all the songs in the top 100 of all genres, save classical, have known professional mastering engineers and studio credits. Mastering is important for the enjoyment of music. Of course, I’m biased on this, but listen to the immersive and stereo versions back-to-back and it becomes self-evident.
Old white guys are into immersive audio technology. I see the irony of me writing this as I’m an old white guy, but it would have been nice to see AES include more people of colour and women in the conference. I want to hear those voices. If this technology is to be adopted widely by audio engineers and ultimately consumers, it can’t just be propagated by the same nerdy old boys’ club of audio engineers.
Once you hear immersive audio in a proper surround studio you’re a convert. I’m guilty of this. I’ve had the pleasure of listening to music in a few 7.1.4 (12-speaker) ATMOS studios and it’s incredible. Truly immersive. Once you’ve heard this or worked in that environment, you tend to forget how limited the binaural experience is, and binaural is the way most people will hear immersive. I know we’re all monitoring in binaural but the decisions are being made in a surround speaker environment and, frankly, they are just not translating to binaural. Also, I think binaural listening presents only a small fraction of the immersive experience. Even the Sony and new Apple headphones, which are specifically made for spatial sound, don’t capture anything near the experience of a proper surround speaker set-up. For this format to become viable, more people need to be able to hear, experience, and own a proper surround speaker array. That, or there needs to be a major change in immersive headphone technology that doesn’t just rely on a virtual/perceptual experience.
Interactive music is coming. Soon listeners will be able to create their own mixes of music. This, at first, seems like a nightmare but I actually think it’s pretty cool. The artist and engineers get to decide, at the authoring stage, how much listeners can manipulate the mix. So if U2 decide that you can’t completely take The Edge’s guitar out of the mix, you won’t be able to. Even so, this will provide a whole new way to listen to music that I think will actually enhance the listening experience. Imagine listening to a jazz piece and then being able to listen to it again but with only the trumpet, hearing the instrument in all its naked glory. That’s exciting to me and to any instrumentation fan. Again, we have the potential problem of it not being properly mastered, but I believe this will be figured out at some point.
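For the technically curious, here is a minimal sketch of how authored limits on a listener's mix might work. It is purely illustrative: the `StemLimits` type, the stem names, and the dB ranges are hypothetical and are not taken from any real interactive format or tool. The idea is simply that each stem carries an allowed gain range set at the authoring stage, and whatever the listener asks for is clamped to that range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StemLimits:
    """Gain range in dB that the artist allows for one stem (hypothetical)."""
    min_db: float
    max_db: float


def apply_listener_gain(requested_db: float, limits: StemLimits) -> float:
    """Clamp the listener's requested gain to the authored range."""
    return max(limits.min_db, min(limits.max_db, requested_db))


# Hypothetical authored limits: the guitar can be turned down by at most
# 12 dB, while the trumpet can be pushed down (or soloed) almost freely.
limits = {
    "guitar": StemLimits(min_db=-12.0, max_db=3.0),
    "trumpet": StemLimits(min_db=-120.0, max_db=6.0),
}

guitar_gain = apply_listener_gain(-60.0, limits["guitar"])    # clamped
trumpet_gain = apply_listener_gain(-60.0, limits["trumpet"])  # allowed as-is
print(guitar_gain, trumpet_gain)
```

In this sketch the listener who tries to pull the guitar down by 60 dB only gets the 12 dB cut the artist allowed, while the same request on the trumpet goes through.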
There is no single immersive format. Currently, the biggest formats are ATMOS and Sony 360 Reality Audio. Sony’s just might be the superior format but, as with Betamax vs. VHS, it looks like ATMOS is winning out. There is also MPEG-H to add to the mix, which seems super promising and inexpensive. There are also multiple DAW surround panning software platforms, each with its own protocols. Usually, variety is good but there is going to have to be some standardization if immersive is going to be adopted by consumers. Even with DVD-A and SACD, manufacturers eventually came out with players that did both. So for this to survive, streaming platforms are either going to have to adopt all formats or the technology companies are going to have to agree to output in a generic, 100% compatible format that works with all platforms.
I haven’t been into or interested in this latest round of surround music for very long. When it reared its head a few years ago I had visions of 5.1 from the late 90s and how I almost invested $100,000 in a surround mastering setup only to, thankfully, nix the idea at the 11th hour. 5.1 surround as a music format all but failed. I loved 5.1 but it was better suited to film and television. 5.1 mixes offered little beyond higher resolution and some spatial content, but at least they were mastered. Some of those releases are still incredible and if you ever get a chance to hear a proper 5.1 SACD or DVD-A in a well set-up living room, I highly recommend it. Joni Mitchell’s 2000 release Both Sides Now is an excellent example.
This latest round of surround is very different. Unlike 5.1, it’s fully scalable to over 100 speakers. It’s object-based, so what gets sent to a speaker or two is not a channel; it’s an instrument, vocal, or sound that exists in 3D space, and the playback system reproduces that position no matter how many speakers there are. It has the potential to create new ways of mixing, new ways of creating music. It puts the power of surround in creators’ hands, especially with the new version of Apple Logic Pro natively including an ATMOS panner and renderer and MPEG-H having free tools. The biggest problem and biggest challenge is playback. Binaural is not currently an acceptable playback solution, especially since a proper surround set-up sounds infinitely better. Binaural, to me, doesn’t offer a huge improvement over stereo. In fact, with the lack of mastering for immersive music being an issue, I’d argue that there is no improvement at all; quite the opposite. I’m excited to see where this technology will go and I’ll be participating in its creation, but until it’s designed around giving the average consumer a better experience than stereo, it’ll just be a niche format doomed to follow in 5.1’s dried-up footsteps.
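As a footnote on what "object-based" means in practice, here is a toy sketch. It is not how the ATMOS, Sony, or MPEG-H renderers actually work; it is a simplified constant-power pan on a flat ring of speakers, and the function name `render_gains` and the angle convention (degrees, 0 = straight ahead) are my own assumptions. The point it illustrates is that the mix stores where a sound is, and the gains for each speaker are worked out at playback for whatever layout is present.

```python
import math


def render_gains(object_azimuth_deg, speaker_azimuths_deg):
    """Return one gain per speaker for a sound object on a horizontal ring.

    Toy constant-power panning between the two speakers on either side of
    the object. Assumes at least two speakers at distinct angles.
    """
    n = len(speaker_azimuths_deg)
    wrapped = [a % 360.0 for a in speaker_azimuths_deg]
    if n < 2 or len(set(wrapped)) < n:
        raise ValueError("need at least two speakers at distinct angles")

    # Walk the speakers in angular order so neighbours are adjacent.
    order = sorted(range(n), key=lambda i: wrapped[i])
    obj = object_azimuth_deg % 360.0
    gains = [0.0] * n

    for k in range(n):
        left = order[k]
        right = order[(k + 1) % n]
        span = (wrapped[right] - wrapped[left]) % 360.0
        offset = (obj - wrapped[left]) % 360.0
        if offset <= span:
            # The object sits between these two speakers.
            frac = offset / span
            gains[left] = math.cos(frac * math.pi / 2)
            gains[right] = math.sin(frac * math.pi / 2)
            break
    return gains


# The same object (dead centre) rendered to two different layouts.
stereo = render_gains(0.0, [-30.0, 30.0])
quad = render_gains(0.0, [45.0, 135.0, 225.0, 315.0])
print([round(g, 3) for g in stereo])
print([round(g, 3) for g in quad])
```

The object's position never changes between the two calls; only the speaker list does. In the stereo layout both speakers share the sound equally, and in the four-speaker layout the two front speakers share it while the rear pair stays silent.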