A recent online discussion about “32-bit Float” audio reminded me that users are terrible designers.
Background: Most of our audio in current cameras is recorded as 24-bit linear of some sort. That has implications for where the signal level is set and what happens when things get too quiet (noise increases) or too loud (distortion ensues). 32-bit float is a different recording approach that provides a broader signal range that can handle both quiet and loud without user adjustment.
Note the last three words in my background: without user adjustment.
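For the curious, here’s a minimal sketch of why the number format matters. It only models the storage format, not a real recorder: the sample values, the gain, and the helper names (`to_int24`, `to_float32`) are my own assumptions for illustration, and it ignores everything that happens in the analog stage before the numbers get written.

```python
import struct

INT24_MAX = 2**23 - 1   # largest positive 24-bit signed sample
INT24_MIN = -2**23      # most negative 24-bit signed sample


def to_int24(x: float) -> int:
    """Quantize a sample (1.0 = full scale) to 24-bit linear, clipping at full scale."""
    return max(INT24_MIN, min(INT24_MAX, round(x * 2**23)))


def to_float32(x: float) -> float:
    """Round-trip a sample through IEEE 754 single precision (32-bit float)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical "bagpipes" peak: four times full scale, about 12 dB over.
loud = 4.0
gain = 0.25  # turning the level down afterwards, in the edit

fixed_after_gain = to_int24(loud) / 2**23 * gain   # peak was already flattened
float_after_gain = to_float32(loud) * gain         # peak survives intact

# Hypothetical "low talker" sample: far below one 24-bit step.
quiet = 1e-9
fixed_quiet = to_int24(quiet)      # rounds away to silence
float_quiet = to_float32(quiet)    # still a nonzero value

print(fixed_after_gain, float_after_gain)
print(fixed_quiet, float_quiet)
```

In the 24-bit case the loud peak is clamped at full scale when it’s stored, so lowering the gain later just gives you a quieter flattened peak. In the float case the value is stored as is, and the level can be pulled down after the fact.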
The problem with the 32-bit float discussion is that it quickly descended into technical bits and pieces, and technical descriptions by people who aren’t audio engineers, to boot. Which meant that all kinds of bonkers theories and assertions were being made. The majority of the participants said they didn’t want 32-bit float and didn’t know why any company would pursue it.
Well, the answer to that is easy: solve a user problem.
I’ve been setting audio levels manually for 54 years. It’s a pain in the butt because if you set levels too low, you get audible hiss, but if you set them too high it’s almost guaranteed that someone will get close to the microphone with bagpipes and create absolute garble. (Legal Disclaimer: “bagpipes” is used as an example only. Your audio may be ruined by any number of other loud sounds created by anything alive, electronic, or mechanical. “Garble” means that the loud sound won’t be properly recorded. Warning: loud sounds may have a negative impact on your short- and long-term hearing and should be avoided, if possible.)
What I as a user want to do is set a reasonable signal level and then not have to monitor it constantly for fear of bagpipes or a Seinfeld-esque low talker.
Now, there are engineering issues with using 32-bit float. If you use terrible pre-amps, as most of our cameras do, being able to record that low talker will still produce problems, typically what most of you would call hiss (because the signal-to-noise ratio is poor). But that’s not a problem with 32-bit float, it’s a problem with the hardware components you used ;~). In other words, someone still has to solve an engineering problem, not a user problem, but it’s not the folk who designed the 32-bit float recording system, it’s your camera maker.
Users generally don’t recognize solutions to their problems until they’re pointed out to them. Worse still, many don’t even see a problem in the first place. I’ve had plenty of students proudly show me images they thought were outstanding that were improperly exposed or focused, for instance. When I point out that problem, they immediately want a solution. This is akin to the reason why we got automatic exposure and autofocus systems in the first place. Some recognized the problem on their own, others had it pointed out to them, but once seen, everyone wanted the problem solved!
Had Tascam called their solution AI Auto Audio levels (AiAA [Old McDonald had a farm, A I A A A ;~]), the cheering would have been louder than the jeering. Because everyone knows what the user problem is. But by Tascam trying to instead primarily market a 32-bit float technology, well, that was destined to fail. This is the reason why you shouldn’t have engineers doing the product marketing in your company ;~). “AI Auto Audio is created using a new underlying technology, 32-bit float” is a lot different than “32-bit float has a dynamic range of 1528 decibels.”
Almost immediately I saw people ask why we need a dynamic range of 1528 decibels (a bit like a visual dynamic range of 255 stops). And argue about how accurate such a range could be. Not. The. Point.
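For anyone who wants to check the stops comparison, here’s a rough back-of-the-envelope sketch. It assumes a “stop” corresponds to one doubling of signal amplitude (about 6 dB), which is my reading of the analogy rather than a formal definition, and it takes the 1528 decibel figure as quoted rather than deriving it.

```python
import math

# One doubling of amplitude expressed in decibels: 20 * log10(2), about 6.02 dB.
DB_PER_DOUBLING = 20 * math.log10(2)


def db_to_doublings(db: float) -> float:
    """Convert a dynamic range in decibels to a count of amplitude doublings."""
    return db / DB_PER_DOUBLING


float32_range_db = 1528                 # figure quoted in the marketing
int24_range_db = 24 * DB_PER_DOUBLING   # theoretical range of 24-bit linear

print(f"24-bit linear: about {int24_range_db:.1f} dB")
print(f"32-bit float:  about {db_to_doublings(float32_range_db):.0f} doublings")
```

Under those assumptions the quoted figure works out to roughly 254 doublings, in the same ballpark as the stops comparison above, against roughly 144 dB for 24-bit linear.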