Future History

This column has traditionally been about the history of surround sound, but what about its future? If we are still around in 50 years, *NSync will be old men, and who knows what the economy will be like? With human lifetimes routinely extended, more free time should be available for pursuits such as music and movies. So the future of surround sound is bright: it is an engaging experience, and people should have more free time in which to engage with it. But just what will it be like?

In the recording field, the idea of capturing a soundfield completely, in all its magnificent spatial detail (called "analytic" capture in the technical literature), might finally be headed for some fruition. As Michael Gerzon pointed out, it would take about a million channels to fully capture one space and reproduce it in another in all its detail, so that point-by-point measurements or listening in the reproduction space would match those of the corresponding recording space perfectly. Reducing the problem through a better understanding of psychoacoustics, together with active measures that track where the listeners are in the room and produce smart responses from the probably myriad transducers needed, may begin to make it solvable.

Jim Johnston of Bell Labs gave a paper at AES last September in which he described some of these matters, calling for many channels to get the soundfield right in the vicinity of one listener’s head. From the abstract: “The ultimate form of this is, of course, binaural recording, where an actual head model is used to capture the information for one head location. Beyond 2-channel presentation, one can think of analytically capturing an original soundfield to some degree of accuracy. This would require the use of many channels, perhaps, in the simplest form, placed in a sphere about the listener’s head, requiring very high data rates (1000 to 10 000 channels, perhaps) and creating a very high probability of influencing the soundfield in the space with the microphones and the supporting mechanisms. As a result, this technique is currently infeasible, and is likely to remain infeasible, for basic physical reasons as well as data-rate reasons, and actual analytic capture of the spatial aspects of a soundfield in this fashion is unlikely.”

In 50 years, perhaps those data-rate problems will have been solved, and perhaps microphone technology will have advanced enough that the needed array is transparent enough not to disturb the field. On the other hand, we're awfully close to the theoretical noise floor of microphones now, given how small they must be to avoid contaminating the soundfield with their presence. So that is the harder problem. Stay tuned….

How about theoretically reproduced soundfields, that is, ones synthesized to behave like real sources in space? Already we see experiments in Europe and Japan, and patents in the U.S., based on the idea of an audio pixel: a regular array of transducers on a two-dimensional grid meant to reproduce, say, the expanding bubble of sound from a source by driving hundreds or thousands of transducers appropriately. It can be seen as taking the wall of microphones and loudspeakers theorized by Snow at Bell Labs around 1934 and extending it into another dimension. Current systems of this type are plagued by the fact that audible sound covers 10 octaves, a wavelength range from about 56 feet down to 3/4 inch. Making transducers small enough to avoid combing effects at high frequencies, yet with enough excursion to handle the low end of the spectrum, is currently intractable, but it might not be so in 50 years.
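To see where those wavelength figures come from, here is a back-of-the-envelope check, assuming a nominal speed of sound of about 1125 feet per second in room-temperature air (the exact value varies with temperature):

```python
# Back-of-the-envelope wavelength check for the extremes of the audible band,
# assuming a nominal speed of sound of about 1125 ft/s in room-temperature air.
SPEED_OF_SOUND_FT_S = 1125.0

def wavelength_ft(frequency_hz: float) -> float:
    """Wavelength in feet at a given frequency."""
    return SPEED_OF_SOUND_FT_S / frequency_hz

for f_hz in (20.0, 20_000.0):
    w = wavelength_ft(f_hz)
    print(f"{f_hz:>8.0f} Hz -> {w:7.3f} ft ({w * 12:6.2f} in)")
# 20 Hz works out to about 56 ft; 20 kHz to about 0.68 in, roughly 3/4 inch.
```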

More ordinary multichannel recording with spaced microphones, tracking, and overdubbing will of course be affected by technical progress, but we are reaching the point of diminishing returns on increases in bandwidth and dynamic range of capture, so improvements in the spatial illusion through the use of more channels seem certain. We are also reaching the limit of the human ability to understand what all those knobs do, and future console designs will increasingly respond in more direct ways to what is desired. For instance, to synthesize a sound moving away from you in a reverberant space today, most consoles and systems require you to manipulate many factors at once: level, EQ, and reverb at least come to mind. While today all of these are automated, it is still complicated (and hard to teach!). Consoles with multiple interlocking controls that produce a desired effect, like the changes that occur when an actor turns her head, come to mind. Plug-ins for such console/editing systems will be increasingly industry-specific, such as the one just mentioned. Consoles and editing systems in film and television will become better integrated, with backward and forward documentation of changes.
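As a purely illustrative sketch of that kind of interlocking control (not any particular console's architecture; the parameter names and curves below are assumptions), a single "distance" macro might drive the lower-level settings an operator now adjusts by hand:

```python
import math

def distance_macro(distance_m: float) -> dict:
    """One high-level 'distance' control mapped onto the low-level parameters
    an operator would otherwise set individually. The curves are placeholders,
    not any real console's law."""
    d = max(distance_m, 1.0)
    return {
        # Direct sound falls roughly 6 dB per doubling of distance.
        "direct_level_db": -20.0 * math.log10(d),
        # Air absorption dulls the highs with distance; a crude placeholder shelf.
        "hf_shelf_db": -0.3 * d,
        # The reverberant field stays roughly constant, so holding the send
        # steady while the direct level drops pushes the sound into the room.
        "reverb_send_db": -6.0,
    }

# Dragging the single control from 1 m to 20 m moves all three settings together.
for d in (1.0, 5.0, 20.0):
    print(d, distance_macro(d))
```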

We’ve talked in editorial columns before about the rapidly dropping price of storage. Warner Bros. recently gave a sound-department tour that showed off its 4-terabyte sound library (that’s 4096 GB). This is the capacity I calculated would be needed back in 1982, when such numbers seemed like jumping over the moon; just 18 years later, it’s here. Somehow that sounds like a long time, but it wasn’t. The price of storage in 50 years should be so low that it will no longer be any kind of issue.

The real innovations will be in how we catalog, find, reuse, and avoid overexposing the sounds. Ben Burtt gave whimsical names to effects, names that were impressed upon many other sound editors in Northern California, and I’m sure the same is true in Hollywood. But there is a limit to what you can keep in your head, and naming and finding effects in the future will be a formidable task. Already, far-sighted companies like Comparisonics use a variety of spectral and other means to locate sounds in a database.

Findsounds.com searches the Web for sounds based on descriptions, and soon you’ll just make a noise with your mouth into your machine and it will locate similar sounds. Innovations will come as much in how humans relate to computers as in the computers’ expanding power.
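A minimal sketch of the general idea behind such searching (not Comparisonics’ or Findsounds.com’s actual method): reduce every sound, including a query hummed into the machine, to a coarse spectral fingerprint and rank the library by distance to it. NumPy is assumed to be available.

```python
import numpy as np

def spectral_fingerprint(samples: np.ndarray, n_bands: int = 16) -> np.ndarray:
    """Collapse a mono signal into a coarse, level-normalized band-energy fingerprint."""
    spectrum = np.abs(np.fft.rfft(samples)) ** 2        # power spectrum
    bands = np.array_split(spectrum, n_bands)           # coarse frequency bands
    energy = np.array([band.sum() for band in bands])
    return energy / (energy.sum() + 1e-12)              # normalize overall level away

def rank_by_similarity(query: np.ndarray, library: dict) -> list:
    """Return library sound names ordered from most to least similar to the query."""
    q = spectral_fingerprint(query)
    distances = {name: np.linalg.norm(q - spectral_fingerprint(sound))
                 for name, sound in library.items()}
    return sorted(distances, key=distances.get)

# Example: a hummed query ranked against a tiny pretend library of mono arrays.
# rank_by_similarity(hummed_query, {"door_slam": slam, "wind": wind, "beep": beep})
```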

Next time, sound reproduction systems of the future.