MIT research raises international audio standard for computers, Internet

Breakthrough audio technology developed at MIT's Media Laboratory is a key part of the forthcoming MPEG-4 International Standard.

The Media Lab contributions will dramatically boost the performance levels of computer sound, allowing CD-quality stereo music to be played on PCs and transmitted through the average user's modem. As a result, richer, more carefully tailored music, sound and audio effects can be incorporated into a new range of multimedia content.

MPEG, the Moving Picture Experts Group, is a working group of the International Organization for Standardization (ISO) that develops industry standards for compressing, processing, coding and transmitting audio and video. These standards are used worldwide as a common blueprint for the design, development and manufacturing of audio software and hardware components.

The MPEG-4 standard will be released in October 1998 and formally become an international standard in December. Last month, the final committee draft was completed. This milestone indicates that all parts of the specification, including the Media Lab's contributions, will proceed into the final standard. The current draft standard will change little before completion.

"The contributions the Media Lab has made to MPEG-4 are a crucial part of the audio tool set, and represent a fundamental advance in audio standardization," said Leonardo Chiariglione, MPEG convener and chairman.


The Media Lab's new approach to sound processing, called Structured Audio, represents the first time that sound synthesis methods have been incorporated into an international standard.

Structured Audio is a powerful set of specifications for the description and transmission of sound. Existing audio standards represent sound as a stream of bits that encodes the audio signal itself. In Structured Audio, content is instead stored and delivered as a computer program in a flexible language, then translated into sound on the user's computer. Because a program that describes how to synthesize a sound is typically far smaller than the encoded signal, this method enables a radical increase in the quality and efficiency with which sound is delivered.
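The idea can be illustrated with a toy sketch in Python. This is not the MPEG-4 Structured Audio language or decoder: the `SCORE` format, the `render` and `to_pcm16` helpers, and the simple sine-wave synthesis are all assumptions made up for illustration. The sketch only shows that a short description of a sound can be transmitted in place of the much larger sampled signal and turned back into audio on the receiving machine.

```python
import json
import math
import struct

SAMPLE_RATE = 44100  # CD-quality sampling rate, in samples per second

# Hypothetical compact description of a sound: one entry per note,
# given as (start_seconds, duration_seconds, frequency_hz).
SCORE = [
    (0.0, 0.5, 440.00),
    (0.5, 0.5, 493.88),
    (1.0, 1.0, 523.25),
]


def render(score, sample_rate=SAMPLE_RATE):
    """Synthesize the score into a list of float samples in [-1.0, 1.0]."""
    total_seconds = max(start + duration for start, duration, _ in score)
    total_samples = int(round(total_seconds * sample_rate))
    samples = [0.0] * total_samples
    for start, duration, freq in score:
        first = int(round(start * sample_rate))
        count = int(round(duration * sample_rate))
        for i in range(count):
            if first + i >= total_samples:
                break
            t = i / sample_rate
            envelope = 1.0 - i / count  # linear fade-out over the note
            samples[first + i] += 0.5 * envelope * math.sin(2 * math.pi * freq * t)
    return samples


def to_pcm16(samples):
    """Pack float samples as little-endian 16-bit mono PCM bytes."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    return struct.pack("<%dh" % len(clipped), *(int(s * 32767) for s in clipped))


# What would be transmitted (the description) versus what the listener
# hears (the audio rendered locally from that description).
description = json.dumps(SCORE).encode("utf-8")
pcm = to_pcm16(render(SCORE))

print(len(description), "bytes of description")
print(len(pcm), "bytes of rendered 16-bit mono PCM")
```

In this toy case the description occupies well under 100 bytes, while the two seconds of 16-bit mono audio it produces occupy 176,400 bytes. The comparison is only indicative: real compressed audio is far smaller than raw PCM, and a real synthesis program is far richer than three sine notes.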

"Structured Audio points the way to a more powerful common platform for sound processing," said Professor Barry Vercoe, head of the Media Lab's Machine Listening group and leader of the Structured Audio research project. "By incorporating these findings into an accepted international standard, we can ensure that musicians, producers and PC users around the world can benefit from this research."

Until now, sound transmission standards have left content developers and musicians struggling with low-quality "AM radio" sound, limited interactivity and long download times. Similarly, designers of CD-ROM, game and multimedia content have been hamstrung by the low-quality, low-functionality sound cards on the customer's desktop.

The MIT technology greatly improves the sound quality of multimedia applications, enabling musicians, virtual-reality designers, and game and content developers to create high-quality, interactive synthetic music and sound environments that can be easily transmitted across the Internet.


The performance levels achieved through the MPEG-4 Structured Audio method will boost new forms of composition and commerce. Composers of popular music styles such as house music, rave music, techno and electronica will be able to efficiently sell high-quality compositions directly to listeners via the Internet.

Interactive movies and virtual-reality experiences containing music, sound effects and dialogue will likewise be able to envelop the listener in a 3-D world of sound. MPEG-4 also allows the creation of "virtual karaoke" songs, where the music actually slows down and speeds up to follow the singer -- a technology pioneered at the Media Lab.

Structured Audio will also have an impact on the music composition process itself. Composers are free to create new "virtual synthesizers" at will, so their creativity is no longer limited by the capabilities of the fixed hardware synthesizers they own.

A composer's PC system incorporating MPEG-4 Structured Audio technology can replace an entire studio of synthesizers, effects processors, and mixing consoles. The standard unifies a growing marketplace in "software synthesizers," which overcome some of these limitations but until now have been hampered by restricted features, data incompatibility, and a small user base.

The Structured Audio method, developed by researchers in the Machine Listening Group, comprises more than 20 percent of the MPEG-4 Audio standard. This submission, which includes software, technical documentation and testing methods, was evaluated and verified by MPEG and found to meet the requirements of the standards body.

The Media Lab's Structured Audio method is designed to integrate seamlessly with the other components of MPEG-4. These include methods for the transmission of speech, recorded music, computer graphics and compressed digital video. All these tools may be combined in a single MPEG-4 presentation.

The Media Lab has executed its current standardization work in an open arena, free of patent and copyright restrictions, to encourage advances in multimedia for all computer users and technology companies. All the computer tools developed by the Media Lab in the Structured Audio project have been freely donated to the Internet, and the Media Lab maintains no control or "veto power" over the direction of the standard.

Support for this research was provided by the Digital Life Consortium of the Media Lab. Additional information about MPEG-4 Structured Audio is available on the web.

A version of this article appeared in MIT Tech Talk on April 15, 1998.
