
MIT research pushes sound standard for personal computers, Internet to new level

CAMBRIDGE, Mass.--Breakthrough audio technology developed at the MIT Media Lab is a key part of the forthcoming MPEG-4 International Standard.

The Media Lab contributions will dramatically boost the performance levels of computer sound, allowing CD-quality stereo music to be played on PCs and transmitted through the average user's modem. As a result, richer, more carefully tailored music, sound, and audio effects can be incorporated into a new range of multimedia content.

MPEG, the Moving Picture Experts Group, is part of the International Organization for Standardization (ISO), and is chartered with the development of industry standards for the compression, processing, coding and transmission of audio and video. These standards are used worldwide as a common blueprint for the design, development and manufacturing of audio software and hardware components.

The MPEG-4 standard will be released in October 1998 and formally become an international standard in December 1998. Last month, the Final Committee Draft was completed. This milestone indicates that all parts of the specification, including the Media Lab's contributions, will proceed into the final standard. The current draft standard will change little before completion.

"The contributions the Media Lab has made to MPEG-4 are a crucial part of the audio tool set, and represent a fundamental advance in audio standardization," said Leonardo Chiariglione, MPEG convener and chairman.


The Media Lab's new approach to sound processing, called "Structured Audio," represents the first time that sound synthesis methods have been incorporated into an international standard.

Structured Audio is a powerful set of specifications for the description and transmission of sound. While existing audio standards represent sound as a stream of bits encoding the waveform, Structured Audio stores and delivers content as a computer program in a flexible language, which is then translated into sound on the user's computer. Because a program that describes how to generate a sound is typically far more compact than the bits of the sound itself, this method enables a radical increase in the quality and efficiency with which sound is delivered.
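As a rough illustration of why a description of sound can be much smaller than the sound itself, the sketch below compares the size of a short note list with the size of the audio rendered from it. This is not the MPEG-4 Structured Audio language or decoder; it is plain Python, and the `SCORE` format, the `render` function, and the sine-tone synthesis are all assumptions made for the example.

```python
import json
import math
import struct

SAMPLE_RATE = 44100  # CD-quality sampling rate, in Hz

# Hypothetical "structured" description: (frequency in Hz, duration in seconds).
SCORE = [(440.0, 0.5), (494.0, 0.5), (523.0, 1.0)]


def render(score, sample_rate=SAMPLE_RATE):
    """Synthesize the score into 16-bit mono PCM bytes using plain sine tones."""
    frames = bytearray()
    for freq, duration in score:
        n_samples = int(duration * sample_rate)
        for i in range(n_samples):
            # Half-amplitude sine wave, scaled to the signed 16-bit range.
            sample = 0.5 * math.sin(2 * math.pi * freq * i / sample_rate)
            frames += struct.pack("<h", int(sample * 32767))
    return bytes(frames)


# What would be transmitted: the compact description.
description = json.dumps(SCORE).encode("utf-8")

# What the listener's computer produces from it: the full waveform.
pcm = render(SCORE)

print(len(description), "bytes of description")
print(len(pcm), "bytes of rendered PCM audio")
```

Here two seconds of mono audio occupy 176,400 bytes as samples, while the note list that generates them fits in a few dozen bytes. The gap in a real system depends on how complex the synthesis program is and on how well the waveform could otherwise be compressed.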

"Structured Audio points the way to a more powerful common platform for sound processing," said Professor Barry Vercoe, head of the Media Lab's Machine Listening group and leader of the Structured Audio research project. "By incorporating these findings into an accepted international standard, we can ensure that musicians, producers, and PC users around the world can benefit from this research."

Until now, sound transmission standards have left content developers and musicians struggling with low-quality "AM radio" sound, limited interactivity, and long download times. Similarly, designers of CD-ROM, game, and multimedia content have been hamstrung by the low-quality, low-functionality sound cards on the customer's desktop. The MIT technology greatly improves the sound quality of multimedia applications, enabling musicians, virtual-reality designers, and game and content developers to create high-quality, interactive synthetic music and sound environments that can be easily transmitted across the Internet.


The performance levels achieved through the MPEG-4 Structured Audio method enable significant new composition and commerce models. Composers of popular music styles such as house music, rave music, techno, and electronica will be able to efficiently sell high-quality compositions directly to listeners via the Internet.

Interactive movies and virtual-reality experiences containing music, sound effects, and dialogue will likewise be able to envelop the listener in a 3-D world of sound. MPEG-4 also allows the creation of "virtual karaoke" songs, in which the music actually slows down and speeds up to follow the singer -- a technology pioneered at the Media Lab.

Structured Audio will also have an impact on the music composition process itself. Composers are free to create new "virtual synthesizers" at will, so their creativity is no longer limited by the capabilities of the fixed hardware synthesizers they own. A composer's PC system incorporating MPEG-4 Structured Audio technology can replace an entire studio of synthesizers, effects processors, and mixing consoles. The standard unifies a growing marketplace in "software synthesizers" which overcome some of these limitations, but until now have been hampered by restricted features, data incompatibility, and a small user base.

The Structured Audio method, developed by researchers in the Media Lab's Machine Listening Group, comprises more than 20 percent of the MPEG-4 Audio standard. This submission, which includes software, technical documentation, and testing methods, was evaluated and verified by MPEG, and found to meet the requirements of the standards body.

The Media Lab's Structured Audio method is designed to integrate seamlessly with the other components of MPEG-4. These include methods for the transmission of speech, recorded music, computer graphics, and compressed digital video. All of these tools may be combined in a single MPEG-4 presentation.

The Media Lab has executed its current standardization work in an open arena, free of patent and copyright restrictions, in order to encourage advances in multimedia for all computer users and technology companies. All of the computer tools developed by the Media Lab in the Structured Audio project have been freely donated to the Internet, and the Media Lab maintains no control or "veto power" over the direction of the standard.

Support for this research was provided by the Digital Life Consortium of the MIT Media Lab. Additional information about MPEG-4 Structured Audio is available on the World Wide Web at
