Where to start learning about audio or video codecs?

Vamsi · Mar 26, 2010 · Viewed 13.7k times

I am quite confused about what happens inside codecs. I want to learn about the elements inside audio encoders and decoders. I would be very happy if you could point me to some links with good study material.

Specifically, I would like to know how a codec parses a media file.

Answer

Richard Berg · Mar 30, 2010

Your title asks about A/V compression, but the rest of your question talks about parsing the media file & identifying its codec. Those are very different tasks: spec'd & implemented by different organizations, performed by different APIs in most multimedia libraries, and above all requiring very different skill sets.

A/V file formats aren't too different from other file formats, which in turn are just formal grammars. Parsing, validation, and the resulting object graphs are conceptually no different from those of any other grammar -- and in practice, they tend to be far simpler than the grammars you encounter in a standard CS curriculum (compilers, finite automata). The AVI file format is kind of antiquated at this point, but I'd still recommend starting there because:

  • many of today's more complex formats resemble AVI in whole or in part, or at minimum assume you're familiar with its basic structures
  • AVI is a member of a larger family of multimedia formats known as RIFF, which you'll find used in many other places such as WAVs
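
To make the "just a grammar" point concrete, here is a minimal sketch of walking the top-level chunks of a RIFF file, using a WAV since it is the simplest member of the family. The function names (`parse_riff`, `make_sample_wav`) are made up for illustration, and the sketch assumes a well-formed file with no nested LIST chunks, which real AVI files do use. A RIFF file is a 12-byte header (`RIFF`, a little-endian size, a form type such as `WAVE` or `AVI `) followed by chunks, each a 4-byte ID, a 4-byte size, and a payload padded to an even length.

```python
import struct


def parse_riff(data):
    """Return (form_type, [(chunk_id, payload), ...]) for the top-level chunks."""
    # Header: 4-byte magic, little-endian uint32 size, 4-byte form type.
    magic, riff_size, form_type = struct.unpack_from("<4sI4s", data, 0)
    if magic != b"RIFF":
        raise ValueError("not a RIFF file")
    chunks = []
    offset = 12
    end = min(len(data), 8 + riff_size)
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        chunks.append((chunk_id, data[offset + 8 : offset + 8 + size]))
        # Payloads are padded to an even number of bytes.
        offset += 8 + size + (size & 1)
    return form_type, chunks


def make_sample_wav(num_samples=100):
    """Build a tiny mono 16-bit PCM WAV in memory (illustrative input only)."""
    pcm = b"\x00\x00" * num_samples
    # fmt payload: format tag (1 = PCM), channels, sample rate,
    # byte rate, block align, bits per sample.
    fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    body = (
        b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"data" + struct.pack("<I", len(pcm)) + pcm
    )
    return b"RIFF" + struct.pack("<I", len(body)) + body


form_type, chunks = parse_riff(make_sample_wav())
for chunk_id, payload in chunks:
    print(chunk_id, len(payload))

# The first field of the "fmt " chunk is the format tag; this is the
# "identifying the codec" step, and it involves no decoding at all.
format_tag = struct.unpack_from("<H", chunks[0][1])[0]
print("format tag:", format_tag)
```

Note that nothing here touches the audio samples themselves: the parser only finds the `data` chunk and reads a tag saying how it is encoded. Decoding that payload is the codec's job, which is the separate problem discussed next.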

Codecs, meanwhile, are some of the most complex algorithms you're likely to find among "consumer" software. They draw heavily on advancements in both the academic community and the R&D arms of large corporations (including their vast patent libraries). To be proficient in codecs you need to know at least the basics of:

If you already have a decent background (e.g., you've taken one or two undergraduate-level "math for engineers"-type classes) then I say dive right in. Many of the best A/V codecs are open source:

  • x264 (MPEG-4 part 10, aka AVC)
  • LAME (MPEG-1 layer 3, aka MP3)
  • Xvid (MPEG-4 part 2, the same standard implemented by DivX and many others)
  • Vorbis (alternative, patent-free audio codec)
  • Dirac (alternative, patent-free video codec based on a wavelet transform)