Vivian Voss

Technical Beauty: FFmpeg


Technical Beauty ■ Episode 12

In 2000, the multimedia landscape was a balkanised mess of proprietary formats, incompatible codecs, and expensive licensing agreements. Playing a video required three different players and a subscription to at least one of them. Converting between formats required a commercial tool that cost more than the hardware it ran on. The industry had created a problem and was charging admission to the workaround.

Fabrice Bellard looked at this arrangement and built FFmpeg. One tool. Every format. Twenty-five years later, it handles virtually every audio and video codec ever created: H.264, HEVC, VP9, AV1, ProRes, DNxHD, FLAC, AAC, Opus, more than a hundred in total. It powers YouTube, Netflix, VLC, Spotify, Chrome, and Firefox. It runs on the Mars Perseverance rover. It is, by any reasonable measure, the most important piece of multimedia software ever written. Everything else is a wrapper.

The Engineer

Bellard is not merely productive. He is anomalous. His project list reads like a computer science department with a single employee. He created QEMU (2003), the processor emulator that became the foundation for KVM and modern cloud virtualisation. He wrote TinyCC, a C compiler of roughly 100 KB, fast enough that his TCCBOOT demonstration compiles and boots a Linux kernel from source in about fifteen seconds. He built JSLinux, a complete Linux system running in the browser via JavaScript, years before WebAssembly made such things fashionable. He devised Bellard’s formula for computing digits of pi and then used a desktop PC to calculate 2.7 trillion of them, briefly holding the world record.

Each project follows the same pattern: identify a fundamental problem, build the smallest correct solution, then move on. Bellard does not maintain most of his projects long-term. He builds the foundation and leaves. Michael Niedermayer took over as FFmpeg's maintainer in 2004. The foundation, as it happens, has held.

The Architecture

FFmpeg is not a monolith despite its size. It is a set of libraries, each responsible for one concern:

  • libavcodec — the codec library, encoding and decoding more than a hundred formats
  • libavformat — container handling: MP4, MKV, AVI, WebM, and every other envelope that carries audio and video
  • libavfilter — the filter pipeline: scaling, cropping, colour correction, compositing
  • libswscale — pixel format conversion, the unglamorous but essential work of translating between colour spaces

[Figure: The Universal Translator. Inputs (AVI, MKV, MOV, WebM, FLV, TS, 100+ formats) pass through libavformat (demux), libavcodec (decode), libavfilter (process), libavcodec (encode), and libavformat (mux) to produce MP4 or any other format.]

The design is clean despite the scale. Each library has a defined responsibility. Codecs do not know about containers. Containers do not know about filters. The pipeline is composable: demux, decode, filter, encode, mux. Any format in, any format out. The complexity of a hundred codecs and dozens of container formats is managed through abstraction, not through configuration.
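The five stages can be sketched as a toy model. This is an illustration of the composition only, not FFmpeg's API: the `Packet` and `Frame` types, the "|" container, the string-reversal "codec", and every function name here are hypothetical stand-ins invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Sequence


@dataclass(frozen=True)
class Packet:
    """Compressed data as stored in a container (toy stand-in)."""
    payload: str


@dataclass(frozen=True)
class Frame:
    """Uncompressed data that filters operate on (toy stand-in)."""
    samples: str


# A filter maps a stream of frames to a stream of frames.
Filter = Callable[[Iterable[Frame]], Iterator[Frame]]


def demux(container: str) -> Iterator[Packet]:
    # Toy container: packets are separated by "|".
    for chunk in container.split("|"):
        yield Packet(chunk)


def decode(packets: Iterable[Packet]) -> Iterator[Frame]:
    # Toy codec: "decompression" is string reversal.
    for packet in packets:
        yield Frame(packet.payload[::-1])


def uppercase(frames: Iterable[Frame]) -> Iterator[Frame]:
    # Toy filter: transforms frames without knowing where they came from.
    for frame in frames:
        yield Frame(frame.samples.upper())


def encode(frames: Iterable[Frame]) -> Iterator[Packet]:
    # Toy codec: "compression" is string reversal again.
    for frame in frames:
        yield Packet(frame.samples[::-1])


def mux(packets: Iterable[Packet]) -> str:
    # Toy container: join packets back together with "|".
    return "|".join(packet.payload for packet in packets)


def transcode(container: str, filters: Sequence[Filter] = ()) -> str:
    """Run demux -> decode -> filters -> encode -> mux."""
    frames = decode(demux(container))
    for apply_filter in filters:
        frames = apply_filter(frames)
    return mux(encode(frames))


print(transcode("cba|fed"))
print(transcode("cba|fed", filters=[uppercase]))
```

The point of the sketch is the boundaries: `demux` and `mux` only ever see packets, the filter only ever sees frames, and `transcode` wires them together without inspecting either.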

One command converts anything to anything:

ffmpeg -i input.avi output.mp4

That is the complete invocation. FFmpeg infers the input format from the file. It infers the output format from the extension. It selects sensible defaults for bitrate, sample rate, and encoding parameters. No configuration file. No daemon. No account. No vendor. The man page is 7,000 lines. Most users need three flags.
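For scripted workflows the same invocation is usually assembled as an argument list. The sketch below is one way to do that, under assumptions: the source does not say which three flags it has in mind, so the choice of `-c:v`, `-crf`, and `-c:a` is this example's, and `build_ffmpeg_command` is a hypothetical helper. Whether the `libx264` encoder is available depends on how a given FFmpeg build was configured.

```python
import shlex
from typing import List, Optional


def build_ffmpeg_command(
    src: str,
    dst: str,
    video_codec: Optional[str] = None,
    crf: Optional[int] = None,
    audio_codec: Optional[str] = None,
) -> List[str]:
    """Build an ffmpeg argument list; omitted options fall back to ffmpeg's defaults."""
    cmd = ["ffmpeg", "-i", src]
    if video_codec is not None:
        cmd += ["-c:v", video_codec]   # choose the video encoder
    if crf is not None:
        cmd += ["-crf", str(crf)]      # quality target for encoders such as libx264
    if audio_codec is not None:
        cmd += ["-c:a", audio_codec]   # choose the audio encoder
    cmd.append(dst)                    # output format is inferred from the extension
    return cmd


minimal = build_ffmpeg_command("input.avi", "output.mp4")
tuned = build_ffmpeg_command(
    "input.avi", "output.mp4", video_codec="libx264", crf=23, audio_codec="aac"
)

# Print rather than execute, so the sketch runs without ffmpeg installed.
print(shlex.join(minimal))
print(shlex.join(tuned))
```

To run a command for real, pass the list to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell quoting problems with filenames that contain spaces.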

The Numbers

1.5 million lines of C. Twenty-five years in production. The Coverity Scan defect density is thirty times lower than the open-source average for projects of this size. That number deserves a moment of silence: a codebase of 1.5 million lines handles over a hundred codec specifications, each with its own edge cases, bit-packing conventions, and patent histories, and still maintains defect rates that most teams cannot achieve at one per cent of the scale.

It runs on the Mars Perseverance rover. When NASA needed to process video on another planet, they did not build a custom multimedia stack. They used FFmpeg. The industry’s universal translator works on two planets and counting.

The Contrast

Adobe Media Encoder wants a monthly fee. HandBrake wraps FFmpeg. VLC wraps FFmpeg. YouTube’s transcoding pipeline wraps FFmpeg. Netflix’s encoding infrastructure wraps FFmpeg. Spotify wraps FFmpeg. Chrome and Firefox ship FFmpeg’s decoders. The industry built graphical interfaces, cloud services, and subscription products around a command-line tool that a French mathematician wrote because he found the existing solutions architecturally insufficient.

The economics are stark. Adobe Creative Cloud costs $660 per year. FFmpeg costs nothing. Adobe supports a dozen formats well. FFmpeg supports a hundred formats correctly. The commercial product is a subset of the free one, sold at a premium, with a graphical interface that most professional workflows bypass in favour of scripted FFmpeg commands anyway.

The Reduction

What makes FFmpeg technically beautiful is not the exhaustive format support, though that is remarkable. It is the architectural decision to treat every codec and every container as an instance of the same abstraction. A codec is a thing that encodes and decodes. A container is a thing that wraps and unwraps. A filter is a thing that transforms. The pipeline connects these abstractions without any of them knowing about the others.

Adding a new codec does not require changing the pipeline. Adding a new container does not require changing the codecs. The architecture scales to a hundred formats because it was designed for one, and the design happened to be correct.
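The claim that a new codec leaves the pipeline untouched can be illustrated with a small registry. This is a toy under stated assumptions, not FFmpeg's real C interface, which is considerably richer: the `Codec` record, the `CODECS` table, and the three string "codecs" are all hypothetical.

```python
import codecs as std_codecs
from typing import Callable, Dict, NamedTuple


class Codec(NamedTuple):
    """The shared abstraction: a name plus an encode/decode pair."""
    name: str
    encode: Callable[[str], str]
    decode: Callable[[str], str]


# Registry of known codecs, looked up by name at run time.
CODECS: Dict[str, Codec] = {}


def register(codec: Codec) -> None:
    CODECS[codec.name] = codec


def transcode(data: str, src: str, dst: str) -> str:
    """Decode with the source codec, then encode with the destination codec."""
    return CODECS[dst].encode(CODECS[src].decode(data))


# Two codecs the pipeline starts out with.
register(Codec("reverse", lambda s: s[::-1], lambda s: s[::-1]))
register(Codec(
    "hex",
    lambda s: s.encode("utf-8").hex(),
    lambda s: bytes.fromhex(s).decode("utf-8"),
))

# Adding a third codec later is a single registration; transcode() is unchanged.
register(Codec(
    "rot13",
    lambda s: std_codecs.encode(s, "rot_13"),
    lambda s: std_codecs.decode(s, "rot_13"),
))

print(transcode("olleh", "reverse", "hex"))
print(transcode("68656c6c6f", "hex", "rot13"))
```

The `transcode` function never names a specific codec. Supporting one more is a matter of adding an entry to the table, which is the property the paragraph above describes.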

Bellard built the foundation in 2000 and moved on. Niedermayer and the community have maintained it since 2004. The foundation has not changed because it does not need to. The universal translator remains universal, and everything else remains a wrapper.

One tool. Every format. 1.5 million lines of C with defect rates thirty times below average. Powers two planets. No subscription. FFmpeg does not compete with multimedia tools. It is the multimedia tool.