Vivian Voss

The Dependency Avalanche


Beta Stories ■ Episode 09

The promise of the modern package ecosystem was a kind one: you do not have to write everything yourself. Stand on the shoulders of giants. Reuse, do not reinvent. Someone, somewhere, maintains it.

The reality, measured this morning on a fresh laptop, is the avalanche this episode is named for.

The Reality, in Two Numbers

Running npm install express in a clean directory pulls Express 5.2.1, which declares 28 direct dependencies, resolves a total of 65 packages across the dependency tree, and produces a node_modules of 3.6 MB. That is the smaller end of the modern web.

npx create-next-app@latest with the recommended defaults (TypeScript, Tailwind CSS, ESLint, App Router) creates a project whose package.json declares 11 packages, resolves a transitive tree of 644 packages, and writes a node_modules of 463 MB. The application, at this stage, renders a single page that says "Welcome to Next.js".

Packages Resolved for a Blank HTTP Responder

Next.js (defaults): 644 packages, 463 MB on disk
Express: 65 packages, 3.6 MB on disk
Rust hyper / axum: ~5 packages, explicit Cargo.toml, signed off in writing
Go net/http: 0 packages, stdlib, 7 MB static binary

Six hundred and forty-four pieces of someone else's work to render eighteen characters of text. Each package authored, in principle, by a stranger; in practice, sometimes by a small group of strangers; in a handful of unpleasant cases, by an account that has been waiting to be useful.
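The two totals can be re-derived from a lockfile rather than taken on trust. What follows is a minimal sketch, assuming an npm lockfile in the version 2 or 3 layout (a top-level "packages" object keyed by install path, with the empty key "" standing for the project itself). The names load_lock and count_packages are hypothetical helpers for illustration, not part of npm.

```python
import json
from pathlib import Path


def load_lock(path: str) -> dict:
    """Read a package-lock.json file into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def count_packages(lock: dict) -> tuple[int, int]:
    """Return (declared, resolved) package counts for an npm lockfile.

    Assumption: lockfileVersion 2 or 3, where "packages" maps install
    paths to entries and the key "" describes the project root.
    """
    packages = lock.get("packages", {})
    root = packages.get("", {})
    # Declared: what the project's own package.json asks for.
    declared = len(root.get("dependencies", {})) + len(root.get("devDependencies", {}))
    # Resolved: every installed entry except the root and workspace symlinks.
    resolved = sum(
        1
        for path, entry in packages.items()
        if path != "" and not entry.get("link", False)
    )
    return declared, resolved
```

Calling count_packages(load_lock("package-lock.json")) in a project directory gives the pair to compare against package.json. The gap between the two numbers is the part of the tree nobody on the team chose by name.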

For comparison: the same minimal HTTP responder in Rust uses hyper (the lower-level HTTP foundation) or axum on top of it, and keeps its dependency tree small and explicit in Cargo.toml, where every crate is signed off in writing before it lands. The Go equivalent uses the standard library's net/http, pulls zero external dependencies, and builds, statically linked, to a single 7 MB binary. FreeBSD ships a base userland audited as one source tree, plus a ports collection where every port has a named maintainer and a fully declared dependency graph. All three have existed for years. None of them asks the team to import a stranger's recursion.

The Mechanism

The phrase that does the damage, repeated almost reflexively, is "we do not maintain it; the upstream maintainer does".

Which means, when one reads it carefully: nobody on the team has read the code. Nobody has reviewed the commit history. Nobody has audited the build script. Nobody has checked who has commit access. Nobody has asked what the package was last week, or what it will be next week. The entire audit duty has been outsourced, in writing, to a name on an npm registry page.

A 644-package install is 644 outsourced audits. The team that ships it has, in good faith, agreed that some other unnamed group will do the reading on their behalf. The other unnamed group, when one goes to look, is mostly volunteers who themselves have not been paid to read the code below them either. The audit duty has been outsourced so many times that nobody, by the end of the chain, is holding it.

This is the Beta Stories mechanism: software gets worse not by mistakes, but by accumulation. The decay is in the count. Every package added is a piece of code that the team has decided not to read.

The Audit That (Barely) Happened

In January 2021, an account named "Jia Tan" was created on GitHub. It submitted small, useful patches to xz-utils, the compression library that ships in essentially every major Linux distribution and underlies a great many compressed-archive operations across the Unix world. The patches were good. The maintainer at the time, Lasse Collin, accepted them, as one accepts good patches.

By the summer of 2022, three accounts (Jia Tan, Dennis Ens, and Jigar Kumar) were active in the xz-utils mailing lists and issue tracker, applying coordinated pressure on Lasse Collin to add an additional maintainer with commit rights. Lasse, by his own admission, was burnt out, unpaid, and had been carrying the project alone for years. Jia Tan was eventually granted those rights. The timeline compiled by Russ Cox of Google reads, in retrospect, like a textbook.

Two and a Half Years of Patience

Jan 2021: Jia Tan account created
Oct 2021: first patches to xz-utils
Summer 2022: sock-puppet pressure; commit rights granted
Feb-Mar 2024: backdoor in 5.6.0 and 5.6.1
29 Mar 2024: Andres Freund discloses

CVSS 10.0. Backdoor reached Debian unstable, Fedora rawhide, openSUSE Tumbleweed, Ubuntu testing, Kali rolling. Days, in some cases hours, away from stable releases on hundreds of millions of servers.

In February and March 2024, Jia Tan committed the backdoor. The payload was inserted into xz-utils 5.6.0 (released 24 February) and 5.6.1 (released 9 March), hidden inside binary test fixtures and activated by a build-system macro that shipped in the release tarballs but not in the git repository, so the repository looked pristine to a casual reader. The build, when invoked in the particular way that distribution maintainers happened to invoke it, linked malicious object code into liblzma. liblzma was, in turn, loaded into sshd on systemd-based distributions, because those distributions patch OpenSSH to link libsystemd for service notification, and libsystemd depends on liblzma. The result was a remote-code-execution backdoor in OpenSSH, triggerable by any client holding the attacker's private key. CVSS 10.0. One does not, very often, get to say CVSS 10.0 with a straight face.

The Build Chain to sshd: test fixtures in xz-utils 5.6.0 -> autotools build trigger -> liblzma (malicious object) -> systemd-notify links liblzma into sshd -> OpenSSH (sshd): RCE for any client with the key
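Public analyses of the incident describe part of the build trigger as present in the release tarballs and absent from the git repository. One modest check that follows from this is to compare a release tarball against a checkout of the same tag and list whatever does not match. The sketch below assumes a tarball with a single top-level directory and a local checkout on disk. The function name tarball_vs_repo and its parameters are hypothetical. Release tarballs legitimately carry generated files (a configure script, for instance), so the output is a reading list for a human, not a verdict.

```python
import hashlib
import tarfile
from pathlib import Path


def tarball_vs_repo(tar_path: str, repo_dir: str, strip: int = 1) -> list[str]:
    """List tarball files that are missing from, or differ from, a checkout.

    strip: number of leading path components to drop from archive names
    (assumption: everything sits under one top-level directory).
    """
    repo = Path(repo_dir)
    flagged = []
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            parts = Path(member.name).parts[strip:]
            if not parts:
                continue
            rel = Path(*parts)
            extracted = tar.extractfile(member)
            if extracted is None:
                continue
            tar_digest = hashlib.sha256(extracted.read()).hexdigest()
            candidate = repo / rel
            if not candidate.is_file():
                flagged.append(f"only in tarball: {rel.as_posix()}")
            elif hashlib.sha256(candidate.read_bytes()).hexdigest() != tar_digest:
                flagged.append(f"differs from repo: {rel.as_posix()}")
    return sorted(flagged)
```

Every line it prints is a file that reached the build without passing through the commit history anyone might have reviewed.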

By 29 March 2024, the backdoored versions had already reached Debian unstable, Fedora Rawhide and the Fedora 40 beta, openSUSE Tumbleweed, Ubuntu testing, and Kali Linux. They were days, in some cases hours, away from rolling into stable releases that would have been on hundreds of millions of servers.

In the weeks before the disclosure, Andres Freund, a PostgreSQL developer at Microsoft, noticed that SSH logins on a Debian unstable system were taking approximately 500 milliseconds longer than usual. He had recently been benchmarking Postgres builds and was attentive to small latency drifts. He ran the login under valgrind, found memory-access errors pointing at liblzma, traced the chain back through the autotools build, and found the obfuscated payload in the test fixtures. He notified distribution security teams privately on 28 March 2024 and posted the public disclosure to the oss-security mailing list on the 29th.

The internet was saved by one engineer noticing half a second of latency he did not expect.

The Signal

Two and a half years of patient social engineering against an unpaid maintainer. A chain of distribution maintainers signing off on routine version bumps. CI pipelines, test suites, code-review tools, all returning green. SBOM generators producing clean reports. The audit duty had been outsourced so many times that nobody, by the end of the chain, was holding it.

The signal one watches for, after XZ, is not "is there malicious code in your dependencies". That signal is impossible to read directly. The signal is who is doing the reading. If the answer is "the upstream maintainer", ask: who is the maintainer, what is their funding, what is their burnout level, who has commit access, what has changed in the past six months. If those questions cannot be answered for a package one ships to production, one is in the avalanche zone.
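The questions above can be written down as a record per shipped package, which at least makes the unanswered ones visible. This is a sketch under stated assumptions: MaintenancePosture, open_questions and in_avalanche_zone are hypothetical names, the fields are filled in by a person, and "six months" is approximated as 183 days.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional


@dataclass
class MaintenancePosture:
    """What the team actually knows about one package it ships.

    None or an empty list means nobody on the team can answer.
    """
    name: str
    maintainers: list[str] = field(default_factory=list)
    funding: Optional[str] = None
    burnout: Optional[str] = None
    commit_access: list[str] = field(default_factory=list)
    changes_last_read: Optional[date] = None  # when someone last read upstream changes


def open_questions(
    p: MaintenancePosture,
    today: date,
    window: timedelta = timedelta(days=183),
) -> list[str]:
    """Return the questions from the text that remain unanswered."""
    gaps = []
    if not p.maintainers:
        gaps.append("who is the maintainer")
    if p.funding is None:
        gaps.append("what is their funding")
    if p.burnout is None:
        gaps.append("what is their burnout level")
    if not p.commit_access:
        gaps.append("who has commit access")
    if p.changes_last_read is None or today - p.changes_last_read > window:
        gaps.append("what has changed in the past six months")
    return gaps


def in_avalanche_zone(p: MaintenancePosture, today: date) -> bool:
    """True if any question cannot be answered for this package."""
    return bool(open_questions(p, today))
```

The code does no detection of its own. Its only use is to turn "the upstream maintainer handles it" into a list of blanks that somebody has to fill in or knowingly leave empty.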

The Boring Counter-Move

The counter to the dependency avalanche is not "audit every package", which is plainly impossible at 644 packages. It is fewer packages. Each one read, each one chosen, each one with a known maintenance posture.
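"Each one read, each one chosen" can be enforced mechanically once the tree is small enough. The sketch below assumes an npm lockfile in the version 2 or 3 layout (a "packages" object keyed by install path) and a team-maintained set of "name@version" strings recording what somebody has actually read. The function name unreviewed_packages and the format of the reviewed set are illustrative assumptions, not an npm feature.

```python
def unreviewed_packages(lock: dict, reviewed: set[str]) -> list[str]:
    """Return "name@version" for every locked package nobody has signed off.

    Assumption: install paths look like "node_modules/a" or
    "node_modules/a/node_modules/@scope/b"; the package name is taken
    from the text after the last "node_modules/".
    """
    found = set()
    for path, entry in lock.get("packages", {}).items():
        if path == "" or entry.get("link", False):
            continue  # skip the project root and workspace symlinks
        name = path.rsplit("node_modules/", 1)[-1]
        version = entry.get("version", "unknown")
        found.add(f"{name}@{version}")
    return sorted(found - reviewed)
```

Keying the review on the exact version matters: a sign-off on last week's release says nothing about next week's. A build that fails whenever this list is non-empty is tolerable at 5 packages and unworkable at 644, which is rather the point.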

The Go standard library is one approach: a curated, audited, batteries-included foundation that removes 80 per cent of the reasons one would reach for a third-party package in the first place. Rust takes the opposite end of the same idea: a deliberately small standard library plus a Cargo.toml that any reviewer can read in a single sitting, with every dependency declared in writing and every transitive crate visible in Cargo.lock. The FreeBSD base system is a third: a coherent userland built and audited as one project, where what ships in /usr/sbin/sshd is the same source tree as what ships in /usr/bin/find and /usr/bin/awk, maintained by the same project, released together. The Python standard library, before the wheel-and-PyPI culture took over, was something similar. So was the venerable Unix toolkit.
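To make the standard-library point concrete, here is a blank HTTP responder with no third-party imports. It uses Python's http.server as a stand-in for the Go net/http case, since the shape of the idea is the same; the handler name Hello, the helper make_server, and the response text are illustrative choices. Python's own documentation describes http.server as unsuitable for production, so this is an analogy for the dependency count, not a deployment recommendation.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Hello(BaseHTTPRequestHandler):
    """Answer every GET with one fixed line of plain text."""

    def do_GET(self):
        body = b"Hello from the standard library\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress per-request logging to keep the sketch quiet.
        pass


def make_server(host: str = "127.0.0.1", port: int = 8080) -> ThreadingHTTPServer:
    """Build the server without starting it; port 0 picks a free port."""
    return ThreadingHTTPServer((host, port), Hello)
```

Calling make_server().serve_forever() starts it on port 8080. The dependency tree to review is the language runtime and nothing else.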

None of this scales to "build any product, in any language, in any week". It scales to "build a sustainable system, with a known maintenance posture, over decades". The Beta Stories question is whether one has been paying for the first while quietly pretending it is the second.

The Closer

XZ is the post-mortem the industry will keep referencing because it nearly worked. The patient, careful, well-funded version of the same attack will work, and one will not notice in time. The reason is not that the security tools have failed. The reason is that nobody is reading the code, and nobody has been for some time.

Next time it might not be 500 milliseconds. Next time it might be 50. Next time it might be no perceptible drift at all, until the breach notification arrives in the inbox.

Express: 65 packages, 3.6 MB. Next.js (defaults): 644 packages, 463 MB. Rust hyper/axum: a small explicit Cargo.toml. Go net/http: zero deps, 7 MB static binary. FreeBSD base: zero third-party. XZ CVE-2024-3094: 2.5 years of social engineering, CVSS 10.0, caught by 500 ms of SSH latency one engineer did not expect.