Vivian Voss

The Compute-Is-Cheap Decade

performance · architecture · cloud

Beta Stories ■ Episode 08

In 1974, Donald Knuth wrote a sentence that would later be deployed to justify rather a lot of mediocre software:

"Premature optimisation is the root of all evil."

A splendid line. Short, memorable, and almost invariably quoted without the qualifier that precedes it and the one that follows. The full sentence, for those with the constitution to read it, runs rather differently:

"We should forget about small efficiencies, say about 97% of the time: premature optimisation is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

What Knuth Actually Wrote
The 97% — kept. "Premature optimisation is the root of all evil." Universally quoted, weaponised.
The 3% — forgotten. "We should not pass up our opportunities in that critical 3%." The clause that justified the work.

The 3% disappeared. Only the 97% survived. An industry built its habits on the surviving half. Combined with Moore's Law on the hardware side and an invisible cloud bill on the procurement side, a generation of developers grew up assuming computational waste was a solved problem. Hardware, after all, is cheap.

The Decay

Median desktop web page weight in 2010 was roughly 500 KB; in 2025 it is 2.9 MB. A roughly 5.8x inflation for, very often, the same paragraph of text and a photograph. One does note that the paragraph has not grown.

Slack, originally built to send short messages, routinely consumes well above 1 GB of RAM on a modern desktop. Discord took the same lesson in the opposite direction. Until 2020, their Go-based Read States service suffered from garbage-collector spikes that fleets of boxes could barely contain. They rewrote it in Rust. Memory dropped by 40 per cent. Latency improved 6.5x in the best case and 160x in the worst. Same product. Less hardware. The difference was not magic. It was the willingness to look at what was running.

The 3% Is Still Out There
Discord, 2020 — Go → Rust on the Read States service: memory down 40%, worst-case latency 160x better. Same product.
Twitter, 2022 — Sacramento DC exit: 148,000 servers decommissioned, cloud spend down 60-75%. The site still loaded.
WhatsApp, 2015 — FreeBSD + Erlang: 900M users, 50 engineers (1:18M). The acquisition ended it.

Set politics aside: in November 2022, Elon Musk dismissed roughly 80 per cent of Twitter's staff and demanded $1 billion in annual infrastructure cuts. Twitter exited its Sacramento data centre, decommissioning 5,200 racks and 148,000 servers. Cloud spend dropped 60 to 75 per cent. The Scala recommendation engine was rewritten in Rust, and the resulting Home Mixer ran roughly 10x faster than its predecessor. The site continued to function. Faster than before, in fact. One does pause at this.

WhatsApp, for context, served 900 million users in 2015 with fifty engineers. A ratio of one engineer per eighteen million users. The stack was FreeBSD with Erlang, ejabberd, the BEAM virtual machine, and Mnesia. The ratio was deliberate. Most companies do not aim for it; most companies do not consider that they could.

Then Facebook bought it. The stack moved from FreeBSD to Linux, not because Linux was technically preferable, but because it fit the new owner's containerised operations. By 2024, the fifty engineers had become roughly two and a half thousand. The architecture survived its original engineers. It did not survive the acquisition.

The Mechanism

The cloud abstracts cost away from the developer. An additional instance is a single API call; the bill arrives at someone else's desk, usually accompanied by an emailed PDF and a quarter-end conversation with procurement. By 2010, the mechanism was complete: Moore's Law on the hardware side, AWS on the procurement side, and "premature optimisation" on the cultural side. Each reinforced the others.

Every generation since has trained on frameworks, never the layers underneath. React, Next.js, an npm tree of 1,400 packages, and a build script that needs four gigabytes of memory to produce a landing page. The runtime cost is invisible until the cloud bill becomes visible, at which point the architecture is too large to change and the person who could have asked the question has been promoted out of asking it.

Bootcamps, in particular, optimise for time-to-employability. That means teaching the framework, not the model underneath it. The student ships, the company hires, and the question of what happens at runtime is deferred indefinitely. By the time the engineer is senior enough to ask it, the system is large enough that nobody else wants to.

The Signal

Your build script needs 4 GB of RAM. Your editor consumes 800 MB to display text. The architecture review concludes "we'll scale horizontally". A senior engineer suggests the data structure could be smaller, and is politely informed that this is premature optimisation.

The single most reliable signal is the question that never gets asked: how much does this cost to run? Not in development hours, not in team size, but in watts, in memory, in the laptop fan currently audible under your fingertips.
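The question is answerable with nothing more than arithmetic. A minimal sketch of the back-of-envelope version, in which every number — the fleet size, the 300 W average draw, the electricity price, the overhead factor — is an illustrative assumption rather than a measurement:

```python
# Back-of-envelope: what a service costs to run, in watts rather than headcount.
# Every figure here is an illustrative assumption, not a measurement.

def annual_cost(avg_watts: float, price_per_kwh: float = 0.15, pue: float = 1.5) -> float:
    """Annual electricity cost of a workload drawing avg_watts continuously.

    pue is the data-centre overhead factor (cooling, power conversion);
    1.5 is an assumed middling value, not a universal constant.
    """
    hours = 24 * 365
    kwh = avg_watts / 1000 * hours * pue
    return kwh * price_per_kwh

# A hypothetical fleet of 200 servers at an assumed 300 W average draw:
fleet_watts = 200 * 300
print(f"~${annual_cost(fleet_watts):,.0f} per year in electricity")
```

The point is not the exact figure; it is that the calculation takes five lines and is almost never performed.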

The Bill

Globally, data centres consumed 415 TWh of electricity in 2024, approximately 1.5 per cent of all electricity produced on the planet. In 2025 the figure rose 17 per cent year on year, against 3 per cent growth in general electricity demand. The IEA projects 945 TWh by 2030: roughly twice today's figure, and four times the growth rate of all other sectors combined. AI is responsible for 5 to 15 per cent of data centre power today, projected to reach 35 to 50 per cent by 2030.
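A quick sanity check of the arithmetic, using only the figures cited above (415 TWh in 2024, 945 TWh projected for 2030):

```python
# Implied compound annual growth rate from the cited figures:
# 415 TWh in 2024 rising to a projected 945 TWh in 2030.
start, end, years = 415.0, 945.0, 6
cagr = (end / start) ** (1 / years) - 1
print(f"Implied growth: {cagr:.1%} per year")  # roughly 15% per year
```

Roughly 15 per cent compounded, every year, for six years — against the 3 per cent the rest of the grid manages.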

Global Data Centre Electricity — TWh per Year
2024 — 415 TWh (1.5% of global supply)
2025 — ~486 TWh (+17% YoY, vs 3% for electricity demand in general)
2030 — 945 TWh projected
AI share: 5-15% today → 35-50% by 2030. Growth rate: 4x every other electricity sector combined.

These numbers are not abstract. They translate into transformer stations, transmission lines, cooling water, and the carbon budget of countries that made political commitments years ago and now find themselves in an arithmetic problem.

Hardware is not free. It was simply cheap enough that nobody bothered to look at the bill. The bill, as it turns out, is paid by the grid, the cooling, the climate, and the laptop fan currently spinning under your fingertips.

Knuth's full quote was 97%/3%. Only the 97% survived. Discord rewrote a Go service in Rust: memory down 40%, worst-case latency 160x better. Twitter decommissioned 148,000 servers and continued to function. WhatsApp in 2015: fifty engineers for 900 million users. Global data centres: 415 TWh in 2024, 945 TWh projected by 2030. The bill was always there.