The job description is admirably brief: a request arrives, HTML leaves. One job. One process ought to do. And yet, over a quarter of a century of accretion, the standard PHP deployment has arrived at something rather more elaborate.
The Seven Layers
Consider what the typical PHP production environment actually requires before it can perform this single, modest task:
Nginx: because PHP cannot serve HTTP on its own. A language designed for the web that requires a separate programme to participate in it. One might call this an architectural decision. One might also call it an admission.
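The hand-off is visible in even the most minimal configuration. A sketch of the proxy layer, assuming PHP-FPM listens on a conventional Unix socket (the socket path and PHP version vary by distribution):

```nginx
server {
    listen 80;
    root /var/www/public;

    location / {
        try_files $uri /index.php?$query_string;
    }

    location ~ \.php$ {
        include fastcgi_params;
        # Hand the request off to an entirely separate process
        fastcgi_pass unix:/run/php/php8.2-fpm.sock;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    }
}
```

Every request crosses that socket boundary twice before a byte of HTML is produced.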
PHP-FPM: because the language needs an external process manager to handle concurrent requests. The FastCGI Process Manager spawns worker pools, manages lifetimes, and consumes 300-600 MB of memory merely to supervise a language that cannot supervise itself.
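The supervision itself must be configured by hand. An illustrative pool definition (directives are real FPM settings; the values shown are typical, not prescriptive):

```ini
; /etc/php/8.2/fpm/pool.d/www.conf (path and values are illustrative)
[www]
pm = dynamic
pm.max_children = 20       ; hard cap on workers; each holds its own copy of the runtime
pm.start_servers = 4
pm.min_spare_servers = 2
pm.max_spare_servers = 6
pm.max_requests = 500      ; recycle workers periodically to contain memory leaks
```

Note the last directive: the process manager assumes its workers will leak, and plans accordingly.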
OPcache: because interpreting the same source files on every request proved, unsurprisingly, too slow. The solution: cache the compiled bytecode between requests, which is to say, approximate what a compiled language does by default. PHP 8's JIT compiler improves synthetic benchmarks by up to eleven times, but real web workloads see a rather more modest 1.8% improvement. The bottleneck, it turns out, was never the CPU.
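The approximation, too, must be tuned. A typical production configuration (real OPcache directives; the numbers are common defaults, adjusted per site):

```ini
; php.ini — illustrative OPcache settings
opcache.enable = 1
opcache.memory_consumption = 128    ; MB of shared memory reserved for cached bytecode
opcache.max_accelerated_files = 10000
opcache.validate_timestamps = 0     ; production: never re-check source files on disk
```

That last line disables re-reading source files entirely, at which point the deployment behaves almost, but not quite, like a compiled artefact.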
Composer + vendor/: because the language ships without a standard library worth mentioning. Need to send an HTTP request? Install a package. Parse a date? Package. Validate an email? Package. The vendor/ directory of a typical Laravel application weighs 50-200 MB. That is not a dependency tree. It is a dependency forest, and no one has drawn the map.
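The forest begins innocently. A minimal composer.json covering just the three examples above (these are real, widely used packages; the version constraints are illustrative):

```json
{
    "require": {
        "guzzlehttp/guzzle": "^7.0",
        "nesbot/carbon": "^2.0",
        "egulias/email-validator": "^4.0"
    }
}
```

Each of those three lines pulls in its own transitive dependencies, and the forest grows from there.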
A framework: because the language provides no routing, no request lifecycle, no structure. Laravel, Symfony, or one of their many imitators must supply what the language itself declined to.
A template engine: because mixing business logic and HTML markup was eventually recognised as poor form. Blade, Twig, or a bespoke alternative now separates concerns that a well-designed language would never have entangled.
Redis: because PHP forgets everything after each request. The shared-nothing architecture, once presented as a feature, requires an external in-memory store to maintain any state whatsoever between page loads. Sessions, caches, queues: all outsourced to a separate system because the primary one has the long-term memory of a goldfish.
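The outsourcing is two lines of configuration, assuming the phpredis extension is installed:

```ini
; php.ini — hand session state to a separate process on a separate port
session.save_handler = redis
session.save_path = "tcp://127.0.0.1:6379"
```

Two lines, and one more daemon to install, monitor, secure, and restart.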
Seven layers. Each one a patch on the one below. Each one adding latency, memory, configuration surface, and failure modes. Not one of them addresses the root cause. They merely make the symptoms more comfortable.
The Diagram
It is worth seeing the architecture drawn plainly. On the left, the standard PHP deployment: seven distinct components, each compensating for a limitation in the one beneath it. On the right, the compiled alternative.
The question is not which layer to optimise. The question is which of these seven layers would exist at all if the starting point had been a compiled binary.
The answer, rather inconveniently for the ecosystem, is none of them.
The Uncluttered Methods Pattern
For over fifteen years I have followed what I call UMP: the Uncluttered Methods Pattern. The principle is simple enough to fit on a napkin, and rather difficult to argue with once you have read it:
Eliminate every layer that exists only to compensate for the one below it.
This is Occam's razor applied to system architecture. Not "move fast and break things." Not "keep it simple, stupid." Something more precise: every layer in your stack must justify its existence on its own merits, not as an apology for the layer beneath.
Apply this to the standard web stack, and what remains is a compiled binary that routes, renders, scripts, and serves. No process manager. No bytecode cache. No dependency forest. No external state store. One binary. One process. One job.
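None of this is exotic. A minimal sketch of the idea in Rust, using only the standard library; CASTD's actual internals are not shown here, so the names and structure below are illustrative. A real server would loop over incoming connections forever; this demo issues one request against itself so the example terminates:

```rust
use std::io::{BufRead, BufReader, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

// Route a request path to an HTTP status line and an HTML body.
// In the real pattern this is where scripting and templating would hook in.
fn route(path: &str) -> (&'static str, String) {
    match path {
        "/" => ("200 OK", "<h1>Home</h1>".to_string()),
        "/about" => ("200 OK", "<h1>About</h1>".to_string()),
        _ => ("404 Not Found", "<h1>Not Found</h1>".to_string()),
    }
}

// Handle one connection: read the request line, route it, write the response.
fn handle(mut stream: TcpStream) {
    let mut line = String::new();
    let _ = BufReader::new(&stream).read_line(&mut line); // "GET /path HTTP/1.1"
    let path = line.split_whitespace().nth(1).unwrap_or("/");
    let (status, body) = route(path);
    let _ = write!(
        stream,
        "HTTP/1.1 {status}\r\nContent-Type: text/html\r\nContent-Length: {}\r\n\r\n{body}",
        body.len()
    );
}

fn main() -> std::io::Result<()> {
    // One process binds the port directly: no proxy, no process manager.
    let listener = TcpListener::bind("127.0.0.1:0")?; // ephemeral port for the demo
    let addr = listener.local_addr()?;

    // Demo client: send one request to our own server, print the response.
    let client = thread::spawn(move || {
        let mut s = TcpStream::connect(addr).unwrap();
        write!(s, "GET / HTTP/1.1\r\n\r\n").unwrap();
        let mut resp = String::new();
        s.read_to_string(&mut resp).unwrap();
        println!("{resp}");
    });

    // A production loop would be: for stream in listener.incoming() { ... }
    let (stream, _) = listener.accept()?;
    handle(stream);
    client.join().unwrap();
    Ok(())
}
```

Routing, rendering, and serving live in one address space; concurrency is a thread spawn rather than a separate supervisor process.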
The Numbers
Theory is pleasant. Arithmetic is persuasive. Here is what CASTD, a compiled Rust web server with Lua scripting and built-in templating, actually measures in production:
CASTD (real, measured, running):
- Binary size: 891 KB. Multithreaded. Vertically scaling.
- Memory: 1.8 MB idle, 4 MB under 200 concurrent connections.
- Cold start: <1 ms.
One Laravel stack (Nginx + PHP-FPM + OPcache + Redis + Cron + Queue):
- Deploy size: 50-200 MB (vendor/ alone).
- Memory: 300-600 MB (FPM worker pool).
- Cold start: 300 ms.
Now scale that to an enterprise deployment. Eight application servers:
PHP stack: approximately 19 GB of memory. CASTD: approximately 32 MB. The ratio is not a rounding error. It is a factor of 600.
One might reasonably ask what the other 18.97 GB are doing. The answer, of course, is compensating.
Precedent at Scale
This is not a novel pattern. It is not even a controversial one, if you look at who else uses it.
Cloudflare runs its entire CDN on precisely this architecture: a compiled core extended with Lua, built on OpenResty. Forty million websites. The pattern is not theoretical. It is infrastructure-grade, battle-tested, and rather thoroughly proven.
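The shape of the pattern is easy to see in OpenResty itself: the compiled Nginx core dispatches into embedded LuaJIT, with no hand-off to an external runtime. An illustrative fragment (ngx.say and ngx.header are standard OpenResty API):

```nginx
# OpenResty: scripting runs inside the compiled server process
location /hello {
    content_by_lua_block {
        ngx.header["Content-Type"] = "text/html"
        ngx.say("<h1>Rendered inside the server process</h1>")
    }
}
```

Compiled core, scriptable edge, one process. The same silhouette as the binary described above.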
The difference is that Cloudflare uses it to serve other people's websites. CASTD uses it to replace the stack those websites run on.
The Uncomfortable Arithmetic
The PHP ecosystem will point out, correctly, that it powers a vast percentage of the web. WordPress alone accounts for over 40% of all websites. This is true, and it is also irrelevant. Market share is not a technical argument. Internet Explorer once had 95% market share. The question is not what is popular. The question is what is necessary.
Seven layers to serve HTML. Each layer a product. Each product a team. Each team a budget. Each budget a stakeholder who would prefer the problem to remain unsolved.
The compiled binary has no such stakeholders. It simply does the job.
One binary. One process. One job.
The rest is archaeology.