The Invoice · Episode 07
“No servers to manage.”
A sentence of admirable marketing precision. No servers to manage. The servers are very much present. They boot, they execute your function, they die. On AWS they are Firecracker MicroVMs (lightweight virtual machines that spin up in as little as 125 milliseconds, run your code in a sandboxed environment, and are destroyed once the platform decides they have idled long enough). Servers with amnesia. The hardware did not disappear. The responsibility for naming it did.
One does not eliminate infrastructure by rebranding it. One relocates the complexity: from your operations team to your cloud provider’s pricing spreadsheet, from your deployment pipeline to four incompatible vendor APIs, from your monitoring dashboard to a distributed log system that charges per gigabyte ingested. The servers are still there. They simply have better public relations now.
Four Providers, Four Dialects, Zero Portability
Lambda. Azure Functions. Cloud Run. Cloudflare Workers. Four platforms, four event formats, four deployment models, four completely incompatible APIs. Your function does not run on “the cloud.” It runs on a specific vendor’s interpretation of what a function invocation should look like, wrapped in a proprietary event schema that is portable in exactly the way a Betamax tape is compatible with a VHS player.
The lock-in is not contractual. It is architectural. Your code is not merely deployed on AWS; it is written in AWS. The event object, the context object, the handler signature, the deployment packaging, the IAM role assumption, the layer system, the cold start mitigation strategies: all of it is Lambda-specific. Moving to Azure Functions is not a migration. It is a rewrite. The vendor did not trap you. You trapped yourself, one event.Records[0].s3.bucket.name at a time.
The industry calls this “multi-cloud strategy.” In practice it means choosing which vendor dialect to write your code in, then living with the consequences. The Esperanto of serverless does not exist. Each provider is a walled garden with excellent documentation for the wall.
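The dialect problem is easiest to see in code. A sketch, in which process() is a hypothetical stand-in for the business logic; the handler signature and event shape are Lambda's S3 schema, and the Azure variant in the comment is roughly the shape its blob trigger expects:

```python
# A hypothetical S3-triggered handler, written the only way Lambda
# accepts it: against AWS's proprietary event schema.
def lambda_handler(event, context):
    # The business logic is one line; the schema navigation is the lock-in.
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]
    return process(bucket, key)

# The same logic on Azure Functions needs a different signature, a
# different trigger binding in function.json, and a different event
# shape entirely -- roughly:
#
#   def main(blob: func.InputStream):
#       process(container_of(blob.name), blob.name)
#
# Nothing above the business logic survives the move.

def process(bucket, key):
    # Placeholder for the actual work -- the only portable part.
    return f"{bucket}/{key}"
```

The rewrite cost lives entirely in the scaffolding around process(), which is precisely the part the platform dictates.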
The Work Relocated
Serverless eliminates operations, they said. Let us examine what it actually eliminated and what it quietly relocated.
Eliminated: SSH-ing into a box to restart a process. Patching the operating system. Rotating certificates manually. Fair enough. These are genuine improvements. Nobody misses sudo apt-get upgrade at three in the morning.
Relocated: Everything else. The work did not vanish. It changed shape. Instead of configuring nginx, you configure API Gateway. Instead of reading syslog, you read CloudWatch (at $0.50 per gigabyte ingested). Instead of tuning process memory, you tune function memory in 128 MB increments while a pricing calculator runs in the adjacent browser tab. Instead of monitoring a server, you monitor cold starts, concurrency limits, timeout thresholds, reserved capacity, provisioned throughput, and the twelve other dials that replaced the one server you were trying to avoid.
The operational complexity did not decrease. It was redistributed across a larger surface area, each fragment billable, each fragment requiring its own expertise, its own documentation, its own certification. You did not eliminate the sysadmin. You replaced one sysadmin with a FinOps engineer, a solutions architect, and a CloudWatch bill.
The Scale Nobody Needs
“But it scales to millions of requests!” A defence so popular it deserves the arithmetic it never receives.
SimilarWeb publishes traffic estimates. Let us look at two German e-commerce operations that are not small by any reasonable measure.
Otto.de, Germany’s second-largest online retailer. Approximately 50 million monthly visits. That is roughly 19 requests per second on average. Nineteen. Not nineteen thousand. Nineteen.
Alternate.de, a major electronics retailer. Approximately 4 million monthly visits. That translates to 1.5 requests per second. One and a half.
An nginx instance on a single core handles 10,000 or more requests per second without breaking a sweat, without scaling to zero, without cold starts, without an auto-scaling policy, without a pricing calculator. The scale argument for serverless falls apart the moment you divide monthly traffic by the seconds in a month and compare the result to what commodity hardware has handled since 2004.
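The division the text performs can be sketched directly, treating one visit as one request, as the SimilarWeb figures above do:

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # ~2.59 million

def avg_rps(monthly_visits):
    """Average requests per second implied by a monthly visit count."""
    return monthly_visits / SECONDS_PER_MONTH

print(round(avg_rps(50_000_000), 1))  # Otto.de: 19.3
print(round(avg_rps(4_000_000), 1))   # Alternate.de: 1.5
```

Even multiplying by a generous fan-out factor for assets and API calls per visit, the result stays three orders of magnitude below what one nginx core handles.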
The Cold Start Tax
A function that has not been invoked recently is “cold.” The next invocation must provision a MicroVM, load the runtime, initialise the function, and execute. According to lambda-perf, the community benchmark for Lambda cold starts, the numbers are instructive:
Python, Node.js: 100 to 300 milliseconds. A quarter of a second before your function runs. Manageable, provided you do not mind your users waiting for a VM to boot every time traffic dips and recovers.
Java with Spring: up to 6 seconds. Six seconds of a user staring at a spinner because a virtual machine had to load a dependency injection framework inside a MicroVM inside a hypervisor inside a data centre. The layers of abstraction have their own geological strata at this point.
The mitigations are revealing. Provisioned Concurrency, AWS’s solution to cold starts, keeps functions warm by running them continuously. You pay for compute that is always on, waiting for requests. The industry term for a function that is always on, waiting for requests, is a server. The circle is complete.
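Besides paying for warmth, the standard mitigation is architectural: hoist expensive initialisation to module scope so warm invocations reuse it. A minimal sketch of that pattern; the handler, the resource names, and the simulated init are illustrative, not any provider's API:

```python
import time

# Module scope runs once per container (the cold start); everything
# defined here is reused by every warm invocation that follows.
COLD_STARTS = 0

def _expensive_init():
    global COLD_STARTS
    COLD_STARTS += 1
    time.sleep(0.01)  # stand-in for loading SDKs, opening connections
    return {"db": "connected"}

_RESOURCES = _expensive_init()  # paid once, at boot

def handler(event, context=None):
    # Warm invocations skip straight to the work.
    return {"db": _RESOURCES["db"], "cold_starts_so_far": COLD_STARTS}
```

Note what the pattern concedes: performance now depends on the container living long enough to amortise its boot, which is to say, on it behaving like a server.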
The Pricing Illusion
Lambda pricing looks seductive: $0.20 per million requests, plus $0.0000166667 per GB-second of compute. Fractions of a cent. Pennies for millions. The back-of-napkin arithmetic is irresistible until you apply it to constant traffic.
A modest API handling 100 requests per second (a workload that a single Caddy process handles while checking its email) generates 8.64 million requests per day. 259 million per month. At 256 MB and 200 ms average duration: roughly $220 per month in compute, plus $52 in request fees, plus CloudWatch, plus API Gateway, plus the egress that nobody remembers to include until the invoice arrives. Call it $350 per month, conservatively, for a workload that runs comfortably on a €5 VPS.
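The back-of-napkin arithmetic can be checked directly. A sketch using the on-demand rates quoted above; CloudWatch, API Gateway, and egress are left out, exactly as the first draft of every serverless estimate leaves them out:

```python
# Lambda's published on-demand rates, as quoted in the text.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # $0.20 per million requests
PRICE_PER_GB_SECOND = 0.0000166667

def monthly_lambda_bill(rps, mem_mb, avg_duration_s, days=30):
    requests = rps * 86_400 * days
    gb_seconds = requests * avg_duration_s * (mem_mb / 1024)
    return {
        "requests_millions": requests / 1e6,
        "compute_usd": gb_seconds * PRICE_PER_GB_SECOND,
        "request_fees_usd": requests * PRICE_PER_REQUEST,
    }

bill = monthly_lambda_bill(rps=100, mem_mb=256, avg_duration_s=0.2)
# ~259M requests, ~$216 in compute, ~$52 in request fees --
# before the line items nobody remembers.
```

The function is four lines; the invoice it models is not.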
The pricing model rewards intermittent, bursty workloads: a webhook that fires twice a day, a cron job that runs at midnight, a prototype with twelve users. For anything resembling constant traffic, you are paying a premium for the privilege of not owning a server that would have cost less than the monitoring bill.
The Environmental Footnote
Each cold start boots a MicroVM. Each MicroVM consumes CPU cycles to initialise, allocate memory, load the runtime, execute, and terminate. For a bursty function invoked a million times a day across constantly recycled containers, that can mean a million boot sequences, a million teardowns, a million allocations. The overhead is not free. The electricity is not free. The cooling for the data centre running a million ephemeral VMs is not free.
A single server, idling at 10 watts, handles the same workload continuously. It does not boot. It does not die. It does not require a hypervisor to babysit its lifecycle. It simply runs, in the profoundly boring way that efficient systems have always run.
Nobody audits the carbon footprint of a million cold starts. Perhaps they should.
The Alternative
Two FreeBSD servers. Caddy in front. Jails for isolation. €40 per month. No cold starts. No vendor lock-in. No four incompatible event formats. No FinOps specialist decoding the invoice. No provisioned concurrency pretending not to be a server.
10,000 or more requests per second. Logs in one place. Deployments via scp and a process restart. The entire operational model fits in a single paragraph because there is nothing to explain. The complexity was never added in the first place.
Netflix serves 700,000 or more requests per second to a quarter of a billion subscribers. Their delivery infrastructure, Open Connect, runs on FreeBSD, bare metal, jails. Not Lambda. Not serverless. Not MicroVMs booting and dying per request. The company that defines scale at its most extreme chose the opposite of serverless for the work that matters.
The Verdict
Serverless has a context. A webhook that fires six times a day. A prototype with a handful of users. A data pipeline that runs at 2 a.m. and sleeps until next Tuesday. In these contexts, paying per invocation is rational, even elegant.
For the other 90 per cent (the API with steady traffic, the web application that serves actual users, the service that runs continuously because that is what services do), serverless is managed hosting with exceptional marketing. You are renting MicroVMs that boot and die per request, writing code in a vendor dialect that goes nowhere else, debugging distributed logs that charge by the gigabyte, and calling it progress.
Serverless did not eliminate the server. It eliminated your ability to see it, control it, and compare the price to what a €20 VPS would have cost.
The server is still there. It simply sends a more complicated invoice now.