Vivian Voss

The Observability Tax

Performance-Fresser ■ Episode 18

syslog has existed since 1983. top, vmstat, netstat, ping: built into every Unix system since before most of today's engineers were born. Zero overhead. Zero cost. Zero dependencies.

In 2026, a mid-sized company with 100 engineers spends $708,000 to $1,080,000 per year to know whether its servers are running. 97 per cent report cost surprises. 67 per cent report them regularly.

One does admire an industry that turned tail -f into a seven-figure problem.
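The annual range is easier to feel per head. A small sketch of the arithmetic, using only the figures quoted above; the function name is illustrative, and whole-dollar division is a simplification:

```python
# Figures quoted in the article: annual observability spend for 100 engineers.
ENGINEERS = 100
ANNUAL_SPEND_USD = (708_000, 1_080_000)  # low and high end of the quoted range


def per_engineer(annual_total: int, engineers: int = ENGINEERS) -> tuple[int, int]:
    """Return (dollars per engineer per year, dollars per engineer per month)."""
    yearly = annual_total // engineers
    return yearly, yearly // 12


for total in ANNUAL_SPEND_USD:
    yearly, monthly = per_engineer(total)
    print(f"${total:,}/year -> ${yearly:,} per engineer per year, ${monthly:,} per month")
```

On those numbers, each engineer carries roughly $590 to $900 a month of monitoring before writing a line of code.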

The Stack

Consider the modern observability pipeline. Prometheus scrapes metrics every 15 seconds. At one million time series: 3 GB of RAM for storage alone. The OpenTelemetry Collector: 2 CPU cores and 2 GB as its baseline, handling a modest 20,000 spans per second. Sentry's JavaScript SDK: 53 KB per page load to watch whether the page loads quickly. An observer that slows down the thing it observes. Heisenberg would have appreciated the irony.

Then Grafana to visualise what Prometheus scraped. Loki to store what Grafana displayed. Tempo to trace what Loki logged. AlertManager to notify when Tempo found something. PagerDuty to escalate what AlertManager notified. Slack to discuss what PagerDuty escalated. A marvellous pipeline, that. Entirely dedicated to watching.

[Figure: The Observability Pipeline. Your App (the thing) → Prometheus (scrapes) → Grafana (visualises) → Loki (stores) → Tempo (traces) → AlertManager (notifies) → PagerDuty (escalates) → Slack (discusses) → Engineer (investigates). Seven tools. One question: is the server running? In 1983, the answer was: tail -f /var/log/messages]

Prometheus scraping alone increases Kubelet CPU by a factor of four. The observer changes the observed. One would think physicists settled that question a century ago.
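The Prometheus figures quoted in this section are easier to judge per series. A rough sketch of the arithmetic, assuming every series is scraped at the same 15-second interval and reading "3 GB" as decimal gigabytes; both are assumptions made for illustration, not statements about how Prometheus accounts for memory:

```python
# Inputs taken from the article.
SERIES = 1_000_000          # active time series
SCRAPE_INTERVAL_S = 15      # seconds between scrapes
RAM_BYTES = 3 * 10**9       # "3 GB", read as decimal gigabytes (assumption)

SECONDS_PER_DAY = 86_400

# Memory attributed to each series if the 3 GB is spread evenly.
bytes_per_series = RAM_BYTES // SERIES

# Ingest rate if every series yields one sample per scrape (assumption).
samples_per_second = SERIES / SCRAPE_INTERVAL_S
samples_per_series_per_day = SECONDS_PER_DAY // SCRAPE_INTERVAL_S
samples_per_day = SERIES * samples_per_series_per_day

print(f"{bytes_per_series:,} bytes of RAM per series")
print(f"{samples_per_second:,.0f} samples per second")
print(f"{samples_per_series_per_day:,} samples per series per day")
print(f"{samples_per_day:,} samples per day in total")
```

Under those assumptions, that is about 3 KB of memory per series and several billion samples a day, collected whether or not anyone ever looks at them.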

The Invoice

Observability vendors charge $0.10 to $3.00 per gigabyte ingested. Cloud storage costs about $0.02 per gigabyte. That is a 5x to 150x markup for the privilege of looking at your own data through someone else's dashboard.

Some organisations spend 70 per cent of their infrastructure budget observing their infrastructure. The watchtower now costs more than the castle. Quite the inversion.

[Figure: The Markup: Storage vs Observability. Cloud storage: $0.02/GB. Vendor (low): $0.10/GB, 5x markup. Vendor (high): $3.00/GB, 150x markup. Infrastructure budget allocation: observability 70%, actual infrastructure 30%. The watchtower costs more than the castle.]
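The markup follows directly from the per-gigabyte prices in the figure. A minimal sketch, working in whole cents to sidestep floating-point noise; the 100 GB per day ingest volume and the 30-day month are hypothetical values chosen only to make the bill concrete, and retention costs are ignored:

```python
# Prices from the article, in cents per gigabyte.
STORAGE_CENTS_PER_GB = 2
VENDOR_CENTS_PER_GB = {"low": 10, "high": 300}


def markup(vendor_cents: int, storage_cents: int = STORAGE_CENTS_PER_GB) -> float:
    """How many times the raw storage price the vendor charges."""
    return vendor_cents / storage_cents


def monthly_bill_usd(gb_per_day: int, cents_per_gb: int, days: int = 30) -> float:
    """Cost in dollars of ingesting gb_per_day for one month at a flat rate."""
    return gb_per_day * days * cents_per_gb / 100


HYPOTHETICAL_GB_PER_DAY = 100  # illustrative ingest volume, not from the article

print(f"raw storage: ${monthly_bill_usd(HYPOTHETICAL_GB_PER_DAY, STORAGE_CENTS_PER_GB):,.0f}/month")
for tier, cents in VENDOR_CENTS_PER_GB.items():
    bill = monthly_bill_usd(HYPOTHETICAL_GB_PER_DAY, cents)
    print(f"vendor ({tier}): {markup(cents):.0f}x markup, ${bill:,.0f}/month")
```

At that hypothetical volume, the same bytes cost $60 a month to keep and between $300 and $9,000 a month to look at.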

The Retreat

Evereve left Datadog. Cost reduction: 90 per cent. Not a minor optimisation. A revelation about how much of their monitoring bill was margin rather than value.

Discover Financial migrated 2,200 services away from their vendor. Two thousand two hundred services. If the migration is that large and still worth doing, one might pause to consider what the vendor was charging.

Deductive AI had their access revoked overnight. Their monitoring vendor switched off their visibility. They rebuilt on open-source tooling in 48 hours. Rather clarifying, when the company that sells you visibility can unilaterally remove it.

[Figure: The Retreat from Vendor Lock-in. Evereve: left Datadog, cost reduction 90%. Discover Financial: 2,200 services migrated away from their vendor. Deductive AI: access revoked overnight, rebuilt in 48 hours.]

The Root Cause

Bootcamps teach Grafana dashboards, not vmstat. Universities accept vendor sponsorships, teach vendor stacks, and graduate engineers who have never opened a man page. A computer science graduate in 2026 has a fundamentally different foundation from one who graduated in 2006. Not better. Not worse. Different. Specifically, different in ways that happen to require a subscription.

One cannot sell a dashboard subscription to someone who knows what pipes do. So one stops teaching pipes.

The customer base was not discovered. It was cultivated.

Eric Allman wrote syslog in 1983. Ritchie and Thompson had logging in Unix V1 in the 1970s. A tail -f and three greps tell you more in thirty seconds than a dashboard that took a sprint to configure. The tools were always there. They merely required reading a man page instead of clicking a colourful dashboard.

[Figure: The Observer Tax: Resource Overhead. tail + grep: ~0 MB, since 1983. vmstat: ~0 MB, since 1986. Sentry JS SDK: 53 KB/page, to watch whether the page loads quickly. OTel Collector: 2 cores + 2 GB baseline. Prometheus: 3 GB RAM at 1M series. Kubelet CPU: 4x increase from scraping alone.]

The Pattern

You needed to know if your server was healthy. You got a monitoring stack that needs its own monitoring. Prometheus watches your application. Who watches Prometheus? Another Prometheus instance, naturally. Quis custodiet ipsos custodes, but with YAML.

The tools were always there. Free, fast, composable. They merely required reading a man page instead of clicking a colourful dashboard. In 1983, Eric Allman gave Unix syslog. In 2026, the industry replaced it with a seven-figure subscription. The upgrade was not in capability. It was in invoicing.

A monitoring stack that needs its own monitoring is not observability. It is overhead with a dashboard. The tools were always there. The invoice was not.

Fewer dashboards. More fundamentals.