Vivian Voss

The Original Microservices

Unix Architecture

grep was written in 1973. awk in 1977. sed in 1974. sort, uniq, cut, wc: all before 1980. Each does one thing. Each takes text in and puts text out. Each composes with any other through a pipe.

The industry spent the better part of a decade building vastly more complex solutions for what these tools had been doing all along. Then, in 2014, it gave the pattern a name: "Microservices." One might observe that the naming ceremony arrived roughly forty-one years late.

The Pattern

Every microservices tutorial teaches the same handful of principles as though they were recently discovered. Open any conference talk from the past decade, and you will find the same vocabulary: single responsibility, loose coupling, well-defined contracts, independent deployability. All perfectly reasonable ideas. All present in Unix since 1978, when Doug McIlroy wrote them down as though they were self-evident truths. Which, to be fair, they rather were.

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

That is not a microservices manifesto. That is McIlroy, 1978. The resemblance to modern architectural guidance is not coincidental. It is genealogical.

Consider the mapping. Each Unix tool has a single responsibility: grep filters, sort sorts, uniq deduplicates. The API contract is stdin and stdout: text in, text out, the universal interface. The message queue is the pipe: one process writes, the next reads, backpressure built in at the kernel level. Service discovery is $PATH. Orchestration is the shell script. Observability is tee: tap into any data stream between any two services, live, with zero configuration.
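The tee claim is worth making concrete. A minimal sketch, with inline sample records standing in for a real access log: tee is spliced between two stages of a pipeline, and every record flowing past is copied to tap.log, live, while neither neighbour notices.

```shell
# Three "services" in a pipeline, with tee as the observability tap.
# Neither the stage before nor the stage after knows the tap exists;
# tap.log fills as the data flows. Sample records are generated
# inline purely for the sketch.
printf '%s\n' 'GET /a 404' 'GET /b 200' 'GET /a 404' \
  | grep 404 \
  | tee tap.log \
  | sort \
  | uniq -c
```

Pointing tee at /dev/stderr or a FIFO gives the same live view without touching disk; either way, the pipeline itself is unchanged.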

Unix (1973-1978)   What it does                  Pattern             Modern equivalent                Overhead
pipe (|)           Kernel buffer, zero config    Message Queue       Kafka, RabbitMQ, SQS             JVM + broker cluster, 6-32 GB RAM
stdin / stdout     Text in, text out             API Contract        REST, gRPC, GraphQL              Schema registry + gateway, serialisation overhead
$PATH              Shell resolves binaries       Service Discovery   Consul, etcd, DNS-SD             Distributed consensus: Raft protocol, heartbeats
sh / xargs -P      Sequential + parallel         Orchestration       Kubernetes, Nomad, ECS           12-24 GB control plane: etcd + scheduler + API
tee                Tap any stream, live          Observability       ELK, Datadog, Splunk             $150+/GB/day (Splunk); $15-23/host/mo (Datadog)
exit codes         0 = success, nonzero = fail   Health Checks       HTTP probes, circuit breakers    Sidecar proxies: Envoy, Istio mesh
jails (2000)       Process isolation             Containers          Docker                           Daemon + hub + subscription

Every row in that table tells the same story: a kernel-native mechanism, battle-tested for decades, repackaged with a distributed system bolted on top. The original does not require a cluster. It requires a shell.
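The exit-code row deserves a concrete rendering. A health check, in this idiom, is any command whose exit status answers the question; the shell's && and || are the entire framework. A minimal sketch, with a hypothetical state file standing in for a real service:

```shell
# Exit codes as health checks: 0 means healthy, anything else means
# not. grep -q exits 0 iff the pattern is found, so the check *is*
# the exit status.
check() { grep -q 'status: ok' "$1"; }

echo 'status: ok' > svc.state            # hypothetical state file
check svc.state && echo healthy || echo unhealthy    # → healthy
```

A circuit breaker is the same idea with a counter; a Kubernetes liveness probe is the same idea with an HTTP round trip in front of it.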

The Demonstration

Consider a practical scenario. You want the twenty most frequently requested URLs that returned a 404. In the microservices world, this is an observability problem. You will need a log shipper, an indexing cluster, a query language, and a dashboard. In the Unix world, it is a pipeline:

grep 404 access.log | cut -d' ' -f7 | sort | uniq -c | sort -rn | head -20

Six processes. One pipeline. Each stateless, composable, and replaceable. No configuration files, no YAML, no cluster. The data flows left to right through the pipe, each tool performing its single transformation and handing the result to the next.
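One caveat worth hedging: grep 404 matches those digits anywhere on the line, including inside a URL such as /404-page. If the log follows the common or combined log format, where the request path is field 7 and the status code is field 9, awk can filter on the status field exactly. A sketch with two inline sample records:

```shell
# Two sample records in common log format: a genuine 404, and a 200
# for a URL that merely contains "404" (which plain grep would match).
printf '%s\n' \
  '127.0.0.1 - - [01/Jan/2026:00:00:00 +0000] "GET /missing HTTP/1.1" 404 153' \
  '127.0.0.1 - - [01/Jan/2026:00:00:01 +0000] "GET /404-page HTTP/1.1" 200 512' \
  > access.log

# Filter on the status field itself, then count as before.
# Only /missing survives; /404-page does not.
awk '$9 == 404 { print $7 }' access.log | sort | uniq -c | sort -rn | head -20
```

Still one line, still six-ish processes, and the "schema" is nothing more than whitespace-separated fields.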

The industry will happily sell you the Elastic Stack for this. Elasticsearch alone recommends 4-8 GB of heap memory. grep uses approximately 2 MB of resident memory. One is not like the other.

UNIX PIPELINE (6 processes, 1 pipe, ~12 MB total):

    grep 404 | cut -d' ' -f7 | sort | uniq -c | sort -rn | head -20  →  result

    Same result. One line of shell.

MODERN EQUIVALENT (12+ services, 20-40 GB RAM):

    API gateway · service mesh · log shipper · message queue ·
    Elasticsearch (4-8 GB heap) · Kibana · container runtime ·
    orchestrator (12-24 GB control plane) · image registry ·
    service discovery · config management (secrets, env vars, vaults) ·
    CI/CD pipeline (build, test, deploy, rollback)

    Same result. A team, a budget, and a prayer.

The diagram is not a caricature. It is a conservative rendering. A production ELK deployment typically includes a reverse proxy, an ingest pipeline, index lifecycle management, alerting, and role-based access control. The Unix pipeline includes none of these because it does not need them. The data never leaves the machine.

The Repackaging

Docker took cgroups and namespaces (in the Linux kernel since 2008 and 2002 respectively, years before Docker shipped in 2013) and added a daemon, a hub, and a subscription model. FreeBSD jails achieved process isolation in 2000, without a background daemon and without phoning home. Kubernetes buried iptables, cgroups, and etcd under 12-24 GB of control plane RAM. The kernel features are still doing the actual work. They are simply harder to see now, which is rather the point.
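The point about kernel features is checkable from any shell. Namespaces are not something a container runtime adds; every process on a modern Linux system already lives in a set of them, and the kernel exposes that set as files. (Linux-specific; creating a fresh, isolated set needs no daemon either, just unshare(1) or the underlying syscall.)

```shell
# Every process has a namespace set; /proc exposes it as symlinks,
# one per namespace type. A container's "isolation" is these same
# primitives, arranged by a daemon.
ls /proc/self/ns    # e.g. cgroup ipc mnt net pid user uts ...
```

The exact listing varies by kernel version, but pid and mnt have been there for two decades.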

This is not an argument against containerisation or orchestration as concepts. Large distributed systems face genuine problems that a single machine cannot solve. The objection is narrower and, one hopes, more interesting: the industry did not merely adopt Unix principles. It repackaged them, not to make them more accessible, but to make them more billable. The abstractions rarely subtract complexity. They redistribute it, usually into places where it is harder to inspect and more expensive to debug.

The Invoice

It is worth placing the numbers side by side, not to be provocative, but because the comparison itself is instructive. The left column has been compiling and running on production systems for half a century. The right column requires a procurement process.

The Splunk pricing page is an education in itself. At $150 or more per gigabyte per day, a moderately busy web server generating 10 GB of logs daily would cost $1,500 per day for the privilege of searching text, a task that grep, a fifty-three-year-old binary, performs in milliseconds for the price of a few megabytes of RAM. Datadog charges $15-23 per host per month for metrics that ps, awk, and cron have collected since before most of its engineers were born.
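The ps-and-cron claim can be sketched in one line. The collector below is a hypothetical example, not a drop-in Datadog replacement: it appends a timestamped total of resident memory across all processes to a log, and cron would run it on whatever schedule taste dictates.

```shell
# One sample: sum resident set size (KB) over every process and
# append it with a Unix timestamp. Scheduling belongs to cron, e.g.
#   * * * * * /usr/local/bin/collect-rss >> /var/log/metrics.log
ps -A -o rss= \
  | awk -v ts="$(date -u +%s)" '{ sum += $1 } END { print ts, "rss_kb_total=" sum }' \
  >> metrics.log
```

Querying it is, of course, grep.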

One might argue that these commercial platforms offer features beyond raw search: dashboards, alerting, anomaly detection, retention policies. This is true. One might also argue that a significant proportion of their customers use approximately none of these features and are, in effect, paying enterprise rates for grep.

The Uncomfortable Conclusion

The tools are compiled. They are optimised. They have been battle-tested for half a century. They implement the microservices pattern in its purest form: small, stateless, composable processes communicating through a universal text interface with built-in backpressure. The pattern was never lost. It was never even obscure. It was taught in every introductory Unix course for decades.
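The built-in backpressure is demonstrable in a few characters of plumbing. yes writes output as fast as the kernel will accept it; head consumes three lines and exits; the kernel blocks the writer whenever the pipe buffer is full and delivers SIGPIPE once the reader is gone. No rate limiter was configured, because none is needed:

```shell
# An unbounded producer meets a bounded consumer. The pipe does the
# flow control: writes block when the buffer fills, and the producer
# is terminated (SIGPIPE) when the consumer exits.
yes | head -3    # → three lines of "y", then both processes are done
```

Replicating this behaviour in a message broker is called "consumer-driven flow control" and has its own chapter in the documentation.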

The industry did not ignore the Unix philosophy. It studied it, extracted the principles, wrapped them in YAML, and sold them back with a support contract. Not to make them more accessible (the man pages were always free) but to make them more billable.

The tools are there. They always were. The question is not whether they work. The question is whether anyone still bothers to look.