grep was written in 1973. sed in 1974. awk in 1977. sort, uniq, cut, wc: all before 1980. Each does one thing. Each takes text in and puts text out. Each composes with any other through a pipe.
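The composition claim costs nothing to verify. A minimal sketch, counting the distinct words in a stream with four single-purpose tools:

```shell
# Each stage reads text on stdin and writes text on stdout;
# the pipe is the only integration layer between them.
printf 'one two\ntwo three\n' \
  | tr ' ' '\n' \
  | sort \
  | uniq \
  | wc -l
```

Replace any stage and the others neither know nor care, which is the whole trick.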
The industry spent the better part of a decade building vastly more complex solutions for what these tools had been doing all along. Then, in 2014, it gave the pattern a name: "Microservices." One might observe that the naming ceremony arrived roughly forty-one years late.
The Pattern
Every microservices tutorial teaches the same handful of principles as though they were recently discovered. Open any conference talk from the past decade, and you will find the same vocabulary: single responsibility, loose coupling, well-defined contracts, independent deployability. All perfectly reasonable ideas. All present in Unix since 1978, when Doug McIlroy wrote them down as though they were self-evident truths. Which, to be fair, they rather were.
    Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.
That is not a microservices manifesto. That is McIlroy, 1978. The resemblance to modern architectural guidance is not coincidental. It is genealogical.
Consider the mapping.

    Microservices concept    Unix mechanism
    Single responsibility    one tool, one job: grep filters, sort sorts, uniq deduplicates
    API contract             stdin and stdout: text in, text out, the universal interface
    Message queue            the pipe: one process writes, the next reads, backpressure built in at the kernel level
    Service discovery        $PATH
    Orchestration            the shell script
    Observability            tee: tap into any data stream between any two services, live, with zero configuration
Every row in that table tells the same story: a kernel-native mechanism, battle-tested for decades, repackaged with a distributed system bolted on top. The original does not require a cluster. It requires a shell.
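The observability row is checkable in one line. A sketch, assuming an access.log in common log format (the file names here are illustrative):

```shell
# tee taps the stream between two stages: every matching line is both
# archived to not_found.log and passed on to the next process, live,
# with zero configuration.
grep ' 404 ' access.log | tee not_found.log | cut -d' ' -f7 | sort | uniq -c
```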
The Demonstration
Consider a practical scenario. You want the twenty most frequently requested URLs that returned a 404. In the microservices world, this is an observability problem. You will need a log shipper, an indexing cluster, a query language, and a dashboard. In the Unix world, it is a pipeline:
    grep ' 404 ' access.log | cut -d' ' -f7 | sort | uniq -c | sort -rn | head -20
Six processes. One pipeline. Each stateless, composable, and replaceable. No configuration files, no YAML, no cluster. The data flows left to right through the pipe, each tool performing its single transformation and handing the result to the next.
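The stages are also independently replaceable, which is the loose coupling the tutorials promise. A sketch of swapping the filter for a stricter one, assuming common log format, where the status code is field 9:

```shell
# Same report, stricter filter: awk matches 404 only in the status
# field, never in a URL or a byte count. Every stage downstream of
# the filter is untouched.
awk '$9 == 404' access.log | cut -d' ' -f7 | sort | uniq -c | sort -rn | head -20
```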
The industry will happily sell you the Elastic Stack for this. Elasticsearch alone recommends 4-8 GB of heap memory; grep uses approximately 2 MB of resident memory. One is not like the other.
That stack is not a caricature. It is a conservative rendering. A production ELK deployment typically includes a reverse proxy, an ingest pipeline, index lifecycle management, alerting, and role-based access control. The Unix pipeline includes none of these because it does not need them. The data never leaves the machine.
The Repackaging
Docker took cgroups and namespaces (kernel features that had been accumulating in Linux since the early 2000s) and added a daemon, a hub, and a subscription model. FreeBSD jails achieved process isolation in 2000, without a background daemon and without phoning home. Kubernetes buried iptables, cgroups, and etcd under 12-24 GB of control plane RAM. The kernel features are still doing the actual work. They are simply harder to see now, which is rather the point.
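The primitives are still directly scriptable, no daemon required. A sketch, assuming a Linux machine with util-linux's unshare and user-namespace support enabled:

```shell
# A PID namespace with no daemon involved: the child sees itself as
# PID 1 in a fresh process table. --map-root-user maps the caller
# into a new user namespace, so actual root is not required.
unshare --pid --fork --mount-proc --map-root-user sh -c 'echo "PID here: $$"'
```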
This is not an argument against containerisation or orchestration as concepts. Large distributed systems face genuine problems that a single machine cannot solve. The objection is narrower and, one hopes, more interesting: the industry did not merely adopt Unix principles. It repackaged them, not to make them more accessible, but to make them more billable. The abstractions rarely subtract complexity. They redistribute it, usually into places where it is harder to inspect and more expensive to debug.
The Invoice
It is worth placing the numbers side by side, not to be provocative, but because the comparison itself is instructive.

    Unix tool                Commercial platform
    grep (~2 MB resident)    Splunk ($150+ per GB per day)
    ps, awk, cron            Datadog ($15-23 per host per month)

The left column has been compiling and running on production systems for half a century. The right column requires a procurement process.
The Splunk pricing page is an education in itself. At $150 or more per gigabyte per day, a moderately busy web server generating 10 GB of logs daily would cost $1,500 per day for the privilege of searching text, a task that grep, a fifty-three-year-old binary, performs in milliseconds for the price of a few megabytes of RAM.
Datadog charges $15-23 per host per month for metrics that ps, awk, and cron have collected since before most of its engineers were born.
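That claim is a one-liner to verify. A sketch of a host-metrics sampler one might schedule from cron (the path and schedule are illustrative; /proc/loadavg assumes Linux):

```shell
# One timestamped sample: 1-minute load average plus total resident
# memory across all processes, in kilobytes. Append to a log from
# cron, e.g.:  * * * * *  /usr/local/bin/sample.sh >> /var/log/metrics.log
echo "$(date -u +%FT%TZ) load=$(cut -d' ' -f1 /proc/loadavg) rss_kb=$(ps -eo rss= | awk '{s+=$1} END {print s+0}')"
```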
One might argue that these commercial platforms offer features beyond raw search: dashboards, alerting, anomaly detection, retention policies. This is true. One might also argue that a significant proportion of their customers use approximately none of these features and are, in effect, paying enterprise rates for grep.
The Uncomfortable Conclusion
The tools are compiled. They are optimised. They have been battle-tested for half a century. They implement the microservices pattern in its purest form: small, stateless, composable processes communicating through a universal text interface with built-in backpressure. The pattern was never lost. It was never even obscure. It was taught in every introductory Unix course for decades.
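Even the backpressure claim demonstrates itself:

```shell
# yes would write "y" forever, but the moment head exits after three
# lines, the pipe's reader is gone; the kernel delivers SIGPIPE and
# the producer stops. Flow control with no broker and no configuration.
yes | head -3
```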
The industry did not ignore the Unix philosophy. It studied it, extracted the principles, wrapped them in YAML, and sold them back with a support contract. Not to make them more accessible (the man pages were always free) but to make them more billable.
The tools are there. They always were. The question is not whether they work. The question is whether anyone still bothers to look.