Vivian Voss

DTrace


Technical Beauty ■ Episode 30

You have done this. Your production system misbehaves. The logs say nothing. The metrics say nothing. Everything is fine, except it is not. So you do what every developer does: you add a printf, rebuild, redeploy, and wait.

The bug does not recur. You add three more printfs. Rebuild. Redeploy. Wait. The bug recurs, but not where you expected. You are now debugging your debugging. It is a Tuesday. You had plans for the evening.

In 2001, Bryan Cantrill was doing exactly this at Sun Microsystems. He had helped build an entirely synthetic system: every instruction, every data structure, every byte placed there by human beings. And he could not ask it what it was doing. "We had created this system," he said, "yet we could not observe it."

That bothered him rather a lot.

The Problem

Debugging production systems has always offered two options, both terrible.

Option one: add logging statements, rebuild, redeploy, wait for the issue to recur. This works in development. In production, it means restarting a database serving ten thousand connections to add a printf. The cure is worse than the disease. And if you guessed wrong about where to log, you restart again. And again. Each restart is a deployment. Each deployment is a risk. Each risk is a conversation with someone who would rather you did not restart things.

Option two: attach a debugger. Stop the process, inspect memory, step through code. This works beautifully on a developer's laptop. On a production trading system processing four million transactions per hour, stopping the process is not debugging. It is an outage with extra steps.

Be honest. You have attached strace to a production process and watched it crawl to a halt. strace, written by Paul Kranenburg for SunOS in 1991 and ported to Linux by Branko Lankester, traces system calls by intercepting them via ptrace. It stops the traced process for every single syscall. Switches context to the tracer. Records the call. Resumes the process. Stops it again. It is rather like asking someone to describe their day by pausing them after every breath.

strace vs DTrace

strace / truss: stops the process per syscall; context switch overhead (brutal); one process at a time; tells you what happened; cannot see inside the kernel; makes the patient sicker.

DTrace: instruments the kernel directly; zero overhead when disabled; system-wide, all processes; tells you what and why; traces kernel + userland; safe by construction.

strace: surgery with the lights off. DTrace: X-ray vision. The patient does not notice.

truss on FreeBSD. strace on Linux. Both tell you what happened. Neither tells you why. And both make the patient sicker in the process of diagnosis.

The Answer

DTrace does not stop anything.

Bryan Cantrill, Mike Shapiro, and Adam Leventhal designed and built DTrace at Sun Microsystems. The original ideation began in the late 1990s. The implementation was first integrated into Solaris in September 2003. It shipped with Solaris 10 in January 2005. Development took approximately four years of focused engineering.

The tagline: "Concise answers to arbitrary questions about the system." Eight words. One does admire a tool confident enough to describe itself in fewer words than most error messages.

DTrace works by compiling probe scripts, written in a purpose-built language called D, into safe bytecode that is injected directly into the running kernel. When a probe fires, the bytecode executes in kernel context: no context switch, no process stop, no overhead beyond the probe itself. When a probe is not enabled, the overhead is zero. Not low. Not negligible. Zero. The original machine instruction runs unmodified. The probe point does not exist until you enable it. This is not an optimisation. It is the architecture.
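To make the enable-and-disable cycle concrete, here is a minimal sketch of a D script that instruments a single probe for five seconds and then removes it again. The file name read_count.d and the aggregation name @reads are illustrative choices, not anything prescribed by DTrace, and the script assumes the syscall and profile providers are loaded.

```shell
# Write a small D script to disk; nothing is instrumented until dtrace runs it.
cat > read_count.d <<'EOF'
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Probe names are provider:module:function:name; empty fields match anything. */
syscall::read:entry
{
        /* Runs in kernel context each time the probe fires. */
        @reads[execname] = count();
}

/* After five seconds, exit. dtrace prints @reads and disables the probe. */
tick-5s
{
        exit(0);
}
EOF

# Run as root on a system with DTrace available, for example FreeBSD:
#   dtrace -s read_count.d
```

Before the run and after the exit, the read syscall path carries no instrumentation at all. The only cost is paid during the five seconds you asked for.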

The Language

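The D language is deliberately crippled. It has no loops, so it cannot spin forever. It cannot crash the kernel by dereferencing an invalid pointer: a bad load aborts that probe firing and is reported as an error. It cannot allocate memory. It cannot modify kernel state unless you explicitly enable destructive actions. It cannot crash the system. Ever.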

These are not limitations. They are the entire point. Any valid D programme is guaranteed safe on a production system. Safety is not left to the discipline of whoever wrote the script. It is enforced by the toolchain: the bytecode is validated before it runs, and every memory access is guarded while it runs. The language was designed so that the question "is this script safe to run in production?" is answered by the machine, not by a change advisory board.
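What the restrictions look like in practice, as a rough sketch: filtering is expressed as a predicate between slashes and summarising as an aggregation, so the script needs neither loops nor its own memory management. The process name "postgres", the one-megabyte threshold, and the file name large_writes.d are hypothetical values picked for illustration.

```shell
# Write the D script to disk for review before running it.
cat > large_writes.d <<'EOF'
#!/usr/sbin/dtrace -s
#pragma D option quiet

/*
 * The predicate decides whether the clause body runs at all.
 * arg2 is the third argument of write(2): the byte count requested.
 */
syscall::write:entry
/execname == "postgres" && arg2 >= 1048576/
{
        /* Aggregations summarise in the kernel; no loop, no allocation. */
        @large_writes[pid] = count();
}

/* Stop after ten seconds and print the per-pid counts. */
tick-10s
{
        exit(0);
}
EOF

# Run as root:
#   dtrace -s large_writes.d
```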

A DTrace one-liner to count system calls by process on a running FreeBSD server:

dtrace -n 'syscall:::entry { @[execname] = count(); }'

No recompilation. No restart. No risk. The answer appears whilst the system continues serving traffic. You can trace every syscall on a FreeBSD server serving ten thousand requests per second. Every function entry in the kernel. Every TCP handshake. Right now. Without asking anyone's permission. Without filing a ticket. Without scheduling a maintenance window.
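The kernel-function and TCP cases might look something like the sketch below. It assumes the fbt and tcp providers are available on your release; the vfs_* pattern is only an example of narrowing fbt to a manageable set of functions, and the args[3]->tcps_raddr field follows the tcp provider's tcpsinfo_t as I understand it, so check both against your system's documentation before relying on them.

```shell
# Write the D script to disk; the file name kernel_and_tcp.d is illustrative.
cat > kernel_and_tcp.d <<'EOF'
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Count entries into kernel functions whose names begin with vfs_. */
fbt::vfs_*:entry
{
        @calls[probefunc] = count();
}

/* Count completed inbound TCP handshakes by remote address. */
tcp:::accept-established
{
        @accepted[args[3]->tcps_raddr] = count();
}

/* Stop after ten seconds and print both aggregations. */
tick-10s
{
        exit(0);
}
EOF

# Run as root:
#   dtrace -s kernel_and_tcp.d
```

Enabling every fbt probe at once works too, but it instruments tens of thousands of functions. Narrowing the pattern keeps the probe effect proportionate to the question.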

The Proof

In 2006, DTrace won the Wall Street Journal's Technology Innovation Award, Gold. Not for a consumer product. Not for an app. For a kernel instrumentation framework that lets you ask a running operating system what it is doing. One does find that rather encouraging about the state of technology journalism, at least in 2006.

The DTrace Lineage

Sun Microsystems (2003): Cantrill, Shapiro, Leventhal. FreeBSD: in base since 2008. macOS: since 2007. illumos: inherited. Linux (eBPF): inspired by DTrace.

Brendan Gregg wrote the definitive DTrace book. He now writes the definitive eBPF book. The ideas survived by being worth stealing.

FreeBSD integrated DTrace in 2008. It ships in base. No packages to install, no modules to compile. macOS has included DTrace since Leopard (2007): every Mac ships with a kernel-level dynamic tracing framework that most users will never know exists. illumos, the open-source continuation of OpenSolaris, inherited DTrace from its birthplace.

On Linux, DTrace's influence is unmistakable. eBPF (extended Berkeley Packet Filter) and its higher-level interface bpftrace provide similar capabilities: safe bytecode execution in the kernel, dynamic instrumentation, production-safe tracing. Brendan Gregg, who wrote the definitive DTrace book, now writes the definitive eBPF material. The ideas travelled. The architecture was validated by imitation.

The observability industry (Datadog, Grafana, Prometheus) spends billions solving a problem that DTrace solved in 2003. With rather less YAML.

The Philosophy

Observability is not logging. Logging records what the developer anticipated might go wrong. Observability answers questions the developer never thought to ask. The difference is the difference between a searchlight and the sun.

DTrace does not require you to predict your questions in advance. You do not instrument your code for DTrace. You do not add tracing libraries. You do not configure exporters. The probes exist at every function boundary, every syscall, every I/O operation. You simply ask.
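As an illustration of asking a question nobody instrumented for, here is a sketch that answers "how long do read calls take, per programme?" with a latency distribution. The thread-local variable self->start and the file name read_latency.d are names chosen for this example.

```shell
# Write the D script to disk for review before running it.
cat > read_latency.d <<'EOF'
#!/usr/sbin/dtrace -s
#pragma D option quiet

/* Remember when this thread entered read(2). */
syscall::read:entry
{
        self->start = timestamp;
}

/* On return, record the elapsed nanoseconds in a power-of-two histogram. */
syscall::read:return
/self->start/
{
        @latency[execname] = quantize(timestamp - self->start);
        self->start = 0;        /* release the thread-local variable */
}
EOF

# Run as root, then press Ctrl-C to print one histogram per programme:
#   dtrace -s read_latency.d
```

Nothing in the traced programmes was built with this question in mind. The probes were already there; the script merely chose which ones to switch on.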

This is what makes DTrace beautiful: it treats the running system as something that should be fully transparent to the operator. Not partially visible through pre-configured dashboards. Not indirectly observable through aggregated metrics. Directly, completely, safely observable. In production. Under load. Right now.

The Point

Twenty-three years after its first integration, DTrace remains the standard against which production tracing tools are measured. Its core design has not changed because it did not need to. Zero overhead when disabled. Safe by construction. Concise answers to arbitrary questions.

Bryan Cantrill is now CEO of Oxide Computer Company, building rack-scale computers with the same philosophy: the system should be fully observable, fully debuggable, and fully understood. The principle survived the company that created it. One does find that rather beautiful.

DTrace. Ask the system. It will answer.
