26 April 2026

Why We Measure Tickets, Not Problems Prevented

architecture tooling

On Second Thought ■ Episode 05

The dashboard is green. Velocity is up. Burndown is on track. The demo on Friday will be smooth. Production has been quietly fragile for eleven weeks, and nobody notices, because fragility does not have a column.

This is the post about that column.

The Axiom

Productivity, for the purposes of any reporting line above the work itself, is what one can count. Tickets closed, story points completed, lines shipped, deploys per week, sprints reported as "successful", incident-mean-time-to-resolution charted in a quarterly review.

The work that does not produce a number does not exist. The thinking that prevented the incident in the first place does not appear. The decision not to deploy on Friday afternoon does not log. The conversation in which a senior engineer talked the team out of a doomed approach is not in any system of record.

This is not because the people running the dashboards are foolish. It is because the dashboard is the only thing they were given to look at, and over time, the thing one looks at becomes the thing one believes is real.

The Origin

Frederick Winslow Taylor published The Principles of Scientific Management in 1911 with rather industrious enthusiasm. The stopwatch, the time study, the one-best-way. Taylor's intent was to bring rigour to factory work; his unintended legacy was to make the act of being measured the new floor of working life.

Workers struck. The moulders at the Watertown Arsenal foundry walked out in the summer of 1911 over the introduction of the stopwatch. The US Congress investigated and, within a few years, banned time studies and the pay premiums tied to them on US Government work. Taylor's specific instrument was, briefly, defeated.

The instinct survived under a procession of new names: scientific management, then efficiency, then management-by-objectives, then Six Sigma, then Lean, then agile, now velocity. The vocabulary moves on every fifteen years; the underlying premise (that what cannot be counted does not count) moves not at all. One does notice the pattern.

In 1975, the British economist Charles Goodhart wrote, in a footnote to a paper on UK monetary policy: "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." Twenty-two years later, the anthropologist Marilyn Strathern, observing British university assessment regimes, condensed it into the version everyone now quotes: when a measure becomes a target, it ceases to be a good measure.

We were warned by name, twice in one century, by people whose entire professional lives had been spent watching the phenomenon. The industry that calls itself data-driven did not, in this case, read the data.

In 2019, Ron Jeffries, one of the original signatories of the Agile Manifesto and the man widely credited with promoting the story point, published a public reconsideration:

"I may have made the name-changing suggestion. If I did, I'm sorry now."

He went on to recommend abandoning story-point estimation entirely. The industry, having found story points terribly useful for promotion decisions, performance reviews, and quarterly board reports, kept them. The dashboard, rather firmly, demands them.

The Cost

Consider two engineers in the same team for the same quarter.

The first engineer prevents three outages. She does this by refusing to deploy on Friday afternoon when the staging environment is showing intermittent failures; by patiently explaining to a junior why the proposed cache invalidation strategy will produce a thundering herd; by spotting, in a routine code review, the off-by-one in the rate limiter that would have melted production under the next traffic spike. None of this work produces a ticket. None of it closes a backlog item. None of it is visible to her manager's manager.

The second engineer cheerfully closes forty-seven tickets that quarter. He is praised in the sprint review. He ships the architecture that produces the outages the first engineer prevented. The outages are then opened as new tickets, which the team will close in subsequent sprints, generating velocity and a sense of forward motion.

The first engineer is invisible to every metric in the building. The second is promoted.

This is not a hypothetical. It is the architecture of the modern software organisation, applied with the consistency of a religious practice. The dashboard goes up. The system goes down. The dashboard goes up again, because the new outages are recorded as feature requests in next quarter's backlog, and closing them counts as work.
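
The loop is simple enough to put in a few lines. What follows is a minimal sketch, not a model of any real team: the names Engineer, dashboard_score and team_velocity are invented for illustration, and FOLLOW_UP_TICKETS_PER_OUTAGE is a made-up assumption about how many tickets one outage spawns. Only the forty-seven tickets, the three outages, and the first engineer's empty ticket column come from the example above.

```python
from dataclasses import dataclass

# Assumption for illustration only: how many tickets one outage generates.
FOLLOW_UP_TICKETS_PER_OUTAGE = 5


@dataclass(frozen=True)
class Engineer:
    name: str
    tickets_closed: int     # the only field the dashboard reads
    outages_prevented: int  # real work, recorded in no system


def dashboard_score(engineer: Engineer) -> int:
    """What the reporting line sees: tickets closed and nothing else."""
    return engineer.tickets_closed


def team_velocity(team: list[Engineer], outages_that_happened: int) -> int:
    """Tickets closed, including the ones opened to repair outages."""
    feature_tickets = sum(e.tickets_closed for e in team)
    repair_tickets = outages_that_happened * FOLLOW_UP_TICKETS_PER_OUTAGE
    return feature_tickets + repair_tickets


first = Engineer("first", tickets_closed=0, outages_prevented=3)
second = Engineer("second", tickets_closed=47, outages_prevented=0)
team = [first, second]

# Ranking by the dashboard puts the second engineer on top.
ranking = sorted(team, key=dashboard_score, reverse=True)

# The same team twice: once with the outages prevented, once without.
velocity_with_prevention = team_velocity(team, outages_that_happened=0)
velocity_without_prevention = team_velocity(
    team, outages_that_happened=first.outages_prevented
)

print([e.name for e in ranking])
print(velocity_with_prevention, velocity_without_prevention)
```

Under these assumptions the quarter in which the three outages happen reports a velocity of 62, against 47 for the quarter in which they were prevented. The exact multiplier does not matter; any positive number of repair tickets per outage makes the worse quarter look like the better one.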

The cost is not only the outages. The cost is that the first engineer, finding her judgement systematically unrewarded, eventually leaves. The second engineer, finding his ticket-throughput systematically rewarded, eventually becomes director of engineering. The system optimises the people the way one would optimise a queue, and the queue knows nothing about the building it is keeping standing.

The Made-in-Germany Inversion

A short historical detour, because the contrast is precise and the contrast is the point.

"Made in Germany" was an insult before it was a compliment. The British Merchandise Marks Act of 1887 was passed by Parliament after British manufacturers (Sheffield cutlery in particular) complained that German imitations were entering the country with British-style markings. The Act required all foreign goods to be plainly marked with their country of origin, the practical purpose being to allow British consumers to recognise and refuse the inferior German imports.

The plan worked exactly as designed for about a decade. Then it inverted.

Within thirty years, "Made in Germany" had become a guarantee of value. German firms had used the label not to perform productivity but to ship goods that did not need replacing. Solingen blades that lasted decades. Carl Zeiss optics that astronomers in Britain quietly preferred to anything domestic. Steinway pianos from the Hamburg works. Leica cameras. The label became a guarantee because the work was a guarantee.

The inversion was not a marketing campaign. It was a consequence of a culture that measured the chair, not the hours.

We have, rather industriously, built the inverse industry. The label is impeccable. The dashboards are green. The retros are constructive. The substance, increasingly, is not. A modern enterprise software product carries a thousand certifications, three SOC 2 audits, an ISO 27001 stamp, an SBOM, an OpenSSF Scorecard, and still falls over when a region drains. The label does the work the work used to do.

This is not specifically a German point. It is a craftsmanship point that happens to have a useful German example. The craftsmen of Sheffield, who triggered the 1887 Act, would have understood it perfectly well had the inversion gone the other way.

The Question

If a craftsman built a chair to last fifty years, the metric was the chair, not the hours. If a Bell Labs engineer in 1955 designed a Number 5 Crossbar switch that ran in the field for thirty years, the metric was the switch, not the patches. If the engineers at Volvo who designed the three-point seatbelt in 1959 had been measured on patents-filed-per-quarter, they would not have given it away to every other manufacturer in the world, and the next sixty years of road safety would have run rather differently.

What would happen to a software organisation that measured problems prevented rather than tickets closed? That measured quality held rather than features shipped? That measured judgement applied rather than activity logged?

The honest answer is that nobody knows, because nobody has been allowed to try long enough for the chair to last fifty years. Software organisations rarely look further ahead than the next two or three quarterly reviews. The promotion cycle is faster than the chair.

There is a quieter question underneath, which is the one this episode is really about. It is not "what metric should we use instead?" It is: why have we agreed, as an industry of nominally clever people, to organise our working lives around an instrument we know to be wrong, in a way that has been documented for fifty years and apologised for by its inventors?

One does suspect the answer is uncomfortable. It is much easier to ship a number than to ship a thing that lasts. It is much easier to manage a number than to manage a person who is doing something difficult. It is much easier to write a quarterly review of a number than of a judgement.

The dashboard is green. The chair, somewhere, is not being built.

Taylor 1911: stopwatch and time study. Workers struck; Congress banned it on US Government work. Goodhart 1975 and Strathern 1997: when a measure becomes a target, it ceases to be a good measure. Ron Jeffries 2019, on the story point he popularised: "I'm sorry now." The industry kept it. The dashboard, rather firmly, demands it.

The hundred-and-ten-year arc:

1911: Frederick Winslow Taylor publishes The Principles of Scientific Management. Watertown Arsenal foundry strikes; US Congress bans time studies on government work
1975: Charles Goodhart, footnote in a UK monetary policy paper: any statistical regularity collapses once pressure is placed on it for control purposes
1997: Marilyn Strathern's restatement (the version everyone quotes): when a measure becomes a target, it ceases to be a good measure
2019: Ron Jeffries publishes Story Points Revisited apologising for the term he popularised; recommends abandoning it. Industry keeps it
The Made-in-Germany inversion: 1887 Merchandise Marks Act stamped the label on as a stigma; within ~30 years it inverted into a guarantee. Solingen, Carl Zeiss, Steinway, Leica. Measured the chair, not the hours
The cost: the engineer who prevents three outages closes zero tickets and is invisible. The engineer who closes forty-seven and ships the architecture causing the outages is promoted
The question: not "what metric instead?", but why we organise around an instrument documented as wrong for fifty years and apologised for by its inventors