14 January 2026 Read on LinkedIn

Agile Estimation Theatre

agile architecture methodology

The Copy Cat ■ Episode 03

Late 1990s. Ron Jeffries and Kent Beck are estimating user stories in Extreme Programming. They use "Ideal Days": how long would this take if nobody interrupted, no meetings happened, and the laws of thermodynamics briefly paused?

Managers kept asking why an "ideal day" took three real days. Rather persistently.

The solution was elegant in the way that avoidance often is: stop calling them days. Call them "points." Nobody asks awkward questions about a "point." What is a point? It is a point. How long is a point? It is not time. It is relative size. Points compared to other points. A 5-point story is roughly twice a 3-point story. No hours. No calendars. Just sorting.

Ron Jeffries, 2019: "I may have invented story points, and if I did, I'm sorry now."

The Original

Story points, in their original form, were relative sizes. Not measurements. Not commitments. Certainly not deadlines. The idea was disarmingly simple: compare stories to each other. "This one feels roughly twice as big as that one." The precision was deliberate imprecision. "About the same size" was sufficient. The unit was meaningless by design, because any unit with meaning would be converted into time, and time would be converted into a commitment, and a commitment would be converted into a deadline, and a deadline would be missed, and a meeting would be called.

Sorting. Not measuring. Quite sensible, really.

The Copy

By the mid-2000s, the industry had discovered velocity: sum the points completed per sprint. Plot a graph. Show it to management. The unit that was not time had become time. Marvellous.

The context that got lost:

Original: points compared stories within a single team. Copy: points compare teams across the organisation.

Original: velocity was a private planning tool for the team. Copy: velocity is a KPI reported to stakeholders quarterly.

Original: "about the same size" was sufficient precision. Copy: 45-minute debates whether something is a 3 or a 5. A terribly productive use of six engineers.

The Inflation

A team's velocity rises from 30 to 50 over six months. Management celebrates. The team has changed nothing except how they estimate. What was a 3 in January is a 5 in July. Same task. Same effort. Higher number. Everybody claps.

The system sustains itself because honesty would be more expensive than the fiction. Management gets rising graphs. The team gets left alone. The Scrum Master gets a purpose. Nobody will peck out the other's eye when all benefit from keeping it shut.

The Five-Year Plan

There is a historical precedent. The German Democratic Republic had five-year plans. Every factory exceeded its targets. Every quarter, the numbers improved. The economy collapsed anyway. The numbers were not measurements. They were performances.

Story points are the five-year plan of software development. Everyone knows the numbers are performative. Everyone reports them. Nobody asks whether the software shipped faster. The ritual sustains itself because the alternative, admitting that estimation at this granularity is guesswork, would render a great deal of expensive ceremony unnecessary. And ceremony employs people.

The Alternative That Already Works

Ron Jeffries now suggests simply counting stories. No points. No poker. No theatre. How many items did the team deliver last sprint? Forecast from the historical rate. This is the #NoEstimates position (Woody Zuill, Vasco Duarte, 2012): throughput-based forecasting. Count items delivered. Plot the rate. Project forward. Equivalent predictability. A fraction of the ceremony.

The objection, inevitably, is that stories vary in size. They do. But over a sufficiently large sample, the variance averages out. The same statistical principle that makes velocity "work" makes throughput work, without the 45-minute debates and without the inflation incentive.

Rather less entertaining at standup. But rather more honest.

A unit invented to avoid gaming became the most gamed metric in software. The inventor apologised. The industry carried on. One almost admires the irony.

The timeline. Late 1990s: "Ideal Days" in Extreme Programming (Beck, Jeffries). Renamed to "Points" to stop managers converting them to calendar days. Mid-2000s: velocity tracking becomes standard. ~2010: velocity becomes a management KPI. 2019: Jeffries publicly regrets inventing story points. Planning Poker consumes 1 to 4 hours per sprint, up to 10% of development time spent debating whether something is a 3 or a 5. The pitfall is not the tool. It is the quiet, collective agreement to pretend the numbers mean something they do not. Goodhart's Law, dressed in a Fibonacci sequence.