The Invoice: Episode 15
"Query exactly what you need! One endpoint! No over-fetching!"
Splendid. Let us examine what you are actually paying for.
In 2012, Facebook had a rather specific problem. Their iOS News Feed consumed data from hundreds of internal microservices over mobile bandwidth that was, at the time, neither fast nor cheap. The existing REST endpoints returned too much data, too many of them needed to be called in sequence, and the mobile team was spending more time choreographing API calls than building features. So Lee Byron, Dan Schafer, and Nick Schrock built GraphQL to solve it.
The solution was brilliant. For Facebook. You have 12 REST endpoints and a fetch() call. But do carry on.
The N+1 Invoice
Consider a modest query: fetch 25 users with their posts. In REST, this is two calls. GET /users returns the list. GET /users/:id/posts returns the posts. Two requests, two responses, two cache entries. Straightforward.
In GraphQL, you write one query. Elegant. The resolver for users fires once and returns 25 rows. Then the resolver for posts fires once per user. That is 26 database queries for one API call. This is the N+1 problem, and it is not a bug. It is how resolvers work by design.
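The arithmetic is easy to verify. A minimal sketch, with an in-memory stand-in for the database that counts its own queries; all names here (db, allUsers, postsByUser) are illustrative, not from any real schema:

```javascript
// Hypothetical in-memory "database" that counts how often it is queried.
let queryCount = 0;
const db = {
  allUsers() {
    queryCount += 1; // one query for the user list
    return Array.from({ length: 25 }, (_, i) => ({ id: i + 1 }));
  },
  postsByUser(userId) {
    queryCount += 1; // one query per user
    return [{ userId, title: `post for user ${userId}` }];
  },
};

// Resolver-style execution: the users resolver fires once,
// then the posts resolver fires once for every returned row.
const users = db.allUsers();
for (const user of users) {
  user.posts = db.postsByUser(user.id);
}

console.log(queryCount); // 1 + 25 = 26
```

One query for the list, twenty-five for the children. The shape of the query hides the shape of the work.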
The fix exists. It is called DataLoader, a batching utility that collects individual resolver calls and combines them into bulk queries. With DataLoader, those 26 queries become 2. Splendid. But DataLoader is not built into GraphQL. It is not part of the specification. It is not enabled by default. It is a separate library that you must install, configure, and integrate into every resolver that touches a database. It is, in the most generous interpretation, homework.
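The idea behind DataLoader can be sketched in a few lines. This is not the real library's API: DataLoader schedules its flush automatically on the event loop, while this sketch flushes explicitly to stay synchronous, and every name here is illustrative:

```javascript
// Minimal batching-loader sketch: individual load(key) calls are queued,
// then one bulk query answers all of them at once.
function makeBatchLoader(batchFn) {
  let pending = [];
  return {
    load(key) {
      const handle = { key, value: undefined };
      pending.push(handle);
      return handle;
    },
    flush() {
      const keys = pending.map((h) => h.key);
      const results = batchFn(keys); // ONE bulk query for every queued key
      pending.forEach((h, i) => { h.value = results[i]; });
      pending = [];
    },
  };
}

let bulkQueries = 0;
const postsByUser = makeBatchLoader((userIds) => {
  bulkQueries += 1; // stands in for one `WHERE user_id IN (...)` query
  return userIds.map((id) => [{ userId: id }]);
});

// 25 per-user loads collapse into a single bulk query.
const handles = Array.from({ length: 25 }, (_, i) => postsByUser.load(i + 1));
postsByUser.flush();
console.log(bulkQueries); // 1
```

Twenty-five individual lookups, one database round trip. That is the entire trick, and it is the trick you must bolt on yourself.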
Without it, a relation-heavy GraphQL endpoint performs measurably worse than the REST equivalent it was meant to replace: in benchmarks of standard CRUD operations, REST has shown roughly half the latency and around 70 per cent more requests per second. The query language that promised efficiency costs you efficiency.
The Caching Invoice
REST uses HTTP caching. ETags, Cache-Control, CDN layers. GET requests to unique URLs, cacheable by design. The browser caches them. The CDN caches them. The reverse proxy caches them. Three layers of caching infrastructure, built into the web itself, requiring precisely zero configuration from you.
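The conditional-request cycle that makes this work is worth seeing once. A minimal sketch with a simulated origin server and client cache, no real network involved; the URL and payload are invented for illustration:

```javascript
// Simulated ETag revalidation: the second GET for the same URL costs the
// server a comparison, not a response body.
let notModifiedResponses = 0;
const resource = { body: '{"users":25}', etag: '"v1"' };

function originServer(headers) {
  if (headers["If-None-Match"] === resource.etag) {
    notModifiedResponses += 1;
    return { status: 304, body: null, etag: resource.etag }; // nothing re-sent
  }
  return { status: 200, body: resource.body, etag: resource.etag };
}

const cache = new Map(); // keyed by URL — exactly the key POST /graphql destroys

function cachedGet(url) {
  const entry = cache.get(url);
  const headers = entry ? { "If-None-Match": entry.etag } : {};
  const res = originServer(headers);
  if (res.status === 304) return entry.body;       // served from the cache
  cache.set(url, { body: res.body, etag: res.etag }); // store the fresh copy
  return res.body;
}

const first = cachedGet("/users");  // 200: full body, cached
const second = cachedGet("/users"); // 304: body comes from the cache
```

Note what the cache is keyed on: the URL. Every layer of HTTP caching depends on that key existing.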
GraphQL uses POST to a single endpoint. Every query, every mutation, one URL: /graphql. HTTP caching does not work because every request is a POST to the same address with a different body. The browser cannot cache it. The CDN cannot cache it. The reverse proxy cannot cache it. HTTP, the most battle-tested caching infrastructure in the history of computing, has been demoted to a dumb tunnel.
The replacement is Apollo's normalised cache, persisted queries, or custom caching layers that you build and maintain yourself. 56 per cent of teams report caching challenges with GraphQL. One rather suspects the other 44 per cent have not noticed yet.
And Apollo Client, the library that provides this custom caching, weighs 43 KB gzipped. fetch() ships with every browser at 0 KB. You are paying 43 kilobytes to restore functionality that HTTP provided for free before you broke it.
The Security Invoice
A 128-byte nested query can consume 10 seconds of CPU time. No authentication required. The query is syntactically valid. The schema permits it. The server dutifully executes it.
{
  users {
    posts {
      comments {
        author {
          posts {
            comments {
              author { name }
            }
          }
        }
      }
    }
  }
}
This is a valid GraphQL query. It is also a denial-of-service attack. The recursive relationship between users, posts, comments, and authors creates exponential depth that REST never exposes, because REST endpoints are flat by design. You cannot accidentally nest GET /users six levels deep.
80 per cent of GraphQL APIs are vulnerable to denial-of-service through query depth. Most frameworks ship with no default depth limit. You must build depth limiting, cost analysis, and complexity-based rate limiting. Traditional rate limiting by endpoint does not work when every request hits the same URL. The OWASP GraphQL Cheat Sheet reads like a confession of design decisions that were never made.
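Depth limiting itself is not complicated, which makes its absence from the defaults all the more striking. A rough sketch: production implementations (the graphql-depth-limit package, for instance) walk the parsed AST, but counting brace nesting in the raw query text approximates the same idea for illustration:

```javascript
// Approximate a query's selection depth by tracking brace nesting.
function queryDepth(query) {
  let depth = 0;
  let maxDepth = 0;
  for (const ch of query) {
    if (ch === "{") {
      depth += 1;
      if (depth > maxDepth) maxDepth = depth;
    } else if (ch === "}") {
      depth -= 1;
    }
  }
  return maxDepth;
}

// Reject anything deeper than the limit before it reaches a resolver.
function assertDepthLimit(query, limit) {
  const depth = queryDepth(query);
  if (depth > limit) {
    throw new Error(`query depth ${depth} exceeds limit ${limit}`);
  }
}

const shallow = "{ users { name } }";
const bomb =
  "{ users { posts { comments { author { posts { comments { author { name } } } } } } } }";

console.log(queryDepth(shallow), queryDepth(bomb)); // 2 8
```

A dozen lines of gatekeeping. The point is not that it is hard to write; the point is that you must know to write it.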
The Monitoring Invoice
GraphQL returns HTTP 200. Always. Even when your application is on fire.
Errors live in a JSON array inside the response body. Your monitoring dashboard shows 100 per cent success rate. Your users see failures. Your on-call engineer sleeps through the night. Your customers do not. The HTTP status code, that universal language of success and failure that every tool in the ecosystem understands, has been reduced to a decorative constant.
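Since the transport layer reports nothing, the client must inspect every payload itself. A minimal sketch, assuming the standard GraphQL response shape ({ data, errors }); the response objects here are simulated, and the error message is invented:

```javascript
// Surface GraphQL errors as real failures, since the status code will not.
function unwrapGraphQL(response) {
  const { data, errors } = response.body;
  if (Array.isArray(errors) && errors.length > 0) {
    throw new Error(errors.map((e) => e.message).join("; "));
  }
  return data;
}

const healthy = {
  status: 200,
  body: { data: { user: { name: "Ada" } } },
};

const broken = {
  status: 200, // looks perfectly healthy to any HTTP-level monitor
  body: {
    data: null,
    errors: [{ message: 'Cannot query field "nmae" on type "User"' }],
  },
};
```

Both responses are 200. Only one of them worked. Every GraphQL client carries some version of this check, because the status code no longer does.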
The Alternative
REST with OpenAPI 3.0: self-documenting, typed client generation, HTTP caching built in. fetch() ships with every browser at 0 KB. The specification is stable. The tooling is mature. The caching works. The status codes mean what they say.

83 per cent of web services use REST. Not because they have not heard of GraphQL, but because they evaluated the trade-off and chose accordingly.
GraphQL solves a real problem: aggregating hundreds of services behind a single query interface. If you have that problem, use it. If you have hundreds of microservices, a mobile client on constrained bandwidth, and a frontend team that needs to iterate independently of the backend, GraphQL is genuinely excellent. Facebook built it for exactly this scenario, and it works beautifully there.
If you do not have that problem, you are paying the invoice for someone else's architecture.
The Pattern
Facebook built GraphQL for a mobile feed consuming hundreds of services over constrained bandwidth. In 2015, they open-sourced it. The industry adopted it without the problem. A dashboard with 12 endpoints gained a query language, a schema layer, a type system, a normalised cache, a depth limiter, a cost analyser, and 43 KB of client library. The original fetch() call is still there, underneath it all, wondering what it did wrong.
The query language is excellent. The question is whether you have queries worth asking.