Vivian Voss

Service Mesh — The Sidecar Tax

kubernetes architecture performance

The Invoice ■ Episode 19

"mTLS, observability, traffic management, zero-code retries. You need a service mesh."

Splendid. Let us examine what one is actually paying for.

A service mesh moves cross-cutting concerns (mTLS, retries, timeouts, traffic shifting, observability) out of application code and into a proxy that sits beside each pod. Istio, the archetype, launched in 2017 as a joint project of Google, IBM, and Lyft. It graduated within the CNCF in July 2023. In the 2024 CNCF Annual Survey, service-mesh adoption across respondents fell to 42 per cent, down from 50 per cent the year before. That is not a catastrophe. It is, however, the first full-year decline the category has ever posted. The industry is quietly reconsidering the deal.

The Complexity Invoice

Istio ships over a dozen primary custom resource definitions across three categories (traffic management, security, telemetry) and dozens more through its operator, telemetry plugins, Wasm extensions, and Gateway APIs. A minimally useful installation comprises:

  • A control plane (istiod) responsible for configuration distribution, certificate issuance, and xDS API serving to every sidecar
  • A per-pod sidecar (Envoy) injected into every workload, running a second container alongside the application
  • An ingress gateway at the cluster edge, usually another Envoy in a standalone pod
  • mTLS certificates rotated by istiod, distributed via SDS to each sidecar
  • Policy resources: PeerAuthentication, RequestAuthentication, AuthorizationPolicy
  • Telemetry bindings to send traces and metrics to external collectors
  • A platform team that knows what each of those does, how they interact, and how to debug any given failure mode

The CNCF's own reports describe Istio as mature, powerful, and "operationally demanding". The last of those descriptors is the one to watch. Installing Istio in a fresh cluster takes a senior SRE about two days. Operating it for six months takes roughly 0.5 to 1.0 FTE, scaling upwards with cluster size. Debugging it at three in the morning is a skill one acquires by losing two nights of sleep and one customer.

The Latency Invoice

Every inter-service HTTP or gRPC call now traverses two Envoy proxies: the caller's sidecar, then the callee's sidecar. Adding two proxies to every request path means adding latency. How much is now, happily for the debate, well-measured.

The request path doubles. Without a mesh: Service A calls Service B directly, through zero proxies. With a sidecar mesh: Service A → Envoy sidecar → Envoy sidecar → Service B. Every internal call now traverses two proxies instead of none. Debugging doubles with it.

A 2025 peer-reviewed performance comparison from the DeepNess Lab (Performance Comparison of Service Mesh Frameworks: the mTLS Test Case) measured the overhead with mTLS enforced on otherwise identical workloads. The table below is, one regrets to say, unambiguous.

mTLS Latency Overhead vs Baseline — DeepNess Lab, 2025

  • Istio sidecar: +166%
  • Cilium: +99%
  • Linkerd: +33%
  • Istio ambient: +8%

The headline number (plus 166 per cent for Istio sidecar with mTLS) is surprising only to people who have never read the benchmark. Envoy is fast; two Envoys in the path plus TLS handshakes and certificate validation are not free. Linkerd's Rust-based linkerd2-proxy is measurably lighter because it was built for the job, not adapted to it. Ambient mode, introduced in Istio 1.23 in August 2024, replaces per-pod sidecars with a shared node-level ztunnel and produces dramatically less overhead. Ambient is, in polite summary, Istio's own public admission that the sidecar model had a problem it could not solve by optimisation alone.
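To put the percentages in concrete terms, here is a back-of-envelope sketch. The 5 ms baseline per hop and the three-hop request path are illustrative assumptions; only the overhead percentages come from the benchmark quoted above.

```python
# What the measured mTLS latency overheads mean for a user-facing request
# that fans out across several internal hops. The baseline hop latency and
# hop count are assumptions for illustration; the overhead percentages are
# the DeepNess Lab 2025 figures.

BASELINE_HOP_MS = 5.0   # assumed baseline latency per internal call
HOPS = 3                # assumed internal hops per user-facing request

overheads = {           # measured mTLS latency overhead vs baseline
    "istio sidecar": 1.66,
    "cilium": 0.99,
    "linkerd": 0.33,
    "istio ambient": 0.08,
}

for mesh, overhead in overheads.items():
    per_hop = BASELINE_HOP_MS * (1 + overhead)
    total = per_hop * HOPS
    print(f"{mesh:14s} {per_hop:5.2f} ms/hop, {total:6.2f} ms across {HOPS} hops")
```

Under those assumptions the sidecar path turns a 15 ms request into one of nearly 40 ms; ambient keeps it close to baseline.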

A sidecar also costs memory. The Istio 1.24 performance documentation reports approximately 60 MB of RAM and 0.20 vCPU per Envoy sidecar at 1,000 HTTP RPS with 1 KB payloads. A cluster with 1,000 pods is therefore paying roughly 60 GB of RAM and 200 vCPU for the mesh before a single byte of application code has executed. Ambient ztunnels are smaller (approximately 12 MB RAM, 0.06 vCPU each) but one now also pays for waypoint proxies where L7 features are enabled. Either way, the total is non-zero. "Free" is a marketing word.

The Mesh Bill Before Your Code Runs

  • Istio sidecar: 60 MB RAM and 0.20 vCPU per pod, at 1,000 HTTP RPS
  • Ambient ztunnel: 12 MB RAM and 0.06 vCPU per node, plus waypoint proxies for L7
  • 1,000-pod cluster: 60 GB RAM and 200 vCPU for the mesh, before application code

Source: Istio 1.24 performance & scalability documentation.
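The arithmetic is simple enough to sketch. The per-proxy figures are the ones Istio's 1.24 performance documentation reports; the node count is an illustrative assumption.

```python
# Cluster-wide mesh bill, using the per-proxy figures from Istio's 1.24
# performance documentation. The pod and node counts are assumptions.

SIDECAR_RAM_MB, SIDECAR_VCPU = 60, 0.20   # per pod, at 1,000 HTTP RPS
ZTUNNEL_RAM_MB, ZTUNNEL_VCPU = 12, 0.06   # per node (ambient mode)

pods, nodes = 1_000, 50                   # illustrative cluster size

sidecar_ram_gb = pods * SIDECAR_RAM_MB / 1_000
sidecar_vcpu = pods * SIDECAR_VCPU
ambient_ram_gb = nodes * ZTUNNEL_RAM_MB / 1_000
ambient_vcpu = nodes * ZTUNNEL_VCPU

print(f"sidecar mesh: {sidecar_ram_gb:.1f} GB RAM, {sidecar_vcpu:.0f} vCPU")
print(f"ambient mesh: {ambient_ram_gb:.2f} GB RAM, {ambient_vcpu:.1f} vCPU (plus waypoints for L7)")
```

Sixty gigabytes versus six hundred megabytes, on the same thousand pods. The waypoint proxies ambient adds for L7 features narrow that gap, but only where one opts in.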

The Debugging Invoice

When the mesh works, it is invisible. When it does not, the request path has doubled and so has the attack surface for bugs. A 500 that arrives at the client might originate in:

  • The application code itself
  • The caller's Envoy (wrong upstream cluster, circuit breaker tripped)
  • The destination's Envoy (connection limits, bad cert rotation)
  • A mis-parsed VirtualService or DestinationRule
  • The mTLS trust chain (expired intermediate, wrong trust domain)
  • istiod failing to push updated configuration within the retry window
  • A Wasm plugin throwing an exception
  • A Kubernetes NetworkPolicy quietly dropping the packet

The distributed tracing one installed to observe one's services is now required to debug the mesh itself. Troubleshooting skills become mesh-specific skills, which means they do not transfer and do not scale with engineer headcount in the obvious way.

The Honest Case For

In the interests of not selling a one-sided story: service meshes solve a real problem for a real set of operators. If one:

  • Runs more than roughly 100 microservices with cross-team ownership
  • Has strict compliance that mandates mTLS between every internal service
  • Operates across multiple clusters or multiple clouds with incompatible primitives
  • Needs uniform observability across polyglot services that cannot all ship an OpenTelemetry library

then the tax starts to pay for itself. Everyone else, which is most readers, is paying for Google-scale architecture to solve problems a single load balancer and a sensible VPC already solved.

The Alternative

Direct HTTP or gRPC calls between services, over a network one already trusts. This is how the internet worked for three decades before sidecars existed. It was, one should note, a perfectly functional three decades.

mTLS terminated at a single ingress gateway (HAProxy, nginx, Envoy on its own, or whatever load balancer is already in the stack), because the VPC was a trust boundary before sidecars were a marketing category. Internal traffic over plaintext inside the VPC is fine for the vast majority of workloads, and mTLS between services is a compliance requirement for a minority of them, not an architectural necessity for all of them.

Tracing and metrics via an OpenTelemetry library linked into each service. OTel is language-agnostic, vendor-neutral, and roughly five lines of initialisation in most runtimes. It sends traces and metrics via OTLP to any collector. No proxy required.

Retries and timeouts in the client library. Go's http.Client, Rust's reqwest, Java's RestTemplate or OkHttp, Python's httpx, Node's undici: all of them ship configurable retries, timeouts, connection pools, and circuit breakers. The retry logic that a service mesh claims to provide "without code changes" is three lines of configuration in any mature client, and has been so since approximately 1995.
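For readers who want that claim made concrete, here is the logic sketched against the standard library alone. Real clients such as httpx or OkHttp expose the same behaviour as configuration; the function name and defaults below are invented for illustration.

```python
import time
import urllib.error
import urllib.request

def get_with_retries(url, *, timeout=2.0, retries=3, backoff=0.2):
    """The retry logic a mesh sells as 'zero code changes', sketched with
    the stdlib. Mature HTTP clients ship this as configuration."""
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except (urllib.error.URLError, TimeoutError):
            if attempt == retries:
                raise                          # budget exhausted: surface the error
            time.sleep(backoff * 2 ** attempt)  # exponential backoff between tries
```

Timeout, retry budget, and backoff are the three knobs that matter; a proxy offers the same three, at the price of a proxy.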

Authorisation at the application layer, because only the application knows what "this user may read this document" means. Delegating authorisation to a proxy is delegating it to a component that does not, on any reasonable reading, understand the data.
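A minimal sketch of the point, with an invented document model: the decision depends on data that lives in the application, not in any proxy on the path.

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    owner: str
    shared_with: set = field(default_factory=set)
    public: bool = False

def can_read(user: str, doc: Document) -> bool:
    # "This user may read this document" depends on ownership, sharing,
    # and visibility -- application data no sidecar can see.
    return doc.public or user == doc.owner or user in doc.shared_with

report = Document(owner="alice", shared_with={"bob"})
assert can_read("alice", report)
assert can_read("bob", report)
assert not can_read("mallory", report)
```

A proxy can verify that the request carries a valid identity. Whether that identity may read this particular document is a question only the code holding the data can answer.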

The Pattern

Service mesh is sold as "zero code changes". One gets that by paying:

  1. Two proxies of latency on every internal call, measurably more under mTLS
  2. A platform team of overhead to run istiod, gateways, policies, and upgrades
  3. A debugger's worth of new moving parts: VirtualService, DestinationRule, PeerAuthentication, Envoy configuration, trust chains, Wasm plugins

All to avoid writing retry logic that any mature HTTP client already provides in three lines of configuration.

The mesh was always, architecturally, a political solution to a technical problem. It existed because microservice teams did not trust each other's code, and a proxy in the middle was a way of enforcing cross-cutting concerns without convincing any one team to adopt them. The proxy became the architecture. The architecture became the operational cost centre. The cost centre produced ambient mode, which is the industry's second try at making sidecars not cost what sidecars cost.

Meanwhile, the original alternative (a library in each service, a trusted network below, and a single ingress gateway at the edge) has remained exactly what it has been since approximately 1995.

The direct call was always there. One simply decided it wasn't enterprise enough.

Istio graduated within the CNCF in July 2023. CNCF 2024 Annual Survey: mesh adoption 42 per cent, down from 50 per cent. 2025 peer-reviewed benchmark: Istio sidecar +166% mTLS latency overhead, Cilium +99%, Linkerd +33%, Istio ambient +8%. 60 MB RAM per sidecar = 60 GB across 1,000 pods before code runs. Ambient is Istio's own admission that sidecars were a problem. Mature HTTP clients have shipped retry configuration since approximately 1995.