API Gateway

One front door. Many backends. Cross-cutting concerns handled once.

The hook

You've got 50 microservices. Each one needs auth. Each one needs rate limiting. Each one needs logging, request transformation, versioning, CORS.

You can re-implement that in 50 places — 50 chances to get it wrong, 50 places to update when the auth library has a CVE. Or you can put one box in front of them all that handles the cross-cutting concerns.

That box is an API gateway. It's the bouncer at the door. Auth, rate limit, routing — done once at the entrance, not 50 times inside.

The concept

An API gateway is a reverse proxy with cross-cutting features baked in. It sits at the edge between clients and your service fleet. Every request flows through it.

Standard responsibilities:

Authentication — validate JWT, API key, OAuth token before anything reaches your services
Authorization — check scopes and roles ("does this token have orders:write?")
Rate limiting — per-client and per-endpoint quotas, so one bad actor can't take you down
Routing — path-to-service mapping, blue/green deploys, canary traffic splitting
Transformation — REST to gRPC, version translation (v1 clients hitting v2 services)
Observability — centralized logging, metrics, distributed tracing headers
Caching — cache responses for hot read endpoints, take pressure off the backend

The pattern: anything that every service would otherwise have to do, the gateway does once.
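As a rough illustration of the first two responsibilities, here is a minimal sketch of a gateway-side token check using only the Python standard library. It assumes HS256-signed JWTs with a space-separated `scope` claim and an `exp` claim; the function names (`check_request`, `sign_token`) and the secret are hypothetical, and a real gateway would normally lean on a maintained JWT library rather than hand-rolled parsing.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_token(claims: dict, secret: bytes) -> str:
    # Demo helper only: mints an HS256 token so the sketch is self-contained.
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(sig)}"


def check_request(token: str, secret: bytes, required_scope: str,
                  now: Optional[float] = None) -> int:
    """Return the status the gateway would answer with: 200, 401, or 403."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(sig_b64)
    except (ValueError, TypeError):
        return 401  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return 401
    if header.get("alg") != "HS256":
        return 401  # only the one algorithm this sketch supports
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return 401  # authentication failed: bad signature
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return 401  # missing or expired
    if required_scope not in str(claims.get("scope", "")).split():
        return 403  # authenticated, but not authorized for this route
    return 200
```

The split between 401 and 403 mirrors the split between authentication and authorization in the list above: a token that fails validation is rejected outright, while a valid token without `orders:write` is recognized but refused.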

Diagram

flowchart LR
    C1[Web Client] --> GW[API Gateway]
    C2[Mobile Client] --> GW
    C3[Partner API] --> GW
    GW -->|auth + rate limit + route| S1[Auth Service]
    GW --> S2[Orders Service]
    GW --> S3[Payments Service]
    GW --> S4[Inventory Service]
    GW --> S5[Shipping Service]
    GW -.logs + metrics.-> O[Observability Stack]
    style GW fill:#2d3,stroke:#0a0,color:#000

The gateway is the single front door. Clients never know about internal service topology. Services can move, split, merge, get renamed — clients keep hitting the same gateway URL.
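One way to picture how the topology stays hidden is a routing table that maps public path prefixes to internal addresses. The sketch below uses longest-prefix matching; the paths and `*.internal` hostnames are invented for illustration, and real gateways usually express this in configuration rather than code.

```python
from typing import Dict, Optional

# Hypothetical routing table: public path prefix -> internal service address.
ROUTES = {
    "/api/orders": "http://orders.internal:8080",
    "/api/orders/returns": "http://returns.internal:8080",
    "/api/payments": "http://payments.internal:8080",
}


def resolve(path: str, routes: Dict[str, str]) -> Optional[str]:
    """Return the upstream for the longest matching prefix, or None (a 404)."""
    best = None
    for prefix in routes:
        # Match whole path segments so /api/orders does not capture /api/ordersX.
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None


# Moving the orders service only changes the table, never the public URL.
moved = dict(ROUTES)
moved["/api/orders"] = "http://orders-v2.internal:9000"
print(resolve("/api/orders/42", ROUTES))
print(resolve("/api/orders/42", moved))
```

In this sketch, splitting returns out of the orders service is just one more row in the table. Clients calling `/api/orders/returns/7` never see the change.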

Example — Netflix Zuul

Netflix runs Zuul as their API gateway. We already met it in Load Balancers — this is the canonical example. Zuul handles roughly 125 billion requests per day across 1,000+ microservices behind it.

Why Netflix built Zuul instead of using a generic LB

A plain L7 load balancer routes by URL. Netflix needed more:

Dynamic routing rules — push a config change and traffic reroutes in seconds, no redeploy
A/B test traffic splitting — send 5% of users to the new recommendations service
Request authentication — check device tokens before any internal service sees the request
Resilience patterns — retries, timeouts, circuit breakers when a backend gets sick

A generic LB doesn't do any of that. Zuul does all of it as filters that run on every request.
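The A/B split is the easiest of these to sketch. A common approach, shown here as an assumption rather than as Zuul's actual implementation, is to hash the user ID into a fixed number of buckets and send the lowest buckets to the new service. The function names and the `recs-v2` experiment label are hypothetical.

```python
import hashlib


def bucket_for(user_id: str, experiment: str) -> int:
    """Map a user to a stable bucket in [0, 10000) for one experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 10_000


def pick_upstream(user_id: str, experiment: str, canary_percent: float,
                  stable: str, canary: str) -> str:
    # Users whose bucket falls below the threshold go to the canary.
    threshold = round(canary_percent * 100)  # 5.0 percent -> 500 of 10000 buckets
    return canary if bucket_for(user_id, experiment) < threshold else stable


# The same user always lands on the same side of the split.
print(pick_upstream("user-123", "recs-v2", 5.0, "recs-stable", "recs-canary"))
```

Hashing instead of picking at random keeps each user on one side for the life of the experiment. Raising the percentage only adds users to the canary group, because the threshold moves up while each user's bucket stays put.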

What the gateway does on a typical request

Request hits Zuul from the public internet (after the edge LB)
Auth filter — validate the device/user token, reject if invalid
Rate limit filter — check this client's quota for this endpoint
Route lookup — URL path → service ID via the routing table (e.g., /api/playback/* → playback-service)
Attach observability headers — trace ID, request ID, so logs from every downstream service stitch together
Forward — proxy the request, with those headers, to a healthy instance of that service
Return response — possibly transformed (e.g., strip internal headers)
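The steps above can be sketched as a chain of small filters, where any filter may short-circuit with an early response. This is a toy model, not Zuul's real filter API: the `Request`/`Response` classes, the header names (`X-Client-Id`, `X-Request-Id`, `X-Internal-*`), the token table, and the quota are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Request:
    path: str
    headers: Dict[str, str] = field(default_factory=dict)
    upstream: Optional[str] = None  # filled in by the route filter


@dataclass
class Response:
    status: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


# Everything below is made-up demo data, not real Netflix configuration.
VALID_TOKENS = {"token-abc": "device-1"}
ROUTES = {"/api/playback/": "playback-service"}
QUOTA = 2  # allowed requests per (client, path) in this toy example
request_counts: Dict[Tuple[str, str], int] = {}


def auth_filter(req: Request) -> Optional[Response]:
    client = VALID_TOKENS.get(req.headers.get("Authorization", ""))
    if client is None:
        return Response(401)
    req.headers["X-Client-Id"] = client  # overwrite anything the caller sent
    return None


def rate_limit_filter(req: Request) -> Optional[Response]:
    key = (req.headers["X-Client-Id"], req.path)
    request_counts[key] = request_counts.get(key, 0) + 1
    return Response(429) if request_counts[key] > QUOTA else None


def route_filter(req: Request) -> Optional[Response]:
    for prefix, service in ROUTES.items():
        if req.path.startswith(prefix):
            req.upstream = service
            return None
    return Response(404)


def trace_filter(req: Request) -> Optional[Response]:
    # Set before forwarding so every downstream log line can carry the same ID.
    req.headers.setdefault("X-Request-Id", uuid.uuid4().hex)
    return None


INBOUND = [auth_filter, rate_limit_filter, route_filter, trace_filter]


def handle(req: Request, forward: Callable[[Request], Response]) -> Response:
    for inbound_filter in INBOUND:
        early = inbound_filter(req)
        if early is not None:
            return early  # rejected at the door; the backend never sees it
    resp = forward(req)
    # Outbound step: strip headers that should not leak to clients.
    resp.headers = {k: v for k, v in resp.headers.items()
                    if not k.startswith("X-Internal-")}
    return resp
```

The ordering is the point. A request with a bad token costs one dictionary lookup and never touches the rate limiter, the routing table, or a backend.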

The trade-off

The gateway is itself a single point of failure. Every request flows through it. If Zuul goes down, Netflix is down — even if every backing service is healthy.

Netflix mitigates with multi-AZ deployment, auto-scaling, and aggressive health checks. The gateway tier is over-provisioned by design, so it always runs with spare headroom, because the alternative is the whole product going dark.

Other gateways worth knowing

Tool	What it is	When you'd reach for it
Kong	Open-source gateway built on NGINX, plugin ecosystem	Self-hosted, on-prem, or you want to own the layer
AWS API Gateway	Managed gateway, integrates with Lambda and IAM	You're on AWS and don't want to run your own
Cloudflare Workers	Edge gateway, runs at 300+ POPs worldwide	You want auth and routing at the network edge, not in your region
Zuul / Spring Cloud Gateway	JVM-based, Netflix-style	Java shop, want fine-grained filter control

Mechanics — what the gateway actually does

Concern	What the gateway does	Why centralize it
Authentication	Validates JWT/API key/OAuth token, rejects requests with missing or invalid credentials	One library, one rotation cadence, one CVE patch instead of 50
Authorization	Checks scopes/roles against the route's policy	Policy lives next to the route, not buried in service code
Rate limiting	Counts requests per client/endpoint, returns 429 over limit	A single counter is consistent; per-service counters drift
Routing	Maps URL paths to service IDs, supports canary and blue/green	Change traffic flow without redeploying services
Transformation	Translates protocols (REST ↔ gRPC), rewrites headers, strips internals	Public contract stays stable while internals churn
Observability	Injects trace IDs, emits unified access logs and metrics	One source of truth for "what hit our system today"
Caching	Caches responses for hot read endpoints	Backend stays cool; cache hit rate is visible in one place
Gateway vs service mesh	Gateway = north-south traffic (client → service). Mesh = east-west (service → service)	Different problems, different tools. Most large systems run both.

The gateway-vs-mesh row is the one most people get wrong. They're not competitors. The gateway is your perimeter. The mesh is your interior. Same idea (push cross-cutting concerns into infrastructure), different blast radius.
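Going back to the caching row in the table, a minimal sketch of what "cache responses for hot read endpoints" could look like is a TTL cache keyed by path. The class name, the path-only cache key, and the injectable clock are assumptions for the example; production gateways also have to respect cache-control headers and per-user data, which this ignores.

```python
import time
from typing import Callable, Dict, Tuple


class ResponseCache:
    """Tiny TTL cache for GET responses, keyed by path."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[float, str]] = {}  # path -> (expiry, body)
        self.hits = 0
        self.misses = 0

    def get(self, method: str, path: str, fetch: Callable[[str], str]) -> str:
        if method != "GET":
            return fetch(path)  # never cache writes
        now = self.clock()
        entry = self.entries.get(path)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired: go to the backend and remember the answer.
        self.misses += 1
        body = fetch(path)
        self.entries[path] = (now + self.ttl, body)
        return body

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Because every read goes through the one cache, `hit_rate()` is the "visible in one place" number from the table.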

Concept	What it is	How it relates
Load Balancer	Distributes traffic across backend servers	A gateway sits behind one (or has one built in). Gateway adds smarts a plain LB doesn't have.
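Reverse Proxy	Server that forwards client requests to backends	A gateway is a reverse proxy with cross-cutting features layered on. NGINX sits on the line between the two: a plain reverse proxy out of the box, closer to a gateway once auth and rate limiting are configured on top.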
Service Mesh	Sidecar-based traffic management between services (Istio, Linkerd)	Handles internal east-west traffic. Complementary to the gateway, not competing.
OAuth / JWT	Token-based auth standards	Gateway typically validates the token so services don't have to.
Rate Limiting	Cap on how often a client can call an endpoint	One of the most common gateway features. Token bucket or sliding window counters live in the gateway.
BFF (Backend for Frontend)	A gateway-like layer dedicated to one client type (web, iOS, Android)	When the generic gateway isn't shaped right for each client, you build a BFF per client. Gateway is shared; BFF is purpose-built.
CORS	Browser policy controlling cross-origin requests	Gateway is where you set CORS headers consistently across every endpoint.
Microservices	Architecture of small, independently deployed services	The architecture that makes a gateway worth building. Monoliths don't need one.
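The rate limiting row mentions token buckets. A minimal sketch of one, with a bucket per (client, endpoint) pair, might look like this. The class names, capacity, and refill rate are illustrative, and the state lives in process memory, which only works for a single gateway instance.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = capacity  # start full so a new client can burst
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Top up for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """One bucket per (client, endpoint) pair."""

    def __init__(self, capacity: float, refill_per_second: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def status_for(self, client: str, endpoint: str) -> int:
        key = (client, endpoint)
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.capacity, self.refill_per_second, self.clock)
        return 200 if self.buckets[key].allow() else 429
```

Capacity sets how large a burst a client can send; the refill rate sets the sustained limit. With several gateway instances, the counters would need to live in shared storage so that every instance sees the same bucket.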

When (and when not) to use it

Use an API gateway when:

You have 5+ microservices and want one place for cross-cutting concerns
You expose public APIs and need a stable contract while internals churn
Auth, rate limit, or routing logic is being duplicated across services — that's the smell
You need canary or blue/green deploys and want to shift traffic without redeploying services
You're running multiple client types (web, mobile, partner) and they need different shapes of the same data

Skip it when:

Monolith or 1–2 services — the gateway is overhead with no payoff. Add it when you split.
You're on a managed platform that already gives you the features. Vercel, Cloudflare, AWS App Runner, and Fly.io terminate TLS and route traffic at the edge, and some of them offer rate limiting or auth as well. Check what yours covers before adding a second layer.
Internal-only tools with trusted clients and no rate-limit needs — direct calls are fine.
You haven't actually felt the pain yet. Adding a gateway "just in case" buys complexity you'll pay for in latency and ops.

The default answer for any system with serious microservices is yes, run a gateway. The interesting question is which one — managed (AWS, Cloudflare) for speed, self-hosted (Kong, Zuul) for control.

Key takeaway

API gateway centralizes cross-cutting concerns. Auth, rate limit, routing, observability — done once at the door.
Service mesh handles east-west traffic. Gateway and mesh complement each other; they don't compete.
The gateway is a single point of failure. Run it multi-AZ with auto-scaling, or you've traded service outages for one big outage.
Skip it for monoliths, for 1–2 services, or when a managed platform already covers the features. Add it when you feel the duplication pain, usually around 5 services in.
Public APIs benefit most. Stable external contract while internals are free to change.
