Three Names, Overlapping Jobs
Reverse proxies, load balancers, and API gateways all sit between clients and backend servers. In practice, their responsibilities overlap significantly. Nginx can act as all three. AWS ALB is both a load balancer and a reverse proxy. Kong is an API gateway built on top of Nginx.
The distinction matters not because these are rigid categories, but because each represents a different primary concern. Understanding what problem each one solves helps you pick the right tool and explain your architecture clearly in a system design interview.
Forward Proxy vs Reverse Proxy
Before diving in, let’s clear up which direction each kind of proxy faces.
A forward proxy sits in front of clients. It intercepts outbound requests from clients and forwards them to the internet. The server never sees the client’s real IP.
A reverse proxy sits in front of servers. It intercepts inbound requests from the internet and forwards them to backend servers. The client never sees the backend server’s real IP.
```mermaid
graph LR
    subgraph Forward Proxy
        C1[Client] --> FP[Proxy]
        FP --> S1[Internet]
    end
```

```mermaid
graph LR
    subgraph Reverse Proxy
        C2[Internet] --> RP[Proxy]
        RP --> S2[Server A]
        RP --> S3[Server B]
    end
```
Forward proxies are used for outbound filtering (corporate firewalls, content filtering). For the rest of this post, we focus on reverse proxies and the components that build on them.
Reverse Proxy
A reverse proxy accepts requests on behalf of backend servers and forwards them. The client only ever talks to the proxy.
What it does
- Hides backend topology: Clients see a single endpoint. The number and addresses of backend servers are private.
- TLS termination (SSL offloading): Handles certificate management and decrypts HTTPS at the proxy, so backends can speak plain HTTP internally and don’t have to manage encryption themselves.
- Compression and caching: Can gzip responses and cache static content to reduce backend load.
- Request routing: Routes requests to different backends based on URL path, headers, or other criteria.
When you need one
- You have multiple backend services behind a single domain
- You want to terminate TLS in one place
- You want to add caching without modifying application code
- You want to hide your internal infrastructure from the internet
Common tools: Nginx, HAProxy, Caddy, Envoy, Traefik
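At its core, request routing is a prefix-to-upstream lookup. A minimal sketch in Python — the route table and upstream addresses here are made-up examples, not any particular tool’s configuration:

```python
# Sketch of path-based request routing, as a reverse proxy might do it.
# The prefixes and upstream addresses are hypothetical.
ROUTES = [
    ("/api/", "http://10.0.0.10:8080"),     # API servers
    ("/static/", "http://10.0.0.20:8080"),  # static content servers
    ("/", "http://10.0.0.30:8080"),         # default backend
]

def pick_upstream(path: str) -> str:
    """Return the upstream for the first matching path prefix."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    raise ValueError(f"no route for {path}")
```

Real proxies layer host matching, regexes, and header conditions on top, but the first-match-wins prefix table is the mental model.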
Load Balancer
A load balancer distributes incoming requests across multiple instances of the same service. Its primary concern is even distribution and health management.
What it does
- Distributes traffic: Spreads requests across backend instances to prevent any single server from being overwhelmed.
- Health checks: Periodically probes backends and removes unhealthy ones from the pool.
- Session affinity (sticky sessions): Can route requests from the same client to the same backend when needed (e.g., for in-memory session state).
- Connection draining: When removing a server, finishes in-flight requests before taking it out of rotation.
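Health checking boils down to tracking consecutive probe failures per backend. A minimal sketch, with illustrative names and a hypothetical threshold of three failures:

```python
# Sketch of a health-checked backend pool: a backend leaves rotation after
# `unhealthy_threshold` consecutive failed probes and returns after one
# successful probe. Names and thresholds are illustrative.
class BackendPool:
    def __init__(self, backends, unhealthy_threshold=3):
        self.backends = list(backends)
        self.threshold = unhealthy_threshold
        self.failures = {b: 0 for b in backends}

    def record_probe(self, backend, ok):
        """Record the result of one periodic health probe."""
        if ok:
            self.failures[backend] = 0
        else:
            self.failures[backend] += 1

    def healthy(self):
        """Backends currently eligible to receive traffic."""
        return [b for b in self.backends if self.failures[b] < self.threshold]
```

Production load balancers add separate rise/fall thresholds and probe timeouts, but the remove-on-repeated-failure logic is the same.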
Load Balancing Algorithms
| Algorithm | How it works | Best for |
|---|---|---|
| Round Robin | Rotates through backends sequentially | Homogeneous servers, stateless services |
| Weighted Round Robin | Like Round Robin but servers with higher weight get more traffic | Mixed-capacity servers |
| Least Connections | Sends to the server with fewest active connections | Long-lived connections (WebSockets, streaming) |
| IP Hash | Hashes client IP to pick a consistent server | Session affinity without cookies |
| Random | Picks a random backend | Large pools where simplicity matters |
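The first four algorithms in the table are each a few lines of selection logic. A sketch over a hypothetical pool of three servers (the connection counts would come from the balancer’s own bookkeeping):

```python
import hashlib
import itertools

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical pool

# Round robin: rotate through the pool sequentially.
_rr = itertools.cycle(servers)
def round_robin():
    return next(_rr)

# Least connections: pick the server with the fewest active connections.
active = {s: 0 for s in servers}  # maintained by the balancer
def least_connections():
    return min(servers, key=lambda s: active[s])

# IP hash: hash the client IP so the same client lands on the same server.
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]
```

Note that plain modulo hashing reshuffles most clients when the pool size changes; consistent hashing fixes that at the cost of more machinery.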
Layer 4 vs Layer 7
Load balancers operate at two different network layers:
Layer 4 (Transport): Makes routing decisions based on TCP/UDP information (IP address, port). Cannot inspect HTTP headers, URLs, or cookies. Faster because it doesn’t parse the application protocol.
Layer 7 (Application): Parses HTTP and can route based on URL path, headers, cookies, or request body. More flexible but adds processing overhead.
```mermaid
graph TD
    C[Client Request] --> L4{Layer 4 LB}
    L4 -->|Based on IP:port| S1[Server 1]
    L4 -->|Based on IP:port| S2[Server 2]
    C2[Client Request] --> L7{Layer 7 LB}
    L7 -->|/api/*| API[API Servers]
    L7 -->|/static/*| Static[Static Servers]
    L7 -->|/ws/*| WS[WebSocket Servers]
```
In system design interviews, default to Layer 7 unless you have a specific reason for Layer 4 (like raw TCP proxying or extreme throughput requirements).
Common tools: AWS ALB/NLB, Nginx, HAProxy, Google Cloud Load Balancing, F5
Tip
Interview tip: When asked “how do you handle millions of requests?”, don’t just say “add a load balancer.” Specify the algorithm (likely Least Connections or Round Robin), whether it’s L4 or L7, and how health checks remove failing instances.
API Gateway
An API gateway is a reverse proxy with application-level intelligence. Its primary concern is managing, securing, and monitoring APIs.
What it does
Everything a reverse proxy does, plus:
- Authentication and authorization: Validates JWT tokens, API keys, or OAuth flows before the request reaches your backend.
- Rate limiting: Enforces per-client or per-endpoint request limits.
- Request/response transformation: Modifies headers, rewrites paths, transforms payloads (e.g., XML to JSON).
- API versioning: Routes `/v1/users` and `/v2/users` to different backend services.
- Analytics and logging: Tracks API usage, latency, and error rates per consumer.
- Circuit breaking: Stops sending traffic to a failing backend after repeated errors.
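Rate limiting at the gateway is commonly a token bucket per API key. A minimal sketch — the rate and capacity are example values, and real gateways track one bucket per consumer, usually in shared storage:

```python
import time

class TokenBucket:
    """Per-client token bucket: refills `rate` tokens/second, bursts up to `capacity`."""
    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill tokens for the time elapsed since the last check.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over limit -> the gateway would return 429
```

The `now` parameter is just for deterministic testing; in production the bucket reads the clock itself.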
API Gateway vs Reverse Proxy
The key difference: a reverse proxy forwards bytes; an API gateway understands your API contracts. It knows that `/users/{id}` is a resource, can validate the request body against a schema, enforce rate limits per API key, and transform the response before returning it.
```mermaid
%%{init: {'sequence': {'noteAlign': 'left'}}}%%
sequenceDiagram
    participant C as Client
    participant GW as API Gateway
    participant Auth as Auth Service
    participant API as Backend API
    C->>GW: GET /api/v2/orders (API key in header)
    Note over GW: • Validate API key<br/>• Check rate limit<br/>• Rewrite path: /v2/orders → /orders
    GW->>Auth: Verify API key
    Auth->>GW: Valid, client_id=acme
    GW->>API: GET /orders (internal path)
    API->>GW: 200 OK {orders: [...]}
    Note over GW: • Add CORS headers<br/>• Log request metrics<br/>• Transform response if needed
    GW->>C: 200 OK {orders: [...]}
```
Common tools: Kong, AWS API Gateway, Apigee, Tyk, KrakenD
Note - Service Mesh vs API Gateway
A service mesh (Istio, Linkerd) handles east-west traffic (service-to-service within your cluster). An API gateway handles north-south traffic (external clients to your services). They complement each other. In interviews, mention the API gateway for external traffic and the service mesh for internal mTLS, retries, and observability.
Comparison
| Concern | Reverse Proxy | Load Balancer | API Gateway |
|---|---|---|---|
| Primary job | Forward and hide | Distribute and health-check | Manage and secure APIs |
| TLS termination | Yes | Yes | Yes |
| Routing | Path/host-based | Algorithm-based | Path + versioning + transformation |
| Health checks | Basic | Core feature | Yes |
| Authentication | No (usually) | No | Yes |
| Rate limiting | Basic | No | Yes |
| Request transformation | No | No | Yes |
| Analytics | Basic access logs | Connection metrics | Full API analytics |
| Layer | 7 | 4 or 7 | 7 |
What to Use When
Just need to hide backends and terminate TLS? Use a reverse proxy (Nginx, Caddy).
Need to distribute traffic across multiple instances of the same service? Use a load balancer. If you’re on AWS, ALB for HTTP or NLB for raw TCP.
Exposing APIs to external consumers with auth, rate limiting, and versioning? Use an API gateway (Kong, AWS API Gateway).
In practice, you’ll use multiple layers together:
```mermaid
graph TD
    Client[Client] --> CDN[CDN / Edge]
    CDN --> GW[API Gateway]
    GW --> LB1[Load Balancer]
    GW --> LB2[Load Balancer]
    LB1 --> S1[Service A - Instance 1]
    LB1 --> S2[Service A - Instance 2]
    LB2 --> S3[Service B - Instance 1]
    LB2 --> S4[Service B - Instance 2]
```
The API gateway handles cross-cutting concerns (auth, rate limiting, versioning). Behind it, load balancers distribute traffic to individual service instances.
Important - Interview Pattern
In system design interviews, the standard pattern is: Client -> CDN -> API Gateway/Load Balancer -> Services. When drawing this, briefly mention what each layer does. Don’t spend 5 minutes on load balancing algorithms unless the interviewer specifically asks.
Summary
These three components solve different but overlapping problems:
- Reverse proxy: hides backends, terminates TLS, caches, compresses
- Load balancer: distributes traffic, manages health, drains connections
- API gateway: authenticates, rate-limits, transforms, versions APIs
Modern tools like Nginx, Envoy, and Kong blur these lines. The important thing is understanding which concerns apply to your system and choosing the right tool or combination.