Scalability and Load Balancing: Lesson 2.2
How load balancers work and which algorithm to use
round robin, least connections, IP hash, layer 4 vs layer 7 load balancing, health checks, sticky sessions
What a Load Balancer Does
A load balancer sits between clients and your servers, distributing incoming requests to prevent any single server from becoming a bottleneck. It also detects unhealthy servers and stops routing traffic to them.
Load Balancing Algorithms
- Round Robin: requests go to servers in rotation (1→2→3→1→2→3). Simple, works when servers have equal capacity.
- Least Connections: next request goes to the server with fewest active connections. Better when requests have variable processing time.
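- IP Hash: a hash of the client IP deterministically maps each client to a server. Useful for sticky sessions, but when a server goes down its clients are remapped to other servers and lose any session state held there. With simple modulo hashing, a change in the server count remaps most clients, not only those on the failed server.
- Weighted Round Robin: servers with higher weights receive proportionally more requests. Use when server capacities differ.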
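The four algorithms above can be sketched in a few lines each. This is a minimal illustration, not how any particular load balancer implements them: the class and function names (`RoundRobin`, `LeastConnections`, `ip_hash`, `weighted_round_robin`) and the server labels are hypothetical, servers are plain strings, and the weighted variant simply repeats each server by its integer weight.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes so the count stays accurate.
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Map a client IP to a server using a stable hash and modulo."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def weighted_round_robin(weights):
    """Rotate through servers, repeating each one `weight` times.

    `weights` maps server name to a positive integer weight (an
    assumption of this sketch; real balancers often interleave picks).
    """
    expanded = [s for s, w in weights.items() for _ in range(w)]
    return cycle(expanded)


if __name__ == "__main__":
    rr = RoundRobin(["s1", "s2", "s3"])
    print([rr.pick() for _ in range(6)])

    wrr = weighted_round_robin({"big": 2, "small": 1})
    print([next(wrr) for _ in range(6)])

    print(ip_hash("203.0.113.7", ["s1", "s2", "s3"]))
```

Note that `ip_hash` takes the server list as an argument: calling it with a shorter list after a failure changes the modulo, which is why many clients land on a different server.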
Layer 4 vs Layer 7
- L4 (transport layer, e.g. TCP): routes based on IP address and port. Fast, because it never inspects request content.
- L7 (HTTP): routes based on URL, headers, or cookies. Enables path-based routing (e.g., /api → service A, /static → CDN).
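To make the difference concrete, here is a rough sketch of the two routing decisions. The backend names (`service-a`, `cdn`, `web`), the `ROUTES` table, and both function names are assumptions made for illustration. The L4 function sees only the connection's address and port, while the L7 function sees the request path.

```python
import zlib

# Hypothetical routing table: path prefix -> backend name.
ROUTES = [
    ("/api", "service-a"),
    ("/static", "cdn"),
]


def route_l7(path, default="web"):
    """Layer 7: choose a backend from the HTTP request path."""
    for prefix, backend in ROUTES:
        # Match the prefix itself or anything beneath it, so "/apiary"
        # does not accidentally match "/api".
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return default


def route_l4(client_ip, client_port, backends):
    """Layer 4: choose a backend from address and port only.

    No request content is available at this layer, so the decision
    uses a checksum of the connection's source address and port.
    """
    key = f"{client_ip}:{client_port}".encode("utf-8")
    return backends[zlib.crc32(key) % len(backends)]


if __name__ == "__main__":
    print(route_l7("/api/users"))
    print(route_l7("/static/app.css"))
    print(route_l7("/checkout"))
    print(route_l4("203.0.113.7", 51324, ["s1", "s2"]))
```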
Health Checks
Load balancers probe each server at a set interval. A server that fails N consecutive checks is removed from rotation. Always point health checks at a path that actually exercises your app, not just a TCP connection check, which only proves the port is open.
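The "N consecutive failures" rule can be sketched as a small counter per server. Everything named here is hypothetical: `HealthTracker`, `probe`, the default threshold of 3, and the `/healthz` path in the usage note. The sketch also assumes that a single successful check resets the counter and returns the server to rotation, which real load balancers may configure differently.

```python
import http.client
import urllib.request


def probe(url, timeout=2.0):
    """Return True if an HTTP GET to `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts, and HTTP error statuses all
        # count as a failed check.
        return False


class HealthTracker:
    """Track consecutive failed checks for each server."""

    def __init__(self, servers, fail_threshold=3):
        self.fail_threshold = fail_threshold
        self.failures = {server: 0 for server in servers}

    def record(self, server, ok):
        # A success resets the streak; a failure extends it.
        self.failures[server] = 0 if ok else self.failures[server] + 1

    def healthy(self):
        # Servers below the threshold stay in rotation.
        return [s for s, n in self.failures.items()
                if n < self.fail_threshold]


if __name__ == "__main__":
    tracker = HealthTracker(["s1", "s2"], fail_threshold=3)
    for _ in range(3):
        tracker.record("s2", ok=False)
    print(tracker.healthy())
```

In a real loop, each interval would call something like `tracker.record(server, probe(f"http://{server}/healthz"))` for every server, then route only to `tracker.healthy()`.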
