Scalability and Load Balancing, Lesson 2.1

Horizontal vs vertical scaling - which to choose and when

vertical scaling limits, horizontal scaling, stateless vs stateful services, auto-scaling, cost trade-offs, single point of failure

Two Ways to Scale

When traffic grows, you have two options: make one machine bigger (vertical scaling) or add more machines (horizontal scaling). Each comes with its own trade-offs.

Vertical Scaling

Upgrade the CPU, RAM, or disk of a single server. This is simple: no code changes are required. But it has a hard ceiling, which is the largest instance type available, and it creates a single point of failure: if that machine dies, everything dies.
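To put rough numbers on the single point of failure, here is a minimal sketch of the standard redundancy arithmetic. It assumes machines fail independently and that any one surviving machine can serve all traffic; the 99% per-machine availability is an illustrative figure, not a measured one, and the function name is made up for this example.

```javascript
// Probability that at least one of `machines` identical servers is up,
// assuming independent failures (an idealization; real failures correlate).
function combinedAvailability(perMachine, machines) {
  return 1 - Math.pow(1 - perMachine, machines)
}

const oneMachine = combinedAvailability(0.99, 1)    // a single server: about 0.99
const threeMachines = combinedAvailability(0.99, 3) // three servers: about 0.999999

console.log({ oneMachine, threeMachines })
```

Under these assumptions a bigger machine does not improve availability at all, while each extra machine multiplies the chance of a total outage by the per-machine failure rate.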

Horizontal Scaling

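Add more servers and distribute traffic between them, typically through a load balancer. There is no theoretical ceiling on the application tier, although shared dependencies such as the database can still become the bottleneck. This requires your application to be stateless: each server must be able to handle any request without needing data held in another server's memory.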
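One simple way to distribute traffic is round-robin: hand each new request to the next server in the list and wrap around at the end. The sketch below is a toy illustration, not a production load balancer; the function name and the server addresses are hypothetical.

```javascript
// Minimal round-robin picker: returns servers in rotation.
function createRoundRobin(servers) {
  if (servers.length === 0) throw new Error('at least one server is required')
  let next = 0
  return function pick() {
    const server = servers[next]
    next = (next + 1) % servers.length // wrap around after the last server
    return server
  }
}

// Hypothetical addresses of three identical application servers.
const pick = createRoundRobin(['10.0.0.1:3000', '10.0.0.2:3000', '10.0.0.3:3000'])

// Four consecutive requests: the fourth wraps back to the first server.
const assigned = [pick(), pick(), pick(), pick()]
console.log(assigned)
```

Because consecutive requests from the same user can land on different servers, this only works if no server depends on state it alone holds.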

Making Services Stateless

Move session state out of application servers:

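// Assumes `session` is express-session, `RedisStore` is from connect-redis, and
// `redisClient` is a connected Redis client. SESSION_SECRET is a placeholder name.

// Bad: sessions stored in this process's memory (breaks with multiple servers)
app.use(session({ secret: process.env.SESSION_SECRET, store: new session.MemoryStore() }))

// Good: sessions stored in Redis (shared across servers)
app.use(session({ secret: process.env.SESSION_SECRET, store: new RedisStore({ client: redisClient }) }))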
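To see why the in-memory version breaks, the following self-contained sketch simulates two servers without any framework. A plain `Map` stands in for Redis, and `createServer`, `login`, and `whoAmI` are hypothetical names invented for the illustration.

```javascript
// Each "server" reads and writes sessions through whatever store it is given.
function createServer(sessionStore) {
  return {
    login(sessionId, user) {
      sessionStore.set(sessionId, { user })
    },
    whoAmI(sessionId) {
      const found = sessionStore.get(sessionId)
      return found ? found.user : null
    },
  }
}

// Stateful: every server keeps its own private store.
const statefulA = createServer(new Map())
const statefulB = createServer(new Map())
statefulA.login('sess-1', 'alice')
const statefulResult = statefulB.whoAmI('sess-1') // server B has never seen this session

// Stateless: both servers share one external store (a Map standing in for Redis).
const sharedStore = new Map()
const statelessA = createServer(sharedStore)
const statelessB = createServer(sharedStore)
statelessA.login('sess-1', 'alice')
const statelessResult = statelessB.whoAmI('sess-1') // server B finds the session

console.log({ statefulResult, statelessResult })
```

In the stateful case the user appears logged out whenever the load balancer sends them to a different server; with the shared store, any server can handle any request.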

When to Use Each

Vertical: databases (scaling out is hard), early-stage products, simple ops requirements
Horizontal: application servers, stateless microservices, anything needing high availability

In interviews, default to horizontal scaling for the application tier. For databases, scale vertically first, then add read replicas.
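Read replicas usually come with some routing logic: writes go to the primary, reads are spread across replicas. Here is a minimal sketch of that idea, assuming queries arrive as SQL strings and that anything not starting with SELECT counts as a write. The connection names are hypothetical placeholders for real database clients.

```javascript
// Routes reads to replicas in rotation and everything else to the primary.
function createDbRouter(primary, replicas) {
  let next = 0
  return function route(sql) {
    const isRead = /^\s*select\b/i.test(sql)
    // Writes, and all queries when no replica exists, go to the primary.
    if (!isRead || replicas.length === 0) return primary
    const replica = replicas[next]
    next = (next + 1) % replicas.length
    return replica
  }
}

const route = createDbRouter('primary', ['replica-1', 'replica-2'])

const targets = [
  route('SELECT * FROM users WHERE id = 1'),
  route('UPDATE users SET name = $1 WHERE id = 1'),
  route('select count(*) from orders'),
  route('SELECT 1'),
]
console.log(targets)
```

This sketch ignores replication lag, so a read issued right after a write may return stale data from a replica. Reads that must see the latest write, or that run inside a transaction, are usually sent to the primary instead.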