Script Valley
System Design: APIs, Caching & Scalability
Scalability Patterns, Lesson 3.1

Horizontal vs vertical scaling: when to use each

vertical scaling limits, horizontal scaling, shared nothing architecture, stateless services, scaling trade-offs, cost comparison, cloud elasticity

Two Ways to Handle More Traffic

Vertical scaling (scale up): add more CPU, RAM, or disk to a single machine. Simple: no code changes and no distribution complexity. Hard limit: you eventually hit the ceiling of the largest available machine. It is also expensive at the top end, and the one machine remains a single point of failure.

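Horizontal scaling (scale out): add more machines and distribute load across them. Theoretically unlimited. Requires your application servers to be stateless: because a request can land on any server, nothing a later request depends on (a session, an uploaded file) may live only in one server's memory or on its local disk. A purely local in-memory cache is safe only if losing or missing it never changes the response.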
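As a rough illustration of "distribute load across them", here is a minimal round-robin dispatcher. It is a sketch, not how any particular load balancer is implemented; the function name createRoundRobin and the server names are hypothetical.

```javascript
// Minimal round-robin dispatcher (illustrative sketch; names are hypothetical).
// Each call to pick() returns the next server in the list, wrapping around.
function createRoundRobin(servers) {
  if (servers.length === 0) {
    throw new Error('need at least one server');
  }
  let next = 0;
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length;
    return server;
  };
}

// Six incoming requests spread over three identical servers.
const pick = createRoundRobin(['app-1', 'app-2', 'app-3']);
const assignments = Array.from({ length: 6 }, () => pick());
console.log(assignments);
```

Because consecutive requests from the same user can be handed to different servers, each server must be able to serve any request on its own, which is where the stateless requirement comes from.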

The Stateless Requirement

Horizontal scaling forces stateless design. Move session state to Redis, file uploads to object storage, and scheduled jobs to a queue:

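// Bad: no store configured, so sessions live in this server's memory only
app.use(session({ secret: 'x', resave: false, saveUninitialized: false }));

// Good: sessions live in Redis, shared by every server
// (factory-style API of older connect-redis releases; `redis` is an already-connected client)
const RedisStore = require('connect-redis')(session);
app.use(session({ store: new RedisStore({ client: redis }), secret: 'x', resave: false, saveUninitialized: false }));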
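To see why server-local sessions break behind a load balancer, the following sketch models two servers as plain objects. It assumes nothing about Express or Redis: the Server class and the session ID are hypothetical, and a shared Map stands in for Redis.

```javascript
// Illustrative model only: each Server keeps sessions either in its own
// private Map (server-local state) or in a store passed in from outside.
class Server {
  constructor(name, sessionStore) {
    this.name = name;
    // No shared store supplied: fall back to memory private to this server.
    this.sessions = sessionStore ?? new Map();
  }

  login(sessionId, user) {
    this.sessions.set(sessionId, { user });
  }

  whoAmI(sessionId) {
    return this.sessions.get(sessionId)?.user ?? null;
  }
}

// Server-local state: the login lands on app-1, the next request on app-2.
const localA = new Server('app-1');
const localB = new Server('app-2');
localA.login('sid-123', 'alice');
const localResult = localB.whoAmI('sid-123'); // app-2 has never seen this session

// Shared state: both servers read and write the same external store.
const sharedStore = new Map(); // stand-in for Redis
const sharedA = new Server('app-1', sharedStore);
const sharedB = new Server('app-2', sharedStore);
sharedA.login('sid-123', 'alice');
const sharedResult = sharedB.whoAmI('sid-123');

console.log({ localResult, sharedResult });
```

With private memory the second server treats the user as logged out; with the shared store either server can answer, so instances can be added or removed freely.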

When to Use Each

Start vertical — it is simpler and fast to provision. Switch to horizontal when you hit resource limits, need high availability, or require zero-downtime deployments. Most modern cloud-native architectures use horizontal scaling with auto-scaling groups that add or remove instances based on CPU or request rate metrics.
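A scaling decision driven by a CPU metric can be sketched as a proportional rule: scale the instance count by the ratio of observed to target utilization, then clamp it to configured bounds. This is the same basic formula the Kubernetes Horizontal Pod Autoscaler documentation describes; the function name, parameter names, and numbers below are assumptions for illustration, and real autoscalers add cooldowns and tolerance bands on top.

```javascript
// Proportional autoscaling rule (illustrative sketch; names are hypothetical).
// desired = ceil(current * observed / target), clamped to [min, max].
function desiredInstances({ current, observedCpu, targetCpu, min, max }) {
  if (targetCpu <= 0) {
    throw new Error('targetCpu must be positive');
  }
  const raw = Math.ceil(current * (observedCpu / targetCpu));
  return Math.min(max, Math.max(min, raw));
}

// Overloaded: 4 instances at 90% CPU against a 60% target.
const scaleOut = desiredInstances({ current: 4, observedCpu: 90, targetCpu: 60, min: 2, max: 10 });

// Mostly idle: 4 instances at 20% CPU against the same target.
const scaleIn = desiredInstances({ current: 4, observedCpu: 20, targetCpu: 60, min: 2, max: 10 });

// Demand beyond the configured ceiling is capped at max.
const capped = desiredInstances({ current: 8, observedCpu: 95, targetCpu: 50, min: 2, max: 10 });

console.log({ scaleOut, scaleIn, capped });
```

The min bound keeps enough instances for availability during quiet periods, and the max bound caps cost if the metric spikes.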
