What is caching and why does it matter for performance
cache definition, cache hit vs miss, cache hit ratio, latency reduction, origin offloading, cost reduction, when not to cache
The Core Concept
A cache is a fast, temporary storage layer that serves repeated requests without hitting the slow origin. The origin might be a database, an external API, or a computation-heavy function. Caching trades storage for speed.
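The idea can be illustrated with a minimal in-memory sketch. It is not a production cache: `slowSquare` is a hypothetical stand-in for a slow origin, and `withCache` is an illustrative helper name, with no eviction or expiry. It only shows that a repeated request is answered from memory instead of reaching the origin again.

```javascript
// Hypothetical slow origin: stands in for a database query, API call,
// or computation-heavy function. The counter records how often it runs.
let originCalls = 0;
function slowSquare(n) {
  originCalls += 1;
  return n * n;
}

// Illustrative helper: wrap a single-argument function with a Map-backed cache.
function withCache(fn) {
  const cache = new Map();
  return function cached(key) {
    if (cache.has(key)) {
      return cache.get(key); // served from memory, origin not touched
    }
    const value = fn(key); // first request for this key goes to the origin
    cache.set(key, value); // spend storage now to save time on later requests
    return value;
  };
}

const cachedSquare = withCache(slowSquare);
cachedSquare(12); // goes to the origin
cachedSquare(12); // served from the cache
cachedSquare(12); // served from the cache
console.log(originCalls); // 1
```

The Map grows without bound here, which is the storage side of the trade; real caches usually add a size limit or a time-to-live.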
Cache Hit vs Cache Miss
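A cache hit occurs when the requested data is already in the cache, so the response is fast: typically microseconds for an in-process cache, and often under a millisecond for a networked cache such as Redis on the same network. A cache miss occurs when it is not: the system fetches from the origin, stores the result, then responds. The goal is to maximize the hit ratio, the fraction of requests served from the cache.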
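To see why the hit ratio matters, here is a rough sketch of how it drives average response time. The function names and the latency figures (1 ms for a hit, 50 ms for a miss) are assumptions chosen for illustration, not measurements.

```javascript
// Fraction of requests served from the cache.
function hitRatio(hits, misses) {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Weighted average latency: hits cost hitMs, misses cost missMs.
function averageLatencyMs(ratio, hitMs, missMs) {
  return ratio * hitMs + (1 - ratio) * missMs;
}

// Assumed numbers: 900 hits, 100 misses, 1 ms per hit, 50 ms per miss.
const ratio = hitRatio(900, 100); // 0.9
const average = averageLatencyMs(ratio, 1, 50);
console.log(average.toFixed(1)); // "5.9"
```

Under these assumed numbers the misses dominate: 10% of requests contribute about 5 ms of the 5.9 ms average, which is why small gains in hit ratio can pay off noticeably.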
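// Redis cache-aside example (assumes an ioredis-style `redis` client and a node-postgres-style `db`)
async function getUser(id) {
  // Try the cache first; a hit returns without touching the database.
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached);
  // Miss: fetch from the origin (node-postgres puts matching rows in `rows`).
  const { rows } = await db.query('SELECT * FROM users WHERE id = $1', [id]);
  const user = rows[0];
  if (!user) return null; // do not cache a user that does not exist
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user)); // expire after 1 hour
  return user;
}
When Not to Cache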
Caching adds complexity: stale data, invalidation bugs, and cold-start problems. Avoid caching when data changes on every request, when accuracy is critical (financial balances, inventory counts), or when the fetch is already fast. Do cache read-heavy, slow-to-compute, or expensive-to-fetch data. The 80/20 rule is a useful rule of thumb: often roughly 20% of your data accounts for about 80% of requests, and those hot items are your cache candidates.
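One way to check whether the 80/20 pattern holds for your own traffic is to measure it. The sketch below assumes you can extract a list of requested keys from an access log; `topKeyShare` and the sample log are hypothetical and only illustrate the calculation.

```javascript
// Share of all requests that go to the most-requested `fraction` of distinct keys.
function topKeyShare(accessLog, fraction) {
  if (accessLog.length === 0) return 0;
  // Count requests per key.
  const counts = new Map();
  for (const key of accessLog) {
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Sort counts from hottest to coldest and take the top slice of keys.
  const sorted = [...counts.values()].sort((a, b) => b - a);
  const topN = Math.max(1, Math.ceil(sorted.length * fraction));
  const covered = sorted.slice(0, topN).reduce((sum, c) => sum + c, 0);
  return covered / accessLog.length;
}

// Hypothetical log: 5 distinct keys, 20 requests, one key much hotter than the rest.
const accessLog = [
  ...Array(16).fill('user:1'),
  'user:2', 'user:3', 'user:4', 'user:5',
];
console.log(topKeyShare(accessLog, 0.2)); // 0.8
```

If the top 20% of keys cover most requests, caching only those keys captures most of the benefit; if traffic is spread evenly, a cache of the same size will see a much lower hit ratio.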
