Caching Systems, Lesson 4.3

How to handle cache invalidation in distributed systems

cache invalidation strategies, TTL expiry, event-driven invalidation, write-through invalidation, stampede problem, cache warming

The Hardest Problem in Caching

Phil Karlton famously said there are only two hard things in computer science: cache invalidation and naming things. The problem is real: when your source of truth updates, stale data in the cache must be removed or refreshed.

TTL-Based Invalidation

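The simplest approach: set a TTL (time to live) on every cached key and accept that a value can be stale for up to that many seconds after the source changes. This works well for content that can tolerate a short staleness window (product prices, catalog data), and it needs no coordination between the writer and the cache.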
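To make the staleness window concrete, here is a minimal in-memory sketch of TTL expiry. The class name `TTLCache` and its `clock` parameter are illustrative assumptions, not part of any particular library; a real deployment would normally rely on the cache server's own expiry (for example, Redis `SET key value EX seconds`).

```python
import time


class TTLCache:
    """Minimal in-memory cache where every entry expires ttl_seconds after it is set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._store = {}      # key -> (value, expires_at)

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self.ttl_seconds)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry outlived its TTL: drop it and report a miss.
            del self._store[key]
            return default
        return value
```

With `ttl_seconds=60`, a price written at time 0 is served unchanged until second 60, even if the database row changes at second 1. That gap is the staleness window you are accepting.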

Event-Driven Invalidation

When data changes in the DB, publish an invalidation event. Cache consumers listen and delete the affected key.

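# On DB write (db, event_bus, and cache are placeholders for your own clients):
def update_user_name(user_id, name):
    db.update(user_id=user_id, name=name)
    # Publish only after the write has succeeded, so a reader that misses
    # the cache after the delete reloads the new row, not the old one.
    event_bus.publish('user.updated', {'user_id': user_id})

# Cache invalidation consumer, subscribed to 'user.updated':
def on_user_updated(event):
    cache.delete(f'user:{event["user_id"]}')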
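The snippet above leaves `db`, `event_bus`, and `cache` undefined. The following self-contained sketch shows the same flow end to end, using plain dictionaries and a toy synchronous bus as stand-ins. The names `EventBus`, `get_user`, and `update_user` are hypothetical. A production system would use a real broker, where delivery is asynchronous and a short stale window remains until the consumer runs.

```python
from collections import defaultdict


class EventBus:
    """Toy synchronous publish/subscribe bus, standing in for a real message broker."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._handlers[topic].append(handler)

    def publish(self, topic, event):
        for handler in self._handlers[topic]:
            handler(event)


db = {123: {'name': 'Alicia'}}   # stand-in for the source of truth
cache = {}                       # stand-in for the cache
bus = EventBus()


def get_user(user_id):
    key = f'user:{user_id}'
    if key not in cache:
        cache[key] = dict(db[user_id])   # cache miss: load from the DB and store a copy
    return cache[key]


def update_user(user_id, name):
    db[user_id]['name'] = name
    bus.publish('user.updated', {'user_id': user_id})   # publish after the write


def on_user_updated(event):
    # pop with a default, so invalidating a key that is not cached is harmless
    cache.pop(f'user:{event["user_id"]}', None)


bus.subscribe('user.updated', on_user_updated)
```

The consumer deletes the key rather than writing the new value into it. Deleting is idempotent and safe to repeat, and the next read repopulates the cache from the source of truth.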

Cache Stampede

When a popular key expires, many concurrent requests all miss the cache simultaneously and hit the DB at once. Solutions:

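Mutex/lock: the first request to miss takes a per-key lock and refreshes the cache; the others wait for it and then read the fresh value
Probabilistic early expiration: each request refreshes the key with a probability that rises as the TTL nears its end, so refreshes are spread out instead of all landing at expiry
Background refresh: a separate job re-warms hot keys before they expire, so user requests never hit the miss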
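The first two mitigations can be sketched as follows, assuming a single-process, multi-threaded service. `SingleFlightCache` and `should_refresh_early` are hypothetical names. TTL handling is left out of the class to keep the locking visible. The early-expiration check follows a commonly described formulation (sometimes called XFetch), where `recompute_seconds` is an estimate of how long a refresh takes and `beta` tunes how eagerly to refresh. Across multiple processes or hosts, the lock would need to be a distributed one instead.

```python
import math
import random
import threading
import time


class SingleFlightCache:
    """On a miss, one caller per key runs the loader; concurrent callers wait for it."""

    def __init__(self):
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()   # protects the _key_locks dict

    def _lock_for(self, key):
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key, loader):
        if key in self._values:           # fast path: cache hit, no per-key lock
            return self._values[key]
        with self._lock_for(key):
            if key not in self._values:   # re-check: another thread may have filled it
                self._values[key] = loader()
            return self._values[key]


def should_refresh_early(expires_at, recompute_seconds, beta=1.0,
                         now=time.time, rng=random.random):
    """Return True with a likelihood that grows as expires_at approaches."""
    # -log(u) for u in (0, 1] is a non-negative random gap; 1 - rng() avoids log(0).
    gap = -recompute_seconds * beta * math.log(1.0 - rng())
    return now() + gap >= expires_at
```

A caller would check `should_refresh_early(...)` on each cache hit and, when it returns True, refresh the key ahead of its expiry. Because each request draws its own random gap, the refreshes spread out over the last moments of the TTL rather than piling up at the instant the key expires.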