Script Valley
Interview Prep: System Design Rounds
Advanced Distributed Systems Concepts, Lesson 6.3

Leader election and consensus in distributed systems

split-brain problem, Raft consensus, Paxos overview, Zookeeper coordination, etcd leader election, fencing tokens, lease-based leader election

Why Leader Election Matters

Many distributed systems need exactly one node to perform a task at a time, such as a primary database, a job scheduler, or a shard coordinator. Without coordination, two nodes may both believe they are the leader (split-brain), for example after a network partition. If both then accept writes, their states diverge and data can be corrupted.

Raft Consensus

Raft is a consensus algorithm designed to be easier to understand than Paxos. Key mechanics:

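  • Each node is in one of three states: leader, follower, or candidate
  • The leader sends periodic heartbeats. A follower that hears nothing within its randomized election timeout becomes a candidate, increments the term, and requests votes.
  • A candidate wins if it gets votes from a majority (quorum) of the cluster. Each node grants at most one vote per term, so a term has at most one leader.
  • All writes go through the leader, which acknowledges a write only after a majority of nodes have stored it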
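
The voting rule above can be sketched in a few lines of Python. This is a simplified illustration, not a full Raft implementation: the names `Node`, `handle_vote_request`, and `run_election` are hypothetical, and the sketch assumes no log comparison, no timers, and no real network. It only shows why a partitioned minority cannot elect a leader.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None
    state: str = "follower"  # "follower", "candidate", or "leader"

    def handle_vote_request(self, term: int, candidate_id: str) -> bool:
        """Grant at most one vote per term and reject stale terms."""
        if term < self.current_term:
            return False
        if term > self.current_term:
            # A newer term clears the old vote and demotes this node.
            self.current_term = term
            self.voted_for = None
            self.state = "follower"
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, reachable_peers: List[Node], cluster_size: int) -> bool:
    """Start a new term; the candidate wins only with a strict majority."""
    candidate.current_term += 1
    candidate.state = "candidate"
    candidate.voted_for = candidate.node_id
    votes = 1  # the candidate votes for itself
    for peer in reachable_peers:
        if peer.handle_vote_request(candidate.current_term, candidate.node_id):
            votes += 1
    quorum = cluster_size // 2 + 1
    if votes >= quorum:
        candidate.state = "leader"
        return True
    return False  # real Raft would retry after another randomized timeout


# Five-node cluster partitioned into {a, b, c} and {d, e}.
a, b, c, d, e = (Node(n) for n in "abcde")
majority_side_won = run_election(a, [b, c], cluster_size=5)
minority_side_won = run_election(d, [e], cluster_size=5)
print(majority_side_won, minority_side_won)
```

Because the quorum is computed from the full cluster size rather than from the reachable nodes, the two-node side can never collect three votes, which is what prevents split-brain during a partition.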

Fencing Tokens

# Problem: an old leader still believes it is leader after a long pause (e.g. GC)
# Solution: a monotonically increasing fencing token attached to every write

# Zookeeper issues lock_token = 34
# Node A acquires the leader lock with token 34
# Node A pauses for 60s (GC) and its lock expires
# Node B becomes the new leader with token 35 and writes to storage
# Node A resumes and sends a write with token 34
# Storage server: reject, 34 is older than the highest token seen (35)
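
The check on the storage side can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: `FencedStorage` and `StaleTokenError` are hypothetical names, and a real storage service would need to persist the highest token and perform the comparison and the write atomically.

```python
from typing import Dict


class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class FencedStorage:
    def __init__(self) -> None:
        self.highest_token = 0
        self.data: Dict[str, str] = {}

    def write(self, token: int, key: str, value: str) -> None:
        # Reject any writer whose token is older than the newest one seen.
        if token < self.highest_token:
            raise StaleTokenError(
                f"token {token} rejected, current token is {self.highest_token}"
            )
        self.highest_token = token
        self.data[key] = value


storage = FencedStorage()
storage.write(34, "owner", "old-leader")   # old leader writes before its pause
storage.write(35, "owner", "new-leader")   # new leader takes over with token 35

rejected = False
try:
    storage.write(34, "owner", "old-leader")  # old leader resumes after the pause
except StaleTokenError:
    rejected = True

print(rejected, storage.data["owner"])
```

The important design choice is that the storage layer enforces the token, not the client. A paused node cannot know its lock has expired, so it cannot be trusted to stop itself.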

Practical Tools

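  • Zookeeper: coordination service used for distributed locks, leader election, and configuration storage
  • etcd: key-value store replicated with Raft; Kubernetes keeps its cluster state in etcd
  • Redis Redlock: distributed lock algorithm built on Redis (use with caution: it is not safe under all failure modes, such as long process pauses or clock jumps)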
