Advanced Distributed Systems Concepts, Lesson 6.3

Leader election and consensus in distributed systems

split-brain problem, Raft consensus, Paxos overview, ZooKeeper coordination, etcd leader election, fencing tokens, lease-based leader election

Why Leader Election Matters

Many distributed systems need exactly one node to perform a task at a time, for example a primary database, a job scheduler, or a shard coordinator. Without coordination, two nodes may both believe they are the leader (split-brain), and their conflicting writes can corrupt data.

Raft Consensus

Raft is a consensus algorithm designed to be easier to understand than Paxos. Key mechanics:

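Each node is in one of three states: leader, follower, or candidate.
The leader sends periodic heartbeats. If a follower hears nothing within its election timeout, it becomes a candidate and starts an election for a new term.
A candidate wins if it receives votes from a majority of the cluster (a quorum); each node grants at most one vote per term.
All writes go through the leader, which acknowledges a write only after replicating it to a majority of nodes.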
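
The voting rule above can be sketched in a few lines. This is a simplified illustration, not a full Raft implementation: the names Node, quorum, handle_vote_request, and run_election are hypothetical, and the sketch leaves out log comparison, timers, and networking.

```python
from dataclasses import dataclass
from typing import List, Optional


def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


@dataclass
class Node:
    node_id: str
    state: str = "follower"            # "follower", "candidate", or "leader"
    current_term: int = 0
    voted_for: Optional[str] = None    # who this node voted for in current_term

    def handle_vote_request(self, candidate_id: str, term: int) -> bool:
        """Grant at most one vote per term (log comparison omitted)."""
        if term < self.current_term:
            return False               # request comes from a stale term
        if term > self.current_term:
            # A newer term resets the vote and demotes this node to follower.
            self.current_term = term
            self.voted_for = None
            self.state = "follower"
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False                   # already voted for someone else


def run_election(candidate: Node, reachable_peers: List[Node],
                 cluster_size: int) -> bool:
    """Start a new term, vote for self, and ask every reachable peer."""
    candidate.state = "candidate"
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1                          # the candidate's own vote
    for peer in reachable_peers:
        if peer.handle_vote_request(candidate.node_id, candidate.current_term):
            votes += 1
    if votes >= quorum(cluster_size):
        candidate.state = "leader"
        return True
    return False                       # stays a candidate; would retry later


# Five-node cluster split by a network partition into {a, b, c} and {d, e}.
a, b, c, d, e = (Node(name) for name in "abcde")
majority_side_won = run_election(a, [b, c], cluster_size=5)
minority_side_won = run_election(e, [d], cluster_size=5)
```

In this partitioned cluster only the side holding three of the five nodes can elect a leader. Any two majorities of the same cluster share at least one node, and that node votes only once per term, so two leaders cannot be elected in the same term.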

Fencing Tokens

# Problem: an old leader still believes it is leader after a long pause (e.g. GC)
# Solution: a monotonically increasing fencing token, checked by the storage server

# ZooKeeper issues lock_token = 34
# Node A acquires the leader lock with token 34
# Node A pauses for 60s (GC); its lock expires
# Node B acquires the lock with token 35 and writes to storage with it
# Node A resumes and sends a write with token 34
# Storage server rejects the write: 34 is lower than the highest token seen (35)
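
The check on the storage side can be sketched as follows. This is a minimal in-memory illustration; FencedStorage and StaleTokenError are hypothetical names, and a real storage server would need to persist the highest token it has seen.

```python
class StaleTokenError(Exception):
    """Raised when a write carries a token older than one already seen."""


class FencedStorage:
    def __init__(self):
        self.highest_token = 0   # highest fencing token accepted so far
        self.data = {}

    def write(self, token: int, key: str, value: str) -> None:
        # Reject any writer whose token is older than the newest one seen.
        if token < self.highest_token:
            raise StaleTokenError(
                f"token {token} rejected, current token is {self.highest_token}"
            )
        self.highest_token = token
        self.data[key] = value


storage = FencedStorage()
storage.write(34, "job", "written by old leader")    # accepted
storage.write(35, "job", "written by new leader")    # accepted, raises the bar
try:
    storage.write(34, "job", "late write from old leader")
    stale_write_rejected = False
except StaleTokenError:
    stale_write_rejected = True                      # old leader is fenced off
```

The protection lives in the storage server, not in the old leader: a paused node cannot be relied on to notice that its lock has expired, so the resource being written to has to enforce the token order.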

Practical Tools

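ZooKeeper: distributed locks, leader election, configuration store
etcd: key-value store replicated with Raft; Kubernetes keeps all cluster state in it
Redis Redlock: distributed lock built on Redis (use with caution: it depends on timing assumptions and is not safe under all failure modes, such as clock drift or long process pauses)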
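
Each of these tools offers some form of lock that expires if its holder stops renewing it, which is the basis of lease-based leader election. The sketch below shows the idea against an in-memory stand-in. LeaseStore and try_acquire are hypothetical names and do not correspond to the API of ZooKeeper, etcd, or Redis; the clock is injectable so the behavior can be followed without waiting.

```python
import time
from typing import Callable, Optional


class LeaseStore:
    """In-memory stand-in for a coordination service that grants one lease."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._holder: Optional[str] = None
        self._expires_at = 0.0
        self._token = 0              # doubles as a fencing token

    def try_acquire(self, node_id: str, ttl: float) -> Optional[int]:
        """Acquire or renew the lease; return its token, or None if refused."""
        now = self._clock()
        lease_valid = self._holder is not None and now < self._expires_at
        if lease_valid and self._holder != node_id:
            return None              # another node still holds a live lease
        if not lease_valid:
            self._token += 1         # new leadership period, new token
        self._holder = node_id
        self._expires_at = now + ttl
        return self._token


fake_now = [0.0]                     # controllable clock for the example
store = LeaseStore(clock=lambda: fake_now[0])

first = store.try_acquire("a", ttl=10)       # "a" becomes leader
refused = store.try_acquire("b", ttl=10)     # lease still held by "a"
fake_now[0] = 5.0
renewed = store.try_acquire("a", ttl=10)     # renewal keeps the same token
fake_now[0] = 20.0                           # "a" stopped renewing; lease expired
takeover = store.try_acquire("b", ttl=10)    # "b" takes over with a higher token
```

A lease only bounds how long a failed leader blocks the others. A leader that pauses past its expiry may still act on a stale belief that it holds the lease, which is why the token returned here would be passed to storage as a fencing token.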