Databases and Storage Systems, Lesson 3.2

Database replication — primary-replica and multi-master patterns

synchronous vs asynchronous replication, read replicas, replication lag, multi-master conflicts, failover, write scalability

Why Replicate

Replication copies data from one database node to others. It serves two main purposes: high availability (a replica can take over if the primary fails) and read scaling (reads are distributed across replicas).

Primary-Replica Replication

All writes go to the primary, which replicates each change to one or more replicas. Reads can be served from the primary or from any replica.

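Synchronous: the primary waits for a replica to confirm the change before acknowledging the write. Acknowledged writes survive the loss of the primary, but write latency is higher and a slow or unreachable replica can stall writes.
Asynchronous: the primary acknowledges the write immediately and replicates in the background. Latency is lower, but writes that were acknowledged and not yet replicated are lost if the primary dies.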
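
The difference between the two modes can be illustrated with a small in-memory simulation. This is a minimal sketch, not how any particular database implements replication: the Primary and Replica classes, the flush() step, and the acknowledged_but_lost() helper are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica holding the entries it has applied."""
    log: list = field(default_factory=list)

    def apply(self, entry: str) -> None:
        self.log.append(entry)


@dataclass
class Primary:
    """Hypothetical primary; `synchronous` selects the acknowledgement mode."""
    replica: Replica
    synchronous: bool
    log: list = field(default_factory=list)
    pending: list = field(default_factory=list)  # written locally, not yet shipped

    def write(self, entry: str) -> str:
        self.log.append(entry)
        if self.synchronous:
            # Wait for the replica to apply the change before acknowledging.
            self.replica.apply(entry)
        else:
            # Acknowledge now; the change is shipped later by flush().
            self.pending.append(entry)
        return "ack"

    def flush(self) -> None:
        """Background replication step used in asynchronous mode."""
        while self.pending:
            self.replica.apply(self.pending.pop(0))


def acknowledged_but_lost(synchronous: bool) -> list:
    """Write two entries, crash the primary before any flush, report lost writes."""
    replica = Replica()
    primary = Primary(replica=replica, synchronous=synchronous)
    acked = [e for e in ("x=1", "x=2") if primary.write(e) == "ack"]
    # The primary crashes here: only the replica's log survives.
    return [e for e in acked if e not in replica.log]
```

In this sketch, acknowledged_but_lost(True) returns an empty list, while acknowledged_but_lost(False) returns both entries, because the primary "crashed" before flush() ran. A real system would also need timeouts and a policy for what to do when the synchronous replica is unreachable.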

Replication Lag

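Asynchronous replication creates lag: replicas may be seconds behind the primary, or more under heavy load. A user who writes to the primary and then immediately reads from a replica may therefore get stale data and not see their own write. To provide read-after-write consistency, serve that user's reads from the primary for a short window after they write, or only from a replica that has already applied the write. For critical operations, always read from the primary.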
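
One way to route reads so that users see their own writes is to track the log position of each user's latest write and compare it with how far each replica has caught up. The sketch below assumes a monotonically increasing log position per write; Node, Router, pick_node() and replicate() are hypothetical names, and replicate() stands in for real asynchronous replication.

```python
class Node:
    """Hypothetical node: a key-value store plus the last log position applied."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied = 0


class Router:
    """Sends writes to the primary and picks a safe node for each user's reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas
        self.last_write = {}  # user -> log position of that user's latest write

    def write(self, user, key, value):
        self.primary.applied += 1
        self.primary.data[key] = value
        self.last_write[user] = self.primary.applied

    def pick_node(self, user):
        """Return a replica that has applied the user's last write, else the primary."""
        needed = self.last_write.get(user, 0)
        for replica in self.replicas:
            if replica.applied >= needed:
                return replica
        return self.primary

    def read(self, user, key):
        return self.pick_node(user).data.get(key)


def replicate(primary, replica):
    """Stand-in for asynchronous replication: copy the primary's state to a replica."""
    replica.data = dict(primary.data)
    replica.applied = primary.applied
```

With one lagging replica, a user who has just written is routed to the primary, while a user who has not written anything is still served by the replica and may see older data. Once replicate() has run, the writer's reads move back to the replica.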

Multi-Master Replication

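Multiple nodes accept writes at the same time. This improves write availability, because a write can succeed even when another node is unreachable, but it introduces conflict resolution complexity. When two nodes accept conflicting writes to the same record, the conflict must be resolved by last-write-wins (simple, but one of the writes is silently discarded), CRDTs (data types whose replicas merge automatically), or application-level resolution. Use multi-master only when write availability matters more than simplicity.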
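
Two of the resolution strategies can be sketched in a few lines. The example below is illustrative only: it assumes each write carries a (timestamp, node_id, value) tuple for last-write-wins, and it uses a grow-only counter as a simple example of a CRDT. The names last_write_wins and GCounter are chosen for this sketch and do not come from any specific database.

```python
def last_write_wins(a, b):
    """Pick the winner of two (timestamp, node_id, value) versions.

    The higher timestamp wins; node_id breaks ties so that every node
    picks the same winner. The losing write is discarded.
    """
    return max(a, b, key=lambda version: (version[0], version[1]))


class GCounter:
    """Grow-only counter CRDT: one slot per node, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount=1):
        # Each node only ever increases its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other):
        # Taking the max per slot makes merging order-independent and repeatable.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())
```

The contrast is the point: last_write_wins keeps only one of the two values, whereas two GCounter instances that each accept increments and then merge in either order arrive at the same total without losing any update. Last-write-wins also depends on timestamps being comparable across nodes, which clock skew can undermine.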