BuildBot

Scaling Primitives

Replication & consistency

Lesson 5 of 5

What you'll learn

  • Contrast leader/follower replication with sync vs async propagation
  • State CAP honestly: under partition you choose consistency or availability
  • Use the quorum rule R + W > N to guarantee reads see the latest write

Replication keeps copies of data on multiple nodes. It buys durability (a dead node doesn't lose data), read scale (spread reads across replicas), and availability (serve through a failure). The unavoidable cost is that copies disagree for some window — the moment you have more than one copy, you own a consistency problem.

write -> [leader] --replicate--> [follower] [follower]
reads  <- any replica (may be stale)

Leader/follower, sync vs async

In leader/follower (primary/replica), all writes go to the leader, which streams changes to followers. Reads can fan out to followers to scale. The open question is when a write is considered done.

Synchronous replication waits for follower(s) to acknowledge before the write returns. Followers are guaranteed current, so a read there is fresh — but write latency now includes the slowest acked replica, and if that replica is down the write blocks. Asynchronous replication returns as soon as the leader commits locally and ships changes in the background. Writes are fast and a slow follower can't stall them, but followers lag, so follower reads can be stale and a leader crash before propagation loses the unacked writes. Sync favors consistency and durability; async favors latency and availability.

CAP and the quorum rule

CAP: when a network partition splits the cluster, you must choose. A CP system refuses requests it can't make consistent (rejects writes on the minority side) — correct but unavailable. An AP system keeps serving on both sides and reconciles later — available but temporarily inconsistent. There is no third option during a partition; "CA" only describes a system that never partitions, which a distributed system isn't.

You don't have to pick all-or-nothing. With N replicas, require W acks to commit a write and read from R replicas, taking the newest version. If R + W > N, the read set and write set must overlap by at least one node — so any read is guaranteed to touch a replica that saw the latest write. This is how Dynamo-style stores tune the consistency/availability dial per request: W = N for strong writes, W = 1 for fast writes, and R + W > N whenever a read must observe the last acknowledged write.

N = 3
W = 2, R = 2  -> R + W = 4 > 3  -> overlap guaranteed, read sees latest
W = 1, R = 1  -> R + W = 2 <= 3 -> may miss the latest write (stale read)

Overlap is the guarantee, not the node count

R + W > N works because the write quorum and read quorum are forced to share at least one node — that shared node carries the freshest value. Say "overlapping quorums" in the interview; that phrase is what's being graded.

Quorum read/write

Run it. A write to W replicas tags a version; a read of R replicas takes the newest. With R + W > N the read sees the latest write; with R + W <= N it can miss it.

Loading editor…
Knowledge check

Why does the rule R plus W greater than N guarantee a read sees the latest acknowledged write?

Saved on this device. Sign in to sync your progress everywhere.