Replication Goals: Availability, Throughput, Geolocation
What This Concept Is
Replication means keeping a copy of the same data on more than one machine. "Replication" is a single word for at least three different goals, and every topology decision later in this module depends on which goal you are optimizing for.
The three canonical reasons to replicate:
- Availability: if one replica is unreachable (crashed, rebooting, network-isolated), another can still answer. The system survives single-node failure.
- Read throughput: reads can be served from any replica, so read capacity grows roughly linearly with replica count; one write becomes N readable copies. Read-heavy workloads (analytics dashboards, news feeds, static catalogs) benefit most.
- Geolocation: placing a replica near the user reduces read latency from hundreds of milliseconds (trans-oceanic) to a few milliseconds (same-region). Most large services with a global audience are geo-replicated for this reason.
Each goal pulls in a different direction. Availability wants fast failover. Throughput wants many cheap read replicas. Geolocation wants replicas spread across the globe, which makes synchronous replication painful because of the speed of light.
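The speed-of-light constraint can be put in numbers. The sketch below estimates the physical floor on a round trip between two replicas. It assumes light travels through fiber at about two-thirds of its vacuum speed and that the cable follows a straight path; the distances are illustrative only. Real round trips are higher once routing and queuing are added, and a synchronous write pays at least one round trip before it can be acknowledged.

```python
# Rough lower bound on round-trip time over optical fiber.
# Assumptions (not from the text): light in fiber moves at about 2/3 of its
# vacuum speed, and the cable follows the shortest path between the sites.

SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum
FIBER_FACTOR = 2 / 3                   # assumed slowdown inside glass


def min_round_trip_ms(distance_km: float) -> float:
    """Return the physical floor on round-trip time, in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


# Illustrative distances, chosen for the example only.
for label, km in [("same region", 100), ("trans-oceanic", 6000)]:
    print(f"{label:>14}: at least {min_round_trip_ms(km):.1f} ms per round trip")
```

No amount of hardware removes this floor, which is why geo-replicated systems usually replicate asynchronously across regions.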
Why It Matters Here
The later clusters pick specific topologies (single-leader, multi-leader, leaderless) and log formats (statement, logical, physical). None of those choices make sense unless you know which of the three goals you were trying to satisfy. A database architect who replicates "because that's what you do" ends up with a system that adds complexity without buying any specific guarantee.
You also inherit a tax: every replica is another failure domain, another clock to manage, another place a partial write can disagree with the rest of the cluster.
Concrete Example
Consider a customer-facing web app serving three user populations:
- US users reading order history: replication serves throughput (one leader, five read followers in US-East).
- Retail staff writing orders during a rack reboot: replication serves availability (automated failover from a failed leader to a hot standby).
- EU users browsing the same catalog: replication serves geolocation (a read replica in eu-west-1 so European browsers see <20 ms TTFB instead of >150 ms).
Three goals, three replica roles, all labelled "replication." The architect should be able to point at every replica and name which goal it exists for.
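One way to make "name the goal for every replica" checkable is to record the goal next to each replica in an inventory. This is a minimal sketch: the replica names, the region strings, and the `unjustified` helper are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Goal(Enum):
    AVAILABILITY = "availability"
    THROUGHPUT = "throughput"
    GEOLOCATION = "geolocation"


@dataclass(frozen=True)
class Replica:
    name: str              # hypothetical host name
    region: str
    goal: Optional[Goal]   # None means nobody has named a goal yet


# Hypothetical inventory loosely modelled on the example above.
INVENTORY = [
    Replica("orders-read-1", "us-east-1", Goal.THROUGHPUT),
    Replica("orders-standby", "us-east-1", Goal.AVAILABILITY),
    Replica("catalog-eu", "eu-west-1", Goal.GEOLOCATION),
    Replica("orders-misc", "us-east-1", None),
]


def unjustified(replicas: List[Replica]) -> List[str]:
    """Return the names of replicas that exist for no stated goal."""
    return [r.name for r in replicas if r.goal is None]


print(unjustified(INVENTORY))
```

Any name this function returns is a replica that adds cost and failure modes without a stated benefit.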
Common Confusion / Misconception
"Replication gives me high availability." Not automatically. Replication gives you a copy of the data on another machine. Turning that into availability requires failover tooling, client retry logic, and a way to decide which replica is authoritative right now. An asynchronous replica can be minutes behind; promoting it does not restore the last minute of writes.
"Replication scales writes." Only leaderless or multi-leader does, and both pay for it with conflicts and weak consistency. Classical single-leader replication scales reads. Writes still serialize through one node.
How To Use It
When someone proposes replication, ask:
- Which goal: availability, throughput, or geolocation?
- Is the data read-heavy (replication helps) or write-heavy (replication mostly hurts)?
- How stale may a replica be? Milliseconds? Seconds? "Eventually"?
- What happens on the minority side of a network partition? Can reads continue? Can writes?
- How do clients discover the current leader (or any healthy replica)?
If the proposer cannot answer those five questions, the design is not finished.
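The staleness question becomes concrete once the answer is a bound that read routing enforces. The sketch below assumes each replica reports its health and its lag behind the leader in seconds; `ReplicaStatus` and `pick_replica` are hypothetical names, and how the lag is measured is left open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ReplicaStatus:
    name: str
    healthy: bool
    lag_s: float  # assumed: seconds behind the leader, as reported by monitoring


def pick_replica(replicas: List[ReplicaStatus], max_staleness_s: float) -> Optional[str]:
    """Return the least-lagged healthy replica within the bound, else None."""
    eligible = [r for r in replicas if r.healthy and r.lag_s <= max_staleness_s]
    if not eligible:
        return None  # caller must fall back to the leader or fail the read
    return min(eligible, key=lambda r: r.lag_s).name


# Invented status snapshot for illustration.
statuses = [
    ReplicaStatus("replica-a", healthy=True, lag_s=0.2),
    ReplicaStatus("replica-b", healthy=True, lag_s=5.0),
    ReplicaStatus("replica-c", healthy=False, lag_s=0.0),
]

print(pick_replica(statuses, max_staleness_s=1.0))
print(pick_replica(statuses, max_staleness_s=0.1))
```

A design that answers "eventually" has no value to pass as `max_staleness_s`, which is a sign the question has not really been answered.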
Check Yourself
- Give three distinct reasons to replicate that do not reduce to each other.
- Why does geo-replication make synchronous replication expensive?
- Why does "we have replicas" not imply "we have high availability"?
- Which workload benefits least from read replicas?
Mini Drill or Application
For each system, name the primary replication goal (A/T/G) and sketch the topology:
- A banking ledger with one write region in us-east-1, secondary read replicas in the same region.
- A CDN-backed catalog for a global retailer with users in 40 countries.
- A chat application where every user expects to see their own message appear within 100 ms regardless of location.
- A GitHub-style code-search index where a view that is up to 1 second stale is acceptable.
- A stock-exchange matching engine that cannot tolerate any stale reads.