Cache Replication

Cache replicas serve two purposes: availability and read throughput. The trade-off is replication lag, which means a replica can briefly serve stale data.

Why replicate the cache

Availability:

Primary cache node goes down
Without replication: entire cache gone → all requests hit DB → DB overloaded, may collapse
With replication:    a replica is promoted → cache stays up → DB protected ✓

Read throughput:

10,000 cache reads/second on one node → node CPU saturated
With 3 replicas serving the reads: ~3,333 reads/second each → headroom restored ✓
Primary is no longer the bottleneck; replicas absorb the read load
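The arithmetic above can be sketched as a small helper. This is only an illustration that assumes reads are spread perfectly evenly; the function name `reads_per_node` and the `primary_serves_reads` flag are hypothetical, not part of any library.

```python
def reads_per_node(total_reads_per_sec: float, replicas: int,
                   primary_serves_reads: bool = False) -> float:
    """Average read load per node, assuming reads are spread evenly."""
    nodes = replicas + (1 if primary_serves_reads else 0)
    if nodes <= 0:
        raise ValueError("need at least one node serving reads")
    return total_reads_per_sec / nodes


print(round(reads_per_node(10_000, 3)))  # 3333
print(round(reads_per_node(10_000, 2)))  # 5000 (same load after losing one replica)
```

The second call shows why headroom matters: if one of the three replicas fails, each survivor has to absorb about 5,000 reads/second.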

Replication lag

With async replication (the default), replicas are slightly behind the primary. For cache data, this is almost always acceptable:

User updates setting → primary updated → replica 50ms behind
→ user's next read might hit replica → sees old value for 50ms
→ usually unnoticeable to the user in practice

The same read-your-own-writes issue exists here as in DB replication. The same fix applies: route a user's reads to the primary for a short window after they write.
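One way to sketch that fix is a router that remembers each user's last write and pins their reads to the primary for a short window. The class name `ReadRouter`, the 1-second default window, and the `"primary"`/`"replica"` labels are assumptions for illustration; a real system would return actual connections and expire old entries.

```python
import time


class ReadRouter:
    """Routes a user's reads to the primary for a short window after they write."""

    def __init__(self, pin_seconds: float = 1.0, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self._clock = clock          # injectable so the window can be tested
        self._last_write = {}        # user_id -> timestamp of last write

    def record_write(self, user_id: str) -> None:
        """Call this whenever the user writes through the primary."""
        self._last_write[user_id] = self._clock()

    def target_for_read(self, user_id: str) -> str:
        """Return which node type should serve this user's next read."""
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return "primary"         # recent write: replica may still be stale
        return "replica"             # no recent write: any replica is fine


router = ReadRouter(pin_seconds=1.0)
router.record_write("user-42")
print(router.target_for_read("user-42"))  # primary
print(router.target_for_read("user-7"))   # replica
```

The window only needs to exceed the typical replication lag, so a value well above the 50ms in the example leaves a comfortable margin while keeping most reads on replicas.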

Redis-specific replication

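Redis Sentinel monitors the primary and automatically promotes a replica on failure:

Primary dies → Sentinels agree it is down (~10s if down-after-milliseconds is tuned to about that) → best replica is promoted → clients ask Sentinel for the new primary's address
~10-30 second failover window ← brief period of failed writes and cache misses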
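During that failover window the application can treat the cache as optional and read from the database instead. The following is a minimal cache-aside sketch under stated assumptions: `CacheUnavailable`, `get_with_fallback`, and the `cache_get`/`cache_set`/`db_get` callables are hypothetical stand-ins for whatever cache client and data layer are actually in use.

```python
class CacheUnavailable(Exception):
    """Hypothetical error raised when the cache primary cannot be reached."""


def get_with_fallback(key, cache_get, db_get, cache_set=None):
    """Cache-aside read that degrades to the DB while the cache is failing over."""
    try:
        value = cache_get(key)
        if value is not None:
            return value                 # cache hit
    except CacheUnavailable:
        return db_get(key)               # cache down: serve from the DB, skip repopulating

    value = db_get(key)                  # cache miss: load from the DB
    if cache_set is not None:
        try:
            cache_set(key, value)        # best-effort repopulation
        except CacheUnavailable:
            pass                         # cache went away mid-request; still return the value
    return value
```

This keeps requests succeeding during failover, but it also sends the full read load to the database for those seconds, so the database still needs enough headroom to survive the window.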

Redis Cluster combines sharding and replication:

Keyspace divided into 16,384 hash slots
Each primary owns a subset of the slots and has one or more replicas
Cluster-aware clients route each key to the node that owns its slot
Node failure → one of that primary's replicas is promoted → only that shard's slots are briefly affected
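The key-to-slot mapping can be sketched in a few lines. Redis Cluster computes CRC16 (XMODEM variant) of the key modulo 16,384, and hashes only the part inside `{...}` when the key contains a non-empty hash tag. The function name `hash_slot` is an assumption for illustration; Python's standard `binascii.crc_hqx` with an initial value of 0 provides the same CRC variant.

```python
import binascii

HASH_SLOTS = 16384


def hash_slot(key: str) -> int:
    """Map a key to one of the 16,384 Redis Cluster hash slots."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' is used.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


print(hash_slot("123456789"))  # 12739
print(hash_slot("{user1000}.following") == hash_slot("{user1000}.followers"))  # True
```

Hash tags are what let related keys land on the same shard, which matters because multi-key operations in a cluster generally require all keys to share a slot.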

Interview answer

"I'd run Redis with at least one replica per primary. Redis Sentinel handles automatic failover: if the primary dies, Sentinel promotes a replica, typically within 10-30 seconds depending on configuration. During that window the cache is unavailable (with Redis Cluster, only for the failed shard's key range), but the system can fall back to the DB."