Distributed Caching — Interview Cheatsheet#


The standard answer for "how do you scale your cache?"#

"I'd distribute the cache across multiple nodes using consistent hashing so adding or removing nodes only remaps ~1/N of keys. Each node has a replica for availability — Redis Sentinel handles automatic failover. For extreme read volume, I'd add a local in-process L1 cache on each app server to absorb the hottest keys without a network round trip."


One-line definitions#

Consistent Hashing

Keys and nodes are mapped onto a ring; each key routes to the first node clockwise from its hash. Adding or removing a node remaps only ~1/N of keys, versus nearly all keys ((N−1)/N) under naive modulo hashing.

Virtual Nodes

Each physical node gets 150-200 positions on the ring for even key distribution. Prevents one node owning a disproportionate share of the keyspace.
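A minimal sketch of both ideas together — a consistent hash ring where each physical node contributes many virtual positions. The class and method names are illustrative, not from any particular library:

```python
import bisect
import hashlib

class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, vnodes=150):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        # Each physical node gets `vnodes` positions on the ring,
        # which evens out each node's share of the keyspace.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key: str) -> str:
        # Route to the first ring position clockwise from the key's hash.
        h = self._hash(key)
        idx = bisect.bisect_right(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]
```

Because removing a node only deletes its ring positions, any key whose clockwise-first node survives keeps routing to the same place — that is the "~1/N remapped" property.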

Cache Coherence

Keeping multiple cache replicas in sync. Async replication is the default — brief stale window. Sync replication is consistent but slow.

Two-Level Caching (L1 + L2)

Local in-process cache (nanoseconds, per-server) + Redis (1ms, shared). L1 eliminates Redis round trips for the hottest keys.
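A two-level read path can be sketched in a few lines. This is a toy: in a real deployment `l2` would be a Redis client (e.g. redis-py) rather than the dict used here as a stand-in, and the short L1 TTL bounds how stale the in-process copy can get:

```python
import time

class TwoLevelCache:
    """L1: in-process dict with a short TTL. L2: shared store (dict stands in
    for Redis here). Illustrative sketch, not a production client."""

    def __init__(self, l2, l1_ttl=5.0):
        self.l1 = {}          # key -> (value, expires_at)
        self.l2 = l2          # any dict-like shared store
        self.l1_ttl = l1_ttl  # short TTL bounds L1 staleness

    def get(self, key):
        hit = self.l1.get(key)
        if hit is not None and hit[1] > time.monotonic():
            return hit[0]                 # L1 hit: no network round trip
        value = self.l2.get(key)          # L2 hit: ~1ms against real Redis
        if value is not None:
            self.l1[key] = (value, time.monotonic() + self.l1_ttl)
        return value

    def set(self, key, value):
        self.l2[key] = value
        self.l1[key] = (value, time.monotonic() + self.l1_ttl)
```

The design trade-off to mention in an interview: L1 serves the hottest keys at memory speed, at the cost of a brief per-server stale window equal to the L1 TTL.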


Key numbers to know#

L1 local cache   → nanoseconds
L2 Redis         → ~1ms
DB query         → ~10ms+

Single Redis node capacity → typically 32-64GB RAM
Single Redis node throughput → ~100,000 ops/sec

Failure scenarios and answers#

| Failure | Impact | Fix |
| --- | --- | --- |
| Cache node dies, no replicas | ~1/N of keys become DB misses | Consistent hashing limits blast radius |
| Cache node dies, with replicas | Zero impact; replica promotes | Redis Sentinel handles failover automatically |
| Mass key expiry | Cache-miss spike hammers the DB | TTL jitter, refresh-ahead |
| Scale out (add node) | ~1/N of keys remap | Consistent hashing minimises disruption |
| Cold start on recovery | Keys remap back to the empty node, causing misses | Warm the node up before returning it to live traffic |
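The TTL-jitter fix for mass expiry is a one-liner worth knowing cold. The constants below (one-hour base, ±10% jitter) are illustrative assumptions, not values from the source:

```python
import random

BASE_TTL = 3600  # assumed base TTL: one hour

def jittered_ttl(base=BASE_TTL, jitter=0.10):
    """Spread expiries over +/-10% of the base TTL so keys written
    together (e.g. on a deploy) don't all expire in the same instant."""
    return int(base * (1 + random.uniform(-jitter, jitter)))
```

Each key written at the same moment now expires at a slightly different time, turning a synchronized miss spike into a smear of misses the DB can absorb.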