Redis Failure — Profile Cache
Profile cache failure — preventing a DynamoDB cascade
When the profile cache Redis goes down, every profile read misses and falls through to DynamoDB. Without protection, this creates a thundering herd on DynamoDB that could take it down too. Request coalescing prevents the cascade.
The cascade risk#
The profile cache Redis goes down. Every inbox load now needs to fetch 20 profiles from DynamoDB instead of Redis. At peak:
5,500 users/second opening inbox × 20 profiles = 110,000 profile reads/second
× 20% unique = 22,000 unique profile reads/second hitting DynamoDB
We showed earlier that 22,000 RPS is within DynamoDB's capacity during cold start. But the profile cache failure is worse than a cold start — it happens suddenly, with no warm-up period, potentially during peak traffic.
More critically: if multiple users open their inbox at the same moment and all need Charlie's profile, without coalescing that's thousands of identical DynamoDB reads for the same row simultaneously.
Request coalescing — the singleflight pattern#
Request coalescing ensures that for any given profile key, only one DynamoDB read is in-flight at any moment. All other requests for the same key wait for that one result.
10,000 inbox loads all need user:charlie → cache miss (Redis down)
→ Thread 1: no in-flight request for charlie → starts DynamoDB fetch
→ Threads 2-9999: in-flight request exists → attach to Thread 1's future
→ DynamoDB returns Charlie's profile
→ All 10,000 threads get the result simultaneously
→ Result: 1 DynamoDB read, not 10,000
Implementation in Java using CompletableFuture:
ConcurrentHashMap<String, CompletableFuture<UserProfile>> inFlight
= new ConcurrentHashMap<>();
public CompletableFuture<UserProfile> getProfile(String userId) {
// Check cache first
UserProfile cached = redis.get("user:" + userId);
if (cached != null) {
return CompletableFuture.completedFuture(cached);
}
// Cache miss — coalesce in-flight requests
return inFlight.computeIfAbsent(userId, key ->
fetchFromDynamo(key)
.whenComplete((result, ex) -> {
inFlight.remove(key);
if (result != null) {
redis.set("user:" + key, result); // re-cache if Redis recovers
}
})
);
}
computeIfAbsent is atomic — only one thread creates the CompletableFuture for a given key. Every other thread calling computeIfAbsent with the same key gets back the existing future and waits on it.
Why this is per app server, not distributed#
The in-flight map lives in the app server's memory. It deduplicates requests within one server — not across all servers.
If you have 100 app servers, each one might independently send one request to DynamoDB for user:charlie. That's 100 reads instead of 10,000 — a 100x reduction, which is enough to keep DynamoDB safe.
A distributed coalescing layer (e.g. a shared cache in front of DynamoDB) would reduce it further to 1 read globally, but the per-server approach is simpler and sufficient at this scale.
Circuit breaker as the last line of defence#
If DynamoDB starts struggling despite coalescing — error rate rises above 1% — the circuit breaker opens. New profile reads fail fast, the app server returns a degraded response (inbox loads without profile names/avatars), and DynamoDB is protected from further load.
Profile cache down → coalescing limits DynamoDB reads → DynamoDB survives
If DynamoDB still struggles → circuit breaker opens → fail fast → DynamoDB protected
Interview framing
"Profile cache failure triggers a cold start on DynamoDB. Request coalescing (singleflight pattern) means each unique profile generates at most one DynamoDB read per app server at any moment — reducing 10,000 simultaneous reads to ~100 across the fleet. If DynamoDB still struggles, the circuit breaker opens as a last resort."