Estimation
Assumptions#
New keys per day: 1B (1 billion new key-value pairs written daily)
Key size: 16 bytes (string key)
Value size: 1 KB (opaque byte blob — store doesn't parse it)
Total per record: ~1 KB (key + metadata are negligible vs value)
Read:write ratio: 10:1 (read-heavy, but not as extreme as URL Shortener)
Peak multiplier: 3× (infrastructure has smoother traffic than consumer apps)
Retention: 10 years (data stored permanently, planning horizon of 10 years)
Replication factor: 3 (standard for durability in distributed DBs)
Why 1 KB for value size? Most KV store use cases — user profiles, session data, feature flags, IoT readings — produce small values. DynamoDB caps items at 400 KB, but the average item is well under 1 KB. Using 1 KB is a defensible, commonly used assumption in interviews.
Why 10:1 read:write ratio? A KV store serving something like user profiles or config data gets read far more than it gets updated. 10:1 is moderate — not as extreme as a URL Shortener (1000:1) because KV stores also serve write-heavy workloads. If the interviewer says "optimize for writes," you'd flip this closer to 1:1.
Why 3× peak, not 10×? A general-purpose KV store is infrastructure — many services hit it with relatively smooth traffic patterns. A 10× spike makes sense for consumer apps (WhatsApp on New Year's Eve), but infrastructure behind a load balancer across many services smooths out individual spikes. 3× is the standard multiplier.
QPS#
Writes/day: 1B
Avg write QPS: 1B / 86,400 ≈ ~11.5K → round to 10K writes/sec
Peak write QPS: 10K × 3 = 30K writes/sec
Avg read QPS: 10K × 10 = 100K reads/sec
Peak read QPS: 100K × 3 = 300K reads/sec
At 300K peak reads/sec, a single machine can't handle this — even a well-tuned server tops out around 50K–100K simple reads/sec. This immediately tells us we need multiple nodes handling reads, which means partitioning (consistent hashing) and replication.
At 30K peak writes/sec, a single machine also can't keep up with durable disk writes at this rate. Writes need to be distributed across nodes too.
Storage#
Data per day: 1B × 1 KB = 1 TB/day
Data per year: 1 TB × 365 ≈ 400 TB/year (rounded up)
10-year total (raw): 400 TB × 10 = 4 PB
With 3× replication: 4 PB × 3 = 12 PB total storage
12 PB is massive. If a single machine holds 10 TB of data comfortably, we need:
This means consistent hashing and automatic data partitioning aren't optional — they're the only way to manage data placement across 1,200+ nodes. You can't manually assign which key goes where. The ring does it for you, and virtual nodes (vnodes) keep the distribution even.
Bandwidth#
Reads (peak):
300K/sec × 1 KB = 300 MB/s × 8 = 2.4 Gbps
Writes (peak):
30K/sec × 1 KB = 30 MB/s × 8 = 240 Mbps
2.4 Gbps peak read bandwidth is well within a standard 10 Gbps NIC (and this is spread across many nodes, not a single machine). Bandwidth is not a bottleneck for this system.
Summary#
Avg write QPS: 10K/sec | Peak: 30K/sec
Avg read QPS: 100K/sec | Peak: 300K/sec
Read:write ratio: 10:1
Storage (10 yrs): 4 PB raw → 12 PB with replication
Nodes needed: ~1,200+ (at 10 TB per machine)
Peak bandwidth: 2.4 Gbps reads, 240 Mbps writes → not a bottleneck
Architecture decisions this forces#
300K peak reads/sec → single node can't serve this — must partition reads
across nodes (consistent hashing + replication)
30K peak writes/sec → single node can't absorb this — need distributed
writes with LSM Tree (memtable absorbs bursts)
12 PB total storage → ~1,200 nodes minimum — consistent hashing with
vnodes for automatic data placement
1 KB value size → small enough to store inline (no separate blob store
needed, unlike Pastebin's 10 KB pastes)
Interview framing
"1 billion new keys per day, 1 KB average value size. 10:1 read:write ratio gives us 10K writes/sec and 100K reads/sec average — 30K and 300K at 3× peak. Over 10 years with 3× replication, that's 12 PB — about 1,200 machines at 10 TB each. Bandwidth is 2.4 Gbps peak reads, well within modern NICs. The key takeaway: at this scale, consistent hashing for partitioning and LSM Trees for write throughput aren't optional — they're survival requirements."