
URL Shortener Scale Estimation

The goal of estimation

Estimation is not about getting exact numbers. It is about understanding the scale of the system so every design decision that follows is justified. A single machine or a thousand? Cache or no cache? One DB or sharded? Estimation answers all of that.


Assumptions: always state these out loud

Before touching a single number, state your assumptions explicitly. The interviewer needs to follow your reasoning.

MAU         → 100M users
DAU         → 30% of MAU are daily active = 30M DAU
URL creators → 30% of DAU create URLs = 9M ≈ 10M users/day
URLs/user    → 9 URLs per creator per day

Write QPS

URLs created per day = 10M users × 9 URLs = 90M/day
Seconds in a day     = 86,400 ≈ 10^5 (round up for easier math)

Write QPS = 90M / 100,000 = 900 writes/second ≈ 1k writes/second
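The write QPS arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `qps` and the constant names are illustrative, and the 90M/day figure is taken directly from the estimate above. It also shows how little the 10^5 rounding changes the answer.

```python
SECONDS_PER_DAY = 24 * 60 * 60        # 86,400 exact
ROUNDED_SECONDS_PER_DAY = 10**5       # the rounding used in the estimate


def qps(events_per_day: int, seconds_per_day: int = SECONDS_PER_DAY) -> float:
    """Average queries per second for a given daily event count."""
    return events_per_day / seconds_per_day


urls_per_day = 90_000_000             # from the estimate above

exact_write_qps = qps(urls_per_day)                              # about 1,042
rounded_write_qps = qps(urls_per_day, ROUNDED_SECONDS_PER_DAY)   # 900

print(f"exact:   {exact_write_qps:,.0f} writes/sec")
print(f"rounded: {rounded_write_qps:,.0f} writes/sec")
```

Both results land at roughly 1k writes/second, which is all the precision the estimate needs.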

Read QPS

URL shorteners are extremely read-heavy. A single viral link can be clicked millions of times. A ratio of 100x reads to writes is a reasonable assumption.

Read QPS = 1k × 100 = 100k reads/second (average)

Average vs peak

100k/sec is the average. URL shorteners see massive traffic spikes: a celebrity tweets a link and traffic jumps 10x within seconds, so peak read QPS can reach 1M+/sec. This is why caching becomes critical: you cannot hit the database on every redirect at peak load.
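As a rough illustration of why the cache matters at peak, the sketch below derives average and peak read QPS from the numbers above, then estimates how many reads would still reach the database. The 5% cache miss rate is a hypothetical value chosen for illustration; it does not come from the estimate itself.

```python
write_qps = 1_000          # from the write estimate
read_write_ratio = 100     # assumed 100x reads to writes
peak_multiplier = 10       # assumed 10x viral spike

avg_read_qps = write_qps * read_write_ratio        # 100,000
peak_read_qps = avg_read_qps * peak_multiplier     # 1,000,000

# Hypothetical: 5% of redirects miss the cache and fall through to the DB.
cache_miss_percent = 5
db_read_qps_at_peak = peak_read_qps * cache_miss_percent // 100   # 50,000

print(f"average reads/sec:      {avg_read_qps:,}")
print(f"peak reads/sec:         {peak_read_qps:,}")
print(f"DB reads/sec at peak:   {db_read_qps_at_peak:,}")
```

Under that assumed miss rate, the database sees about 50k reads/second at peak instead of 1M.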


Storage

Each URL entry stores:

Short URL code  →  ~50 bytes
Long URL        →  ~250 bytes  (average URL length)
ID + metadata   →  ~200 bytes  (timestamps, user info, expiry)

Total per entry →  ~500 bytes

Writes per day  = 90M entries/day
Writes per year = 90M × 365 ≈ 30B entries/year
Peak year       = ~50B entries (buffer for growth)

Storage per year = 50B × 500 bytes = 25,000 GB = 25TB/year
Storage for 10 years = 250TB
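The storage math can be reproduced as a short script. This is a sketch using the figures above with decimal units (1 TB = 10^12 bytes); the variable names are illustrative.

```python
# Per-entry size in bytes, from the breakdown above.
SHORT_CODE_BYTES = 50
LONG_URL_BYTES = 250
METADATA_BYTES = 200
BYTES_PER_ENTRY = SHORT_CODE_BYTES + LONG_URL_BYTES + METADATA_BYTES   # 500

TB = 10**12   # decimal terabyte

entries_per_day = 90_000_000
entries_per_year = entries_per_day * 365      # 32.85B, rounded to ~30B above
planned_entries_per_year = 50_000_000_000     # ~50B, with buffer for growth

storage_per_year_tb = planned_entries_per_year * BYTES_PER_ENTRY / TB   # 25.0
storage_10_years_tb = storage_per_year_tb * 10                          # 250.0

print(f"entries/year (raw): {entries_per_year:,}")
print(f"storage/year:       {storage_per_year_tb:,.0f} TB")
print(f"storage/10 years:   {storage_10_years_tb:,.0f} TB")
```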

Common mistake

Do not confuse the number of records with the storage size. 50 billion records × 500 bytes = 25TB — not 50GB. Always multiply record count by record size.

250TB over 10 years cannot fit on a single machine. This tells you upfront that the database will need to be sharded. You don't design sharding now — but you flag it so the interviewer knows you see it coming.
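To make the "cannot fit on a single machine" point concrete, here is a back-of-envelope shard count. Both the usable capacity per node (4 TB) and the replication factor (3) are hypothetical numbers picked only for illustration; real values depend on the hardware and database chosen later.

```python
import math

total_storage_tb = 250        # 10-year estimate from above

# Hypothetical values, not from the estimate:
usable_tb_per_node = 4        # usable data capacity per database node
replication_factor = 3        # copies of each shard

shards = math.ceil(total_storage_tb / usable_tb_per_node)   # 63
total_nodes = shards * replication_factor                   # 189

print(f"shards needed: {shards}")
print(f"nodes needed:  {total_nodes}")
```

The exact count is not the point. Any plausible per-node capacity gives far more than one machine, which is the signal to flag sharding early.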


Bandwidth

On every redirect, the system sends the long URL back over the network.

Read QPS        = 100k requests/second
Payload per req = ~300 bytes (long URL + headers)

Bandwidth = 100k × 300 bytes = 30 MB/s = 240 Mbps

240 Mbps is the average. Even a 10x peak of about 2.4 Gbps is well within the range of modern infrastructure, so bandwidth is not a bottleneck for this system.
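The bandwidth conversion is easy to get wrong because it mixes bytes and bits. The sketch below walks through it using the figures above, and also applies the 10x peak multiplier assumed earlier in the read QPS section.

```python
read_qps = 100_000       # average redirects per second
payload_bytes = 300      # long URL + headers, per response
peak_multiplier = 10     # assumed viral spike factor

bytes_per_sec = read_qps * payload_bytes          # 30,000,000 bytes/sec
megabytes_per_sec = bytes_per_sec / 10**6         # 30.0 MB/s
megabits_per_sec = bytes_per_sec * 8 / 10**6      # 240.0 Mbps (8 bits per byte)

peak_gigabits_per_sec = megabits_per_sec * peak_multiplier / 1000   # about 2.4 Gbps

print(f"average: {megabytes_per_sec:.0f} MB/s = {megabits_per_sec:.0f} Mbps")
print(f"peak:    {peak_gigabits_per_sec:.1f} Gbps")
```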


Summary

Metric              Value
Write QPS           ~1k/sec
Read QPS            ~100k/sec (avg), ~1M/sec (peak)
Storage             ~25TB/year
Storage (10 years)  ~250TB
Bandwidth           ~240 Mbps

Key implications:

- Read-heavy → caching is essential
- 250TB over 10 years → DB sharding required
- Viral spikes → design must handle 10x peak load