User Data DB
What Is User Data#
Content metadata describes movies and shows. User data describes everything tied to a specific person: what they watched, how far they got, and what they rated. This is a completely different access pattern and a completely different DB problem.
The two main categories:
Watch history → which movies/episodes this user has watched or started
Resume position → exactly where in a video the user stopped watching
Watch History — Access Patterns#
When a user logs in, Netflix needs to:
- Show recently watched titles on the homepage
- Hide already-watched content from recommendations
- Show ratings the user previously gave
Every query is user-first: "give me everything for this user." The relationship is simple — one user to many movies consumed. A single join table handles it:
user_watch_history
──────────────────────────────────────────────────
user_id → 8 bytes
movie_id → 8 bytes
rating → 4 bytes
metadata → 50 bytes (watched_at, completed, etc.)
resume_ts → 8 bytes
──────────────────────────────────────────────────
total ≈ 100 bytes per row
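As a concrete sketch, the table above could be declared roughly like this in PostgreSQL. The DDL, the connection string, and the homepage query are illustrative (issued here through the psycopg2 driver), not Netflix's actual schema:

```python
import psycopg2

# Hypothetical schema for the watch-history join table sketched above.
DDL = """
CREATE TABLE IF NOT EXISTS user_watch_history (
    user_id    BIGINT      NOT NULL,              -- 8 bytes
    movie_id   BIGINT      NOT NULL,              -- 8 bytes
    rating     INTEGER,                           -- 4 bytes
    watched_at TIMESTAMPTZ NOT NULL,              -- part of the ~50-byte metadata
    completed  BOOLEAN     NOT NULL DEFAULT FALSE,
    resume_ts  BIGINT,                            -- 8 bytes
    PRIMARY KEY (user_id, movie_id)               -- every query starts with user_id
);
"""

# Homepage query: everything is user-first, so the leading user_id column
# of the primary key covers the lookup.
RECENTLY_WATCHED = """
SELECT movie_id, rating, watched_at, resume_ts
FROM user_watch_history
WHERE user_id = %s
ORDER BY watched_at DESC
LIMIT 20;
"""

conn = psycopg2.connect("dbname=userdata")  # connection string is illustrative
with conn, conn.cursor() as cur:
    cur.execute(DDL)
    cur.execute(RECENTLY_WATCHED, (42,))
    rows = cur.fetchall()
```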
Sizing Watch History — Do We Need Sharding?#
Netflix has 300M MAU. Assume 500M total registered users. Users watch roughly 10 episodes or movies per month — about 0.33 per day. Not all registered users are active daily — assume 150M DAU.
Daily writes = 150M users × 0.33 records/day = ~50M records/day
Record size = 100 bytes
Daily storage = 50M × 100B = 5,000MB = 5 GB/day
180 days = 5GB × 180 = 900 GB
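The same back-of-envelope numbers as a quick sanity check; the DAU, per-user rate, and record size are the assumptions stated above:

```python
DAU = 150_000_000                 # assumed daily active users
RECORDS_PER_USER_DAY = 10 / 30    # ~10 titles per month ≈ 0.33/day
RECORD_SIZE_BYTES = 100
HOT_TIER_DAYS = 180

daily_writes = DAU * RECORDS_PER_USER_DAY                    # ~50M records/day
writes_per_second = daily_writes / 86_400                    # ~578 writes/s on average
daily_storage_gb = daily_writes * RECORD_SIZE_BYTES / 1e9    # ~5 GB/day
hot_tier_gb = daily_storage_gb * HOT_TIER_DAYS               # ~900 GB

print(f"{daily_writes:,.0f} writes/day, {writes_per_second:,.1f} writes/s")
print(f"{daily_storage_gb:.1f} GB/day, {hot_tier_gb:.0f} GB in the 180-day hot tier")
```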
After 180 days, older watch history matters less for active recommendations — a movie a user watched 2 years ago has less signal than what they watched last week. So data older than 180 days moves to cold storage.
Hot tier → PostgreSQL → last 180 days → ~900 GB → fast queries, recent history
Cold tier → S3 → older data → cheap object storage, batch ML processing
900 GB is large but not unmanageable for PostgreSQL with a read replica. Sharding is not needed.
Why keep cold data at all
Old watch history is not useful for real-time queries but is extremely valuable for Netflix's recommendation ML models — which train on years of viewing behaviour. S3 is cheap enough to store it indefinitely for batch processing without it ever touching the live DB.
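One way the 180-day cutoff could be enforced is a periodic job that exports old rows to S3 and trims the hot tier. A minimal sketch, assuming psycopg2 and boto3; every name in it is illustrative:

```python
import csv, io
import boto3
import psycopg2

HOT_TIER_DSN = "dbname=userdata"          # illustrative connection string
ARCHIVE_BUCKET = "watch-history-archive"  # illustrative bucket name

def archive_old_watch_history():
    """Copy rows older than 180 days to S3, then delete them from PostgreSQL."""
    conn = psycopg2.connect(HOT_TIER_DSN)
    s3 = boto3.client("s3")
    with conn, conn.cursor() as cur:
        cur.execute("""
            SELECT user_id, movie_id, rating, watched_at, completed, resume_ts
            FROM user_watch_history
            WHERE watched_at < now() - interval '180 days'
        """)
        buf = io.StringIO()
        csv.writer(buf).writerows(cur.fetchall())
        key = "watch-history/archive.csv"   # in practice, partitioned by date
        s3.put_object(Bucket=ARCHIVE_BUCKET, Key=key, Body=buf.getvalue())
        cur.execute("""
            DELETE FROM user_watch_history
            WHERE watched_at < now() - interval '180 days'
        """)
```

A production job would stream in batches and partition archive files by date, but the shape is the same: old rows leave the live database without being lost to the ML pipeline.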
Write Volume for Watch History#
50M records/day spread over 86,400 seconds averages roughly 578 writes/second, well within PostgreSQL's comfortable limit of ~10,000 writes/second on a single node. No problem here.
PostgreSQL handles watch history.
Resume Timestamps — A Completely Different Problem#
Here is where the analysis changes entirely.
Every time a user is actively watching a video, Netflix must continuously record where they are in it. If the user closes the app mid-episode and reopens it later, playback resumes from the exact point where they left off. This means Netflix writes the user's current playback position every few seconds, not once per viewing session but continuously throughout it.
The write happens once per chunk consumed. Each chunk is 4 seconds. So every active user generates one write every 4 seconds.
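What that per-chunk write might look like from the client's side; the endpoint, payload shape, and player object are hypothetical:

```python
import time
import requests

CHUNK_SECONDS = 4
PROGRESS_ENDPOINT = "https://api.example.com/v1/resume-position"  # hypothetical

def report_playback(user_id: int, movie_id: int, player) -> None:
    """Send the current playback position once per consumed chunk."""
    while player.is_playing():
        requests.post(PROGRESS_ENDPOINT, json={
            "user_id": user_id,
            "movie_id": movie_id,
            "position_seconds": player.current_position(),
        }, timeout=2)
        time.sleep(CHUNK_SECONDS)   # one write every 4 seconds per active viewer
```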
Sizing Resume Writes — Where PostgreSQL Breaks#
Netflix has 150M DAU. Apply the 80-20 rule — 20% are streaming at peak hour:
Peak concurrent streamers = 150M × 20% = 30M users
Each user writes every 4 seconds
Writes/second = 30M ÷ 4 = 7,500,000 writes/second
7.5 million writes per second. PostgreSQL handles ~10,000 writes/second on a single node. Even with aggressive sharding, you would need on the order of 750 PostgreSQL nodes just to absorb this load, which is operationally nightmarish.
This is not a storage problem. 900 GB of watch history was fine on PostgreSQL. This is a write throughput problem — a fundamentally different constraint that requires a fundamentally different tool.
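Putting the peak arithmetic in one place, including the ratio against a single PostgreSQL node (the 20% peak-concurrency share is the assumption above):

```python
DAU = 150_000_000
PEAK_CONCURRENCY = 0.20          # assumed share of DAU streaming at peak
CHUNK_SECONDS = 4
PG_SINGLE_NODE_WRITES = 10_000   # rough comfortable limit cited above

peak_streamers = DAU * PEAK_CONCURRENCY                       # 30M concurrent viewers
resume_writes_per_second = peak_streamers / CHUNK_SECONDS     # 7.5M writes/s
overload_factor = resume_writes_per_second / PG_SINGLE_NODE_WRITES  # ~750x

print(f"{resume_writes_per_second:,.0f} writes/s, {overload_factor:.0f}x a single node")
```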
Why PostgreSQL fails for resume timestamps
It is not about data size — the resume record itself is tiny (user_id + movie_id + timestamp = ~24 bytes). It is about write frequency. 7.5M writes/second is 750x PostgreSQL's single-node limit. No amount of read replicas helps — read replicas only scale reads, not writes.
Cassandra for Resume Timestamps#
Cassandra is built for exactly this workload — high write throughput, distributed by design, no joins needed. It handles millions of writes per second across a cluster of nodes with linear horizontal scaling.
The access pattern for resume timestamps is simple:
Write → user watches chunk → store (user_id, movie_id, timestamp)
Read → user opens app → fetch timestamp WHERE user_id = X AND movie_id = Y
No joins. No complex queries. Just write fast and read by key.
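With the DataStax Python driver, the two operations could look something like this. The keyspace and table names are illustrative, and the table's key design is worked out in the next section:

```python
from cassandra.cluster import Cluster

# Keyspace, host, and table names are illustrative; see the schema sketch below.
session = Cluster(["cassandra-node-1"]).connect("user_data")

write_stmt = session.prepare(
    "INSERT INTO resume_position (user_id, movie_id, position_seconds) VALUES (?, ?, ?)"
)
read_stmt = session.prepare(
    "SELECT position_seconds FROM resume_position WHERE user_id = ? AND movie_id = ?"
)

# Write path: one insert per consumed chunk (an upsert in Cassandra terms).
session.execute(write_stmt, (42, 1001, 1337))

# Read path: one lookup when the user reopens the title.
row = session.execute(read_stmt, (42, 1001)).one()
resume_from = row.position_seconds if row else 0
```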
Partition Key and Clustering Key Design#
Cassandra distributes data across nodes using a partition key. The choice of partition key determines which rows land on which node — and a bad choice creates hot partitions where one node handles a disproportionate share of traffic.
Option 1 — movie_id as partition key:
All resume timestamps for Inception land on the same partition. Inception has millions of viewers. One node absorbs all writes for every popular title simultaneously.
movie_id = Inception → millions of users → one hot partition
→ one node overwhelmed on every popular release
Option 2 — user_id as partition key:
All resume timestamps for one user land on the same partition. One user watches a handful of movies at a time — maybe 5-10 active titles. The partition stays small. Consistent hashing distributes 500M users evenly across all nodes.
user_id = any user → handful of movies → small, evenly distributed partition
→ no hot partition regardless of what title is popular
user_id is the correct partition key. movie_id becomes the clustering key — it sorts rows within the partition and enables the query "for this user, give me the resume position for this specific movie."
Partition Key → user_id (distributes load evenly across nodes)
Clustering Key → movie_id (enables per-movie lookup within the partition)
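Put together, the table definition might look like this (CQL issued through the same Python driver; the names remain illustrative):

```python
from cassandra.cluster import Cluster

session = Cluster(["cassandra-node-1"]).connect("user_data")  # illustrative

# PRIMARY KEY ((user_id), movie_id): user_id is the partition key that decides
# which node owns the row; movie_id is the clustering key that orders rows
# within that user's partition and enables the per-movie lookup.
session.execute("""
    CREATE TABLE IF NOT EXISTS resume_position (
        user_id          bigint,
        movie_id         bigint,
        position_seconds bigint,
        PRIMARY KEY ((user_id), movie_id)
    )
""")
```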
Why hot partition matters
In Cassandra, a partition lives on one node. If that partition is hot, receiving millions of writes per second, that single node becomes the bottleneck regardless of how many other nodes are in the cluster. Choosing user_id as the partition key ensures that the load of a popular title (a Squid Game season drop with tens of millions of concurrent viewers) is spread across tens of millions of individual user partitions on many different nodes, not concentrated on a single one.
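A toy illustration of the difference, using Python's built-in hash over a small imaginary cluster rather than Cassandra's actual consistent hashing; the title ID and numbers are made up:

```python
import random
from collections import Counter

NODES = 12
SQUID_GAME_ID = 90210                                    # hypothetical movie_id
viewers = random.sample(range(500_000_000), 1_000_000)   # 1M concurrent viewers of one title

# Partition by movie_id: every write for this title maps to the same node.
by_movie = Counter(hash(SQUID_GAME_ID) % NODES for _ in viewers)

# Partition by user_id: the same writes spread across the whole cluster.
by_user = Counter(hash(user_id) % NODES for user_id in viewers)

print("by movie_id:", by_movie.most_common(3))   # one node takes all 1M writes
print("by user_id: ", by_user.most_common(3))    # roughly 1M / 12 per node
```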
Final DB Summary for User Data#
Watch history → PostgreSQL
→ 900 GB / 180 days hot tier
→ S3 cold tier beyond 180 days
→ 578 writes/second — within PostgreSQL limits
Resume timestamps → Cassandra
→ 7.5M writes/second at peak — requires Cassandra
→ Partition key: user_id
→ Clustering key: movie_id
The key insight: watch history and resume timestamps look similar on the surface, since both are user-to-movie records. But resume timestamps arrive at roughly 13,000x the write rate of watch history (7.5M vs ~578 writes/second). That single difference changes the entire DB choice.