Fan-Out on Read
Fan-out on read means you do nothing at post time. When a follower opens the app, you fetch the posts live at that moment. No pre-computation, no upfront writes — the feed is assembled on demand.
Why fan-out on write breaks for celebrities
Kylie Jenner has 400 million followers on Instagram. She posts a photo.
Fan-out on write would mean:
400 million DB inserts triggered instantly
→ DB gets hammered with a write spike
→ Other users' requests slow down or fail
→ System destabilised by one celebrity post
This is completely unacceptable. You can't let one user's post bring down your infrastructure.
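To put a rough number on that spike, the back-of-envelope sketch below estimates how long one post's fan-out would occupy the database. The 400 million figure is the follower count from the example above. The sustained rate of 100,000 inserts per second is purely an assumption for illustration, not a benchmark of any real system, and `fanout_seconds` is a hypothetical helper.

```python
def fanout_seconds(followers: int, inserts_per_second: int) -> float:
    """Seconds of write capacity consumed by fanning one post out to every follower."""
    return followers / inserts_per_second


# 400 million followers: the example above.
# 100,000 inserts/second: an assumed sustained write rate, not a measured one.
seconds = fanout_seconds(400_000_000, 100_000)
minutes = seconds / 60

print(f"{seconds:.0f} s of pure insert work, about {minutes:.1f} minutes")
```

Under that assumed rate, a single post would consume more than an hour of the database's entire write capacity, which is why the spike affects every other user.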
The fan-out on read approach
Do nothing at post time. Just save the post to the DB.
graph TD
K[Kylie posts] --> PDB[(Posts DB)]
B[Bob opens app] --> FS[Feed Service]
FS --> FDB[(Followers DB)]
FDB -->|who does Bob follow?| FS
FS --> PDB
PDB -->|fetch recent posts from followed accounts| FS
FS -->|assembled feed| B
Kylie posts photo
→ Save post to DB: { post_id: 999, user_id: Kylie, created_at: now }
→ Return "Posted!" to Kylie ← done, zero fan-out work
When a follower opens Instagram:
Bob opens app
→ Feed Service: "give me Bob's feed"
→ Step 1: SELECT following_id FROM followers WHERE follower_id = Bob
→ returns [Kylie, Taylor Swift, Nike, ... everyone Bob follows]
→ Step 2: SELECT * FROM posts WHERE user_id IN (Kylie, Taylor, Nike, ...)
ORDER BY created_at DESC LIMIT 20
→ Assemble feed from results
→ Return to Bob
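The two-step read above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `FOLLOWING` dict and `POSTS` list are hypothetical in-memory stand-ins for the followers and posts tables, and the sample rows are made up.

```python
from datetime import datetime

# Hypothetical in-memory stand-ins for the followers and posts tables.
FOLLOWING = {"bob": ["kylie", "taylor", "nike"]}
POSTS = [
    {"post_id": 997, "user_id": "taylor", "created_at": datetime(2024, 1, 1, 9, 0)},
    {"post_id": 998, "user_id": "dave", "created_at": datetime(2024, 1, 1, 9, 30)},
    {"post_id": 999, "user_id": "kylie", "created_at": datetime(2024, 1, 1, 10, 0)},
]


def build_feed(user_id: str, limit: int = 20) -> list[dict]:
    """Assemble a feed at read time; nothing was precomputed at post time."""
    # Step 1: who does this user follow?
    following = set(FOLLOWING.get(user_id, []))
    # Step 2: recent posts from those accounts, newest first.
    candidates = [p for p in POSTS if p["user_id"] in following]
    candidates.sort(key=lambda p: p["created_at"], reverse=True)
    return candidates[:limit]


feed = build_feed("bob")
print([p["post_id"] for p in feed])
```

Bob does not follow Dave in this sample data, so Dave's post is filtered out in Step 2 and never reaches the feed.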
sequenceDiagram
participant Bob
participant FeedService
participant FollowersDB
participant PostsDB
Bob->>FeedService: open app — give me my feed
FeedService->>FollowersDB: who does Bob follow?
FollowersDB-->>FeedService: [Kylie, Taylor, Nike ...]
FeedService->>PostsDB: fetch recent posts from those accounts
PostsDB-->>FeedService: posts sorted by created_at
FeedService-->>Bob: assembled feed
The feed is computed live every time Bob opens the app. No pre-computed feed table needed.
The trade-off
Fan-out on read shifts the cost from write time to read time.
Fan-out on write:
→ Post time: expensive (N inserts)
→ Read time: cheap (one indexed lookup on feeds table)
Fan-out on read:
→ Post time: cheap (just save the post)
→ Read time: expensive (fetch following list + fetch posts from N accounts + sort + merge)
For a celebrity with 400 million followers, the write cost at post time is unbearable. But the read cost is also a concern: if Bob follows 500 people, fetching posts from 500 accounts and merging them on every feed load is slow.
The fix for read cost is caching — cache the assembled feed for each user for a short window (30 seconds to a few minutes). Most users don't need a perfectly real-time feed.
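A short-lived per-user cache along those lines might look like the sketch below. `FeedCache` and its `assemble` callback are hypothetical names, the store is a plain in-process dict rather than a shared cache, and the clock is injectable only so the expiry behaviour can be demonstrated without sleeping.

```python
import time
from typing import Callable


class FeedCache:
    """Per-user cache of assembled feeds with a fixed time-to-live."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, list]] = {}

    def get_feed(self, user_id: str, assemble: Callable[[str], list]) -> list:
        now = self._clock()
        cached = self._entries.get(user_id)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]  # still fresh: skip the expensive assembly
        feed = assemble(user_id)  # missing or expired: rebuild and store
        self._entries[user_id] = (now, feed)
        return feed


# Demonstration with a fake clock and a fake (counting) assembly function.
calls = []

def assemble(user_id: str) -> list:
    calls.append(user_id)
    return [f"feed for {user_id} v{len(calls)}"]

fake_now = [0.0]
cache = FeedCache(ttl_seconds=30, clock=lambda: fake_now[0])

first = cache.get_feed("bob", assemble)
fake_now[0] = 10.0
second = cache.get_feed("bob", assemble)  # within 30 s: served from cache
fake_now[0] = 45.0
third = cache.get_feed("bob", assemble)   # older than 30 s: rebuilt
```

The trade is staleness for latency: within the TTL window Bob may miss a post that landed a few seconds ago, which is acceptable if the feed does not need to be perfectly real-time.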
The hybrid approach: what Instagram and Twitter actually do
Neither pure fan-out on write nor pure fan-out on read works at scale. The real answer is a hybrid:
graph TD
NU[Normal user posts] --> Q[Queue]
Q --> FS1[Feed Service]
FS1 -->|fan-out on write| FT[(Feeds Table)]
CE[Celebrity posts] --> PDB[(Posts DB)]
B[Bob opens app] --> FS2[Feed Service]
FS2 -->|pre-computed entries| FT
FS2 -->|live fetch celebrity posts| PDB
FS2 -->|merge + sort| B
Normal user posts (< ~10,000 followers)
→ Fan-out on write
→ Feed Service updates all followers' feeds async via queue
→ Fast reads, bounded write cost
Celebrity posts (> ~10,000 followers, verified accounts)
→ Fan-out on read
→ Just save the post, no fan-out
→ When follower opens app, fetch celebrity posts live and merge with pre-computed feed
The feed assembly for a regular user is:
Pre-computed feed entries (from fan-out on write for normal accounts followed)
+
Live fetched posts (from celebrity accounts followed, fan-out on read)
→ Merge and sort by timestamp
→ Return to user
This keeps the write cost bounded when a celebrity posts, while most of every feed is still served from the cheap pre-computed lookup.
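The post-time routing decision can be sketched as follows. Everything here is illustrative: `handle_post`, `posts_db`, and `feeds_table` are hypothetical names, the fan-out loop runs inline where a real system would hand it to a queue, and treating exactly 10,000 followers as a celebrity is an arbitrary choice, since the text only gives an approximate cutoff.

```python
CELEBRITY_THRESHOLD = 10_000  # approximate cutoff from the text; meant to be tunable

posts_db: list[dict] = []                # hypothetical posts table
feeds_table: dict[str, list[int]] = {}   # hypothetical per-user pre-computed feeds


def handle_post(post: dict, follower_ids: list[str]) -> str:
    """Save the post, then fan out on write only for non-celebrity authors."""
    posts_db.append(post)
    if len(follower_ids) >= CELEBRITY_THRESHOLD:
        # Celebrity: no fan-out; followers fetch this post live at read time.
        return "fan-out on read"
    # Normal user: push the post id into each follower's feed.
    # In a real system this loop would run asynchronously behind a queue.
    for follower_id in follower_ids:
        feeds_table.setdefault(follower_id, []).append(post["post_id"])
    return "fan-out on write"


normal = handle_post({"post_id": 1, "user_id": "dave"}, ["bob", "alice"])
celebrity = handle_post(
    {"post_id": 2, "user_id": "kylie"},
    [f"user{i}" for i in range(10_000)],
)
```

Both posts end up in `posts_db`, but only Dave's post touches `feeds_table`, so the work done at post time never exceeds the threshold number of inserts.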
Full end-to-end read flow (hybrid)
Bob follows 480 normal users and 20 celebrities.
Bob opens Instagram
→ Feed Service:
Step 1: Fetch Bob's pre-computed feed from feeds table
SELECT post_id FROM feeds WHERE user_id = Bob ORDER BY created_at DESC LIMIT 20
→ 20 posts from normal users, already sitting there
Step 2: Fetch Bob's celebrity following list
SELECT following_id FROM followers WHERE follower_id = Bob AND is_celebrity = true
→ [Kylie, Taylor, Nike, ...]
Step 3: Fetch recent posts from celebrities
SELECT * FROM posts WHERE user_id IN (Kylie, Taylor, Nike)
ORDER BY created_at DESC LIMIT 20
→ recent celebrity posts fetched live
Step 4: Merge both sets, sort by created_at, return top 20
The result feels instant to Bob because Step 1 is a fast pre-computed lookup, Steps 2 and 3 read cached celebrity data, and Step 4 is a small in-memory merge.
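The four steps above can be sketched end to end as below. The `FEEDS`, `FOLLOWERS`, and `POSTS` structures are hypothetical in-memory stand-ins for the three tables, the sample rows are made up, and `created_at` is a plain integer purely to keep the example short.

```python
# Hypothetical in-memory stand-ins for the feeds, followers, and posts tables.
FEEDS = {
    "bob": [  # pre-computed entries, already newest first
        {"post_id": 11, "user_id": "dave", "created_at": 1000},
        {"post_id": 10, "user_id": "charlie", "created_at": 945},
    ]
}
FOLLOWERS = [
    {"follower_id": "bob", "following_id": "kylie", "is_celebrity": True},
    {"follower_id": "bob", "following_id": "dave", "is_celebrity": False},
]
POSTS = [
    {"post_id": 12, "user_id": "kylie", "created_at": 950},
    {"post_id": 11, "user_id": "dave", "created_at": 1000},
]


def read_feed(user_id: str, limit: int = 20) -> list[dict]:
    # Step 1: pre-computed entries from the feeds table.
    precomputed = FEEDS.get(user_id, [])[:limit]
    # Step 2: which celebrities does this user follow?
    celebrities = {
        f["following_id"]
        for f in FOLLOWERS
        if f["follower_id"] == user_id and f["is_celebrity"]
    }
    # Step 3: posts from those celebrities, fetched live.
    live = [p for p in POSTS if p["user_id"] in celebrities]
    # Step 4: merge both sets, sort by created_at, return the top `limit`.
    merged = sorted(precomputed + live, key=lambda p: p["created_at"], reverse=True)
    return merged[:limit]


feed = read_feed("bob")
print([p["user_id"] for p in feed])
```

Dave's post appears once, not twice: it reaches Bob through the pre-computed feed, and Step 3 only pulls posts from accounts flagged as celebrities.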
How the two sets get merged: chronological sort
After fetching both sets, Feed Service merges them and sorts by created_at. Kylie's post gets no special treatment — it slots into wherever it belongs based on when she posted.
Pre-computed feed (normal users):    Live fetched (celebrities):
Dave     10:00                       Kylie  09:50
Charlie  09:45
Bob      09:30
Merged and sorted:
1. Dave 10:00
2. Kylie 09:50 ← slides into position 2 based on timestamp
3. Charlie 09:45
4. Bob 09:30
If Kylie posted most recently, she's at the top. If she posted 3 days ago, she's buried under everything newer.
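The example above can be reproduced with Python's standard `heapq.merge`, which combines inputs that are already sorted without re-sorting everything. This is one possible way to do the merge, not necessarily what any real feed service uses; the tuples simply mirror the names and times in the example.

```python
import heapq
from itertools import islice

# Both inputs are already sorted newest first, as they would come back
# from queries that use ORDER BY created_at DESC.
precomputed = [("Dave", "10:00"), ("Charlie", "09:45"), ("Bob", "09:30")]
live_celebrity = [("Kylie", "09:50")]

# Zero-padded HH:MM strings compare in chronological order, so they work as sort keys.
merged = heapq.merge(precomputed, live_celebrity, key=lambda p: p[1], reverse=True)

# Take only the first page; heapq.merge is lazy, so nothing beyond it is consumed.
top = list(islice(merged, 20))

for position, (author, posted_at) in enumerate(top, start=1):
    print(position, author, posted_at)
```

Because the merge is lazy and the inputs are pre-sorted, producing the top 20 only walks as far into each list as needed.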
Chronological vs ranked feed
The merge above is a chronological feed — posts ordered purely by time. Simple, predictable, what Twitter used to do.
Instagram doesn't do pure chronological anymore. They use a ranking algorithm — every candidate post gets scored by an ML model based on:
Recency → how recently was it posted?
Relationship → how often do you interact with this account?
Interest → do you usually engage with this type of content?
Post type → video vs photo vs reel
Engagement signals → how many likes/comments in the first 30 minutes?
The feed you see isn't "most recent 20 posts." It's "the 20 posts the model thinks you're most likely to engage with."
Feed ranking is a separate layer on top of the architecture. The hybrid fan-out system produces a pool of candidate posts. A ranking model then scores and reorders them. In a system design interview, describe the architecture first (fan-out on write + fan-out on read hybrid), then mention ranking as an optional layer on top if asked how Instagram decides what to show.
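To make the "separate layer" idea concrete, here is a toy ranking pass over a candidate pool. It is only a sketch: a real system would use a trained model, whereas this uses a hand-written weighted sum, and the weights, the 0 to 1 signal scales, and the `Candidate` type are all made up for illustration. Post type is left out to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    post_id: int
    recency: float       # 0..1, higher means newer
    relationship: float  # 0..1, how often the viewer interacts with the author
    interest: float      # 0..1, affinity for this kind of content
    engagement: float    # 0..1, early likes/comments, normalised


# Made-up weights for illustration only; not taken from any real ranking system.
WEIGHTS = {"recency": 0.4, "relationship": 0.3, "interest": 0.2, "engagement": 0.1}


def score(candidate: Candidate) -> float:
    """Weighted sum of the candidate's signals."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rank(candidates: list[Candidate], limit: int = 20) -> list[Candidate]:
    """Reorder the candidate pool by score, highest first."""
    return sorted(candidates, key=score, reverse=True)[:limit]


candidates = [
    Candidate(1, recency=1.0, relationship=0.1, interest=0.1, engagement=0.1),
    Candidate(2, recency=0.5, relationship=1.0, interest=1.0, engagement=1.0),
]
ranked = rank(candidates)
print([c.post_id for c in ranked])
```

Post 2 is older but outranks post 1 because the other signals outweigh recency, which is exactly how a ranked feed differs from a chronological one. The fan-out architecture underneath is unchanged; only the final ordering step differs.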
sequenceDiagram
participant Bob
participant FeedService
participant FeedsTable
participant FollowersDB
participant PostsDB
Bob->>FeedService: open app — give me my feed
FeedService->>FeedsTable: fetch pre-computed feed for Bob
FeedsTable-->>FeedService: 20 posts from normal users ✓
FeedService->>FollowersDB: who are Bob's celebrity follows?
FollowersDB-->>FeedService: [Kylie, Taylor, Nike ...]
FeedService->>PostsDB: fetch recent posts from celebrities
PostsDB-->>FeedService: celebrity posts (cached)
FeedService-->>Bob: merged + sorted top 20
When to use fan-out on read
- Accounts with very high follower counts (celebrities, brands, verified accounts)
- Systems where write cost at post time must be kept near zero
- When you can afford slightly higher read latency (offset by caching)
Interview framing: "For celebrity accounts I'd use fan-out on read — at 10 million followers, fan-out on write causes a massive DB write spike on every post. Instead I save the post and fetch it live at read time, merging with the pre-computed feed for normal accounts. I'd cache the assembled feed per user for 30-60 seconds to keep read latency acceptable."
The threshold between fan-out on write and fan-out on read is typically around 10,000 followers — but this is a tunable config, not a hard rule. In an interview, mention the threshold exists and that it's configurable based on observed write pressure.