
What Is Sharding

You've added read replicas. Reads are fine. But writes are still bottlenecked and your disk is nearly full. What do you do?


The problem replication doesn't solve#

You're running Twitter. 1 billion users. You've already added replicas to handle read traffic. But the primary is still struggling.

Replication copies the same data to multiple servers — every server still holds all the data:

Replication:
  Primary  → all 1 billion rows
  Replica1 → all 1 billion rows  (copy)
  Replica2 → all 1 billion rows  (copy)

Reads distributed ✓
Writes? Still all go to one primary ✗
Storage? Every server stores everything ✗

You're distributing reads, not the data itself. The primary still handles every write, and every server still needs enough disk for the entire dataset.

The fix is sharding — split the data itself across multiple servers. Each server holds only a fraction.

Sharding:
  Shard 1 → users 1 – 250M
  Shard 2 → users 250M+1 – 500M
  Shard 3 → users 500M+1 – 750M
  Shard 4 → users 750M+1 – 1B

Each server now holds 250 million rows instead of 1 billion. Writes are distributed too — writes for user_id 1 go to Shard 1, writes for user_id 600M go to Shard 3.
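That routing logic can be sketched in a few lines. A minimal example of range-based shard lookup, assuming the four ranges above (the shard names and boundaries are illustrative, not a real deployment):

```python
# Range-based shard routing: each shard owns a contiguous slice of user IDs.
SHARD_RANGES = [
    ("shard1",             1,   250_000_000),
    ("shard2",   250_000_001,   500_000_000),
    ("shard3",   500_000_001,   750_000_000),
    ("shard4",   750_000_001, 1_000_000_000),
]

def shard_for(user_id: int) -> str:
    """Return the name of the shard that owns this user_id."""
    for name, lo, hi in SHARD_RANGES:
        if lo <= user_id <= hi:
            return name
    raise ValueError(f"user_id {user_id} is outside every shard's range")

# Writes for different users land on different servers.
print(shard_for(1))            # shard1
print(shard_for(600_000_000))  # shard3
```

Because every writer computes the same answer from the same ranges, there's never a question of which server a given row lives on.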

Sharding

Horizontally partitioning data across multiple servers so each server holds a subset of rows. Unlike replication, each shard has unique data — no two shards hold the same rows.


Sharding is basically multiple primaries — with one key difference#

Each shard is its own independent primary accepting writes for its slice of data. So yes — sharding is multiple primaries at the end of the day.

But there's a critical difference from what we discussed in multi-primary replication:

In multi-primary, multiple nodes accept writes for the same data — that's where conflicts happen. User A updates the same row on Node 1 and Node 2 simultaneously, and the system has to figure out which write wins.

In sharding, each node owns a completely different slice of data — there's no overlap. Shard 1 owns users 1–250M, Shard 2 owns users 250M+1–500M. They will never get a conflicting write for the same row because they never share rows.

Multi-primary:
  Node A and Node B both own user_id 423 → conflict possible ✗

Sharding:
  Shard 1 owns user_id 423
  Shard 2 owns user_id 600,000,000
  → they never touch each other's data → no conflicts ✓

Sharding gives you the write scalability of multi-primary without the conflict problem — because the data is partitioned, not replicated.

And in practice, each shard usually also has its own replicas for read scalability and fault tolerance:

Shard 1 Primary → Shard 1 Replica 1
                → Shard 1 Replica 2
Shard 2 Primary → Shard 2 Replica 1
                → Shard 2 Replica 2

Sharding for write scalability. Replication within each shard for read scalability and availability. The two work together.


Vertical Partitioning — splitting by column#

Everything above is horizontal sharding — splitting rows across servers. But there's another axis: splitting by column.

Say your users table has 50 columns — name, email, bio, profile picture URL, privacy settings, notification preferences, last_login, follower_count, and so on. Most queries only need 2 or 3 of those. But in a row-oriented database, every query for a row reads the entire row from disk — all 50 columns — even if you only asked for name and email.

The fix is vertical partitioning — split the wide table into narrower tables by access pattern:

users_core      → user_id, username, email          (every query needs this)
users_profile   → user_id, bio, avatar_url           (profile page only)
users_settings  → user_id, privacy, notifications    (settings page only)

Now a feed query that only needs username reads only users_core. The bio and settings columns are never read from disk at all. Reads get faster, I/O drops.

Vertical partitioning

Splitting a wide table into narrower tables by column access pattern. Each table holds only the columns accessed together, and all the tables share the same primary key.

The trade-off is that queries needing data from multiple partitions now require a JOIN. But that's usually a better problem to have than loading 50 columns on every single read.
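A toy illustration of both sides of that trade-off, using an in-memory SQLite database. The table and column names follow the split above; this is a sketch, not a schema recommendation:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The wide users table split into narrow tables sharing the same primary key.
conn.executescript("""
    CREATE TABLE users_core    (user_id INTEGER PRIMARY KEY, username TEXT, email TEXT);
    CREATE TABLE users_profile (user_id INTEGER PRIMARY KEY, bio TEXT, avatar_url TEXT);
""")
conn.execute("INSERT INTO users_core VALUES (1, 'alice', 'alice@example.com')")
conn.execute("INSERT INTO users_profile VALUES (1, 'Hi, I am Alice', 'https://example.com/a.png')")

# The common case: a feed query touches only the narrow users_core table.
(username,) = conn.execute(
    "SELECT username FROM users_core WHERE user_id = 1"
).fetchone()

# The trade-off: a profile page needs columns from both partitions, so it JOINs.
row = conn.execute("""
    SELECT c.username, p.bio
    FROM users_core c JOIN users_profile p USING (user_id)
    WHERE c.user_id = 1
""").fetchone()

print(username)  # alice
print(row)       # ('alice', 'Hi, I am Alice')
```

The narrow-table read is the hot path; the JOIN only runs on the pages that genuinely need the extra columns.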


Horizontal vs Vertical — when to use each#

Horizontal sharding    → too many rows, DB can't hold them all or write fast enough
                         split: rows go to different servers

Vertical partitioning  → too many columns, queries load more data than they need
                         split: columns go to different tables on same or different servers

In practice, most large systems use both — vertical partitioning to keep tables lean, horizontal sharding to distribute row volume across servers.