04
Concepts
Storage & Databases
How data is stored, indexed, replicated, and queried at scale. The layer every system design eventually has to justify.
Fundamentals
Why not plain files? How databases actually store data: row vs column orientation, heap files, and the storage engine basics that explain everything built on top of them.
Row Storage · Column Storage · Heap Files · Storage Engine
ACID
Atomicity, consistency, isolation, durability. Transaction isolation levels, distributed transactions, 2PC, and Saga: everything about keeping data correct under concurrency and partial failure.
Isolation Levels · 2PC · Saga · ACID vs BASE
SQL
Relational model, normalisation, denormalisation tradeoffs, joins, and views. The fundamentals before you decide SQL isn't enough.
Normalisation · Denormalisation · Joins · Views
Indexing
Hash indexes, B+ Trees, LSM Trees, and geospatial indexing. The data structures that make reads fast and writes expensive — and when each one wins.
B+ Tree · LSM Tree · Hash Index · Geospatial
Replication
Sync vs async replication, replication lag, failover, and multi-primary. How you keep multiple copies of data consistent without destroying write throughput.
Sync vs Async · Replication Lag · Failover · Multi-Primary
Sharding
Shard keys, sharding strategies, consistent hashing, cross-shard joins, resharding, and over-sharding. How you split a database that's too big for one node.
Shard Key · Consistent Hashing · Resharding · Cross-Shard Joins
MVCC
Multi-Version Concurrency Control. How databases serve reads without blocking writes by keeping multiple versions of the same row alive simultaneously.
MVCC · Snapshot Isolation · Version Chains · Garbage Collection
CDC
Change Data Capture and the outbox pattern. How you stream database changes to downstream systems without dual-write bugs.
CDC · Outbox Pattern · Debezium · Log Tailing
Pagination
Offset vs cursor pagination. Why offset pagination slows down on deep pages, where the database must read and discard every skipped row, and how cursor-based (keyset) pagination avoids that by seeking straight to the next page through an index.
Offset Pagination · Cursor Pagination · Keyset Pagination · Deep Pages
Connection Pooling
The real cost of a database connection and how connection pools amortize it. What happens when the pool is exhausted and how to size it correctly.
Connection Cost · Pool Sizing · Pool Exhaustion · PgBouncer
Read / Write Splitting
Routing reads to replicas and writes to the primary. The replication lag problem that follows, and when read/write splitting actually helps and when it hurts.
Read Replicas · Replication Lag · Stale Reads · Routing
Database Types
Key-value, document, column-family, search engines, graph, blob storage, NewSQL, OLTP vs OLAP. Every database type, when it wins, and when it doesn't.
Redis · Cassandra · MongoDB · Elasticsearch
Choosing the Right DB
A decision framework and cheatsheet for picking the right database given your access patterns, consistency needs, and scale requirements.
Decision Framework · Access Patterns · DB Cheatsheet · Tradeoffs
Data Modeling
Entities, relationships, access patterns, and red flags. How to model data for a real system — with a worked Instagram schema as the example.
ER Modeling · Access Patterns · Instagram Schema · Red Flags