10
Concepts
Distributed Systems
The failure modes, trade-offs, and coordination problems that emerge the moment you have more than one node.
The Problem
Why distributed systems are fundamentally harder than single-node ones — partial failures, network uncertainty, and no shared clock.
Partial Failures, Network Uncertainty, No Global Clock, Fallacies
Network Partitions
What happens when nodes can't talk. Split brain, quorum decisions, and why partition handling defines your entire consistency strategy.
Split Brain, Quorum, Fencing, Partition Recovery
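As a rough illustration of the quorum idea named above, the sketch below checks two standard conditions: a majority quorum, and read/write quorum overlap (R + W > N). The function names `majority` and `quorums_overlap` are hypothetical and chosen only for this example.

```python
def majority(n: int) -> int:
    """Smallest group size that is a strict majority of n nodes."""
    return n // 2 + 1


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum of size r shares a node with every write quorum of size w."""
    return r + w > n


# A 5-node cluster split 3/2: only the 3-node side can still form a majority,
# so at most one side keeps accepting writes and split brain is avoided.
cluster_size = 5
sides = [3, 2]
can_proceed = [size >= majority(cluster_size) for size in sides]
```

Because two majorities of the same cluster always share at least one node, two sides of a partition can never both hold one.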
CAP Theorem
Consistency vs availability when a partition happens. CP vs AP — and why every distributed system is forced to choose during a split.
CAP, CP Systems, AP Systems, Partition Tolerance
PACELC
Extends CAP to cover normal operation. Even without a partition, you still trade latency against consistency — PACELC names that trade-off.
PACELC, Latency vs Consistency, PA/EL, PC/EC
Consistency Models
Strong, eventual, causal, and monotonic. What "up to date" means in a distributed system and when each model is the right call.
Strong Consistency, Eventual, Causal, Monotonic
Replication
Sync vs async replication, replication lag, failover, and multi-primary — how data stays consistent across nodes.
Sync vs Async, Replication Lag, Failover, Multi-Primary
Sharding
Shard keys, consistent hashing, cross-shard joins, resharding, and the over-sharding trap that kills performance.
Shard Key, Consistent Hashing, Resharding, Cross-Shard Joins
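To make consistent hashing concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the helper `ring_hash`, the node names, and the choice of SHA-256 with 64 virtual nodes are all assumptions made for illustration, not a reference to any particular system.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of SHA-256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node is placed at several positions to smooth out the load.
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = node

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # The owner is the first position clockwise from the key's hash.
        i = bisect.bisect_right(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]


keys = [f"user:{i}" for i in range(1000)]
ring = HashRing(["node-a", "node-b", "node-c"])
before = {k: ring.lookup(k) for k in keys}
ring.add("node-d")
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch, adding `node-d` only moves keys onto `node-d`; keys never shuffle between the existing nodes, which is the property that makes resharding cheaper than with modulo hashing.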
Consensus
Raft leader election, log replication, Paxos phases, ZooKeeper, and Redis distributed locks — how nodes agree on a single value.
Raft, Paxos, Leader Election, ZooKeeper
Distributed Clocks
Clock drift, NTP, Lamport clocks, vector clocks, and TrueTime — how distributed systems reason about time without a shared clock.
Lamport Clocks, Vector Clocks, NTP, TrueTime
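A Lamport clock, one of the techniques listed above, can be sketched in a few lines. The class and method names here are hypothetical; the rules themselves (increment on every local event, take the maximum plus one on receive) are the standard ones.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is an event; the returned value is attached to the message."""
        return self.tick()

    def receive(self, remote_time: int) -> int:
        """Jump past the sender's timestamp so the receive is ordered after the send."""
        self.time = max(self.time, remote_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                       # local event on a: a.time is now 1
stamp = a.send()               # a.time is now 2, and the message carries 2
received_at = b.receive(stamp) # b.time becomes max(0, 2) + 1
```

The timestamps only guarantee that a cause is numbered lower than its effect; two unrelated events can still get timestamps in either order.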
CRDTs
Conflict-free replicated data types — the math that lets replicas merge without coordination, and why locks fail at global scale.
G-Counter, Operational Transform, OT vs CRDT, Merge Semantics
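The G-Counter is probably the simplest CRDT to show in code. The sketch below assumes a hypothetical `GCounter` class with string replica IDs; each replica only increments its own slot, and merging takes the per-replica maximum.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged with element-wise max."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # max is commutative, associative, and idempotent, so merge order does not matter.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)


a, b = GCounter("a"), GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
b.merge(a)  # merging the same state again changes nothing
```

After the merges both replicas report the same total without ever having taken a lock or contacted a coordinator.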
Failure Detection
Heartbeats, gossip protocol, and the phi accrual failure detector — how nodes know when other nodes are dead.
Heartbeats, Gossip Protocol, Phi Accrual, Timeouts
Merkle Trees
Hash trees for efficient data comparison across replicas. Anti-entropy repair and how Dynamo-style systems detect divergence.
Hash Trees, Anti-Entropy, Replica Sync, Buckets
Coordination Services
etcd, TTL-based leases, fencing tokens, and the difference between distributed locks and job tracking.
etcd, Leases, Fencing Tokens, Lock vs Job Tracking
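Fencing tokens are easiest to see from the storage side. The sketch below assumes a hypothetical `FencedStore` that remembers the highest token it has seen and rejects writes carrying an older one; the token values stand in for numbers a lock service would hand out in increasing order with each lease.

```python
class FencedStore:
    """Storage that refuses writes from holders of an out-of-date lock."""

    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token: int, key: str, value: str) -> bool:
        # A token lower than one already seen means the writer's lease has been superseded.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.data[key] = value
        return True


store = FencedStore()
first = store.write(33, "config", "from client 1")   # client 1 holds the lock
second = store.write(34, "config", "from client 2")  # lease expired, client 2 took over
stale = store.write(33, "config", "late write")      # client 1 wakes up after a pause
```

The lease expiry alone cannot stop the paused client from writing; the check in the store is what keeps its late write out.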