06
Concepts
Distributed Systems
What changes when your system runs on more than one machine. The problems, the algorithms, and the guarantees that make distributed computing tractable.
Why Distributed Systems Are Hard
The Two Generals Problem and network partitions. The fundamental uncertainty that makes distributed computing different from everything that came before.
Two Generals · Network Uncertainty · Partial Failure · Split Brain
Consistent Hashing
How to distribute data across nodes so that adding or removing a node moves only a small fraction of keys (roughly 1/N with N nodes). The algorithm behind Cassandra, DynamoDB, and many CDNs.
Hash Ring · Virtual Nodes · Minimal Disruption · Hotspots
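To make the hash ring with virtual nodes concrete, here is a minimal Python sketch. The `HashRing` class, the `ring_hash` helper, the MD5-based hashing, and the default of 100 virtual nodes per server are all illustrative assumptions, not how any particular database implements it.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes   # virtual nodes per physical node (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node owns many small arcs of the ring, which evens out load.
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # Walk clockwise to the first ring position after the key's hash.
        i = bisect.bisect(self._points, ring_hash(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.lookup(k) for k in keys}

ring.add("node-d")
moved = sum(1 for k in keys if ring.lookup(k) != before[k])
print(f"{moved / len(keys):.0%} of keys moved after adding a fourth node")
```

Every key that moves lands on the new node, and the rest stay where they were. With a plain `hash(key) % N` scheme, changing N would instead remap most keys.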
Replication Strategies
Leader-follower, multi-leader, leaderless. The tradeoffs between write availability, consistency, and the complexity of conflict resolution.
Leader-Follower · Multi-Leader · Leaderless · Quorum
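The quorum idea in leaderless replication can be sketched in a few lines of Python. With N replicas, W write acknowledgements, and R read responses, every read set overlaps every write set when W + R > N. The `Versioned` record, the integer version field, and both function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    version: int  # assumed: a monotonically increasing write version
    value: str


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True when any R replicas must include one of the W that took the write."""
    return w + r > n


def quorum_read(responses, r):
    """Return the newest value among at least R replica responses."""
    if len(responses) < r:
        raise RuntimeError("not enough replicas answered")
    return max(responses, key=lambda v: v.version)


# N=3, W=2: the latest write reached two replicas, the third is stale.
replicas = [Versioned(2, "new"), Versioned(2, "new"), Versioned(1, "old")]

# Reading from any two replicas still sees the latest write.
latest = quorum_read(replicas[1:], r=2)
print(latest.value)
```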
Idempotency
Making operations safe to retry. The pattern that turns at-least-once delivery into exactly-once behavior without a coordination service.
Idempotency Keys · Safe Retries · Deduplication · At-Least-Once
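A small Python sketch of idempotency keys follows. The `PaymentService` class and its in-memory `_seen` dictionary are hypothetical. A real service would persist the key and the result atomically with the side effect, which this sketch does not attempt.

```python
class PaymentService:
    def __init__(self):
        self.balance = 0
        self._seen = {}  # idempotency key -> stored result (in-memory, assumed)

    def charge(self, idempotency_key: str, amount: int) -> dict:
        # A retry with a known key returns the original result
        # instead of applying the charge a second time.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        self.balance += amount
        result = {"status": "ok", "balance": self.balance}
        self._seen[idempotency_key] = result
        return result


svc = PaymentService()
first = svc.charge("order-42", 50)
retry = svc.charge("order-42", 50)  # e.g. the client timed out and retried
print(svc.balance)
```

The client generates the key once per logical operation and reuses it on every retry, so at-least-once delivery of the request has the same effect as a single delivery.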
Delivery Guarantees
At-most-once, at-least-once, and exactly-once delivery, and how these guarantees compose across network hops and node failures.
At-Most-Once · At-Least-Once · Exactly-Once · ACK Semantics
Distributed Transactions
The problem of atomic commits across multiple nodes. 2PC, Saga choreography, Saga orchestration — and when each one is the right answer.
2PC · Saga · Choreography · Orchestration
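Saga orchestration can be illustrated with a short Python sketch: each step pairs an action with a compensation, and a failure triggers the compensations of the completed steps in reverse order. The `run_saga` function and the order-processing step names are invented for this example.

```python
def run_saga(steps):
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            # Roll back in reverse order. Compensations are assumed to succeed.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True


log = []


def fail_shipping():
    raise RuntimeError("carrier unavailable")


ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (fail_shipping, lambda: log.append("cancel shipment")),
])
print(ok, log)
```

Unlike 2PC, nothing is held locked while the saga runs. Intermediate states are visible to other transactions, and consistency is restored by compensation rather than by an atomic commit.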
Consensus
Raft, Paxos, and ZooKeeper's Zab protocol. How distributed nodes agree on a single value despite failures, and the algorithms that underpin leader election and log replication.
Raft · Paxos · Leader Election · Log Replication
Distributed Clocks
Clock drift, NTP, Lamport clocks, vector clocks, and TrueTime. Why wall clocks can't reliably order events in a distributed system, and what to use instead.
Clock Drift · Lamport Clocks · Vector Clocks · TrueTime
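A Lamport clock is small enough to sketch directly. The Python class below is a minimal illustration, and its method names (`tick`, `send`, `receive`) are assumptions. It follows the usual rules: increment on every local event, attach the counter to outgoing messages, and on receipt jump to one past the larger of the local and received values.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Local event: advance the logical clock by one."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending counts as an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Move past both our own history and the sender's."""
        self.time = max(self.time, message_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()                     # local event on A
stamp = a.send()             # A sends a message stamped with its clock
b.tick()                     # unrelated local event on B
received = b.receive(stamp)  # B's clock jumps past the sender's stamp
print(stamp, received)
```

If one event causally precedes another, its timestamp is smaller. The converse does not hold, which is the gap that vector clocks close.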
CRDTs
Conflict-free Replicated Data Types. Data structures that merge concurrent updates automatically — no locks, no consensus, no conflicts.
G-Counter · Operational Transform · OT vs CRDT · Convergence
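The G-Counter is probably the simplest CRDT to show in code. In this Python sketch (class and method names are illustrative), each node increments only its own slot, and merging takes the per-node maximum, so merges are commutative, associative, and idempotent.

```python
class GCounter:
    """Grow-only counter: one monotonically increasing slot per node."""

    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max: applying the same merge twice changes nothing.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(3)   # concurrent updates on two replicas
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Both replicas converge to the same total regardless of the order or number of merges, without any coordination between them.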
Failure Detection
Heartbeats, gossip protocol, and the Phi Accrual Failure Detector. How nodes decide a peer is dead without being certain — and the cost of getting it wrong.
Heartbeats · Gossip Protocol · Phi Accrual · False Positives
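The sketch below shows the idea behind an accrual failure detector in Python. Instead of a yes/no verdict it outputs a suspicion level, phi, that grows the longer a heartbeat is overdue. It assumes exponentially distributed heartbeat intervals, which is a simplification. The originally published detector models intervals with a normal distribution. The class name and window size are also assumptions.

```python
import math


class PhiAccrualDetector:
    def __init__(self, window: int = 100):
        self.window = window   # number of recent intervals to keep (assumed)
        self.intervals = []
        self.last = None

    def heartbeat(self, now: float) -> None:
        if self.last is not None:
            self.intervals.append(now - self.last)
            self.intervals = self.intervals[-self.window:]
        self.last = now

    def phi(self, now: float) -> float:
        """-log10 of the probability that a heartbeat is merely late."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last
        # Exponential model: P(later than elapsed) = exp(-elapsed / mean).
        return elapsed / (mean * math.log(10))


detector = PhiAccrualDetector()
for t in (0.0, 1.0, 2.0, 3.0):     # heartbeats arriving once per second
    detector.heartbeat(t)

slightly_late = detector.phi(3.5)  # half a second overdue: low suspicion
long_silent = detector.phi(23.0)   # twenty seconds of silence: high suspicion
```

The caller picks a threshold on phi. A lower threshold detects failures faster but raises the false-positive rate, which is the tradeoff the blurb above refers to.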
Merkle Trees
Hash trees for detecting data inconsistency across replicas. How Cassandra and Dynamo-style stores use Merkle trees to run anti-entropy repair efficiently.
Hash Tree · Anti-Entropy · Replica Sync · Bucket Hashing
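The following Python sketch shows how two replicas might locate divergent buckets by comparing Merkle trees. The function names and the bucket contents are made up, and the code assumes a power-of-two number of buckets to keep the tree perfectly balanced.

```python
import hashlib


def sha(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(buckets):
    """Return tree levels, leaf hashes first and the root last.

    Assumes len(buckets) is a power of two.
    """
    levels = [[sha(b) for b in buckets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sha(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def differing_buckets(tree_a, tree_b):
    """Descend only into subtrees whose hashes disagree."""
    mismatched = [0] if tree_a[-1][0] != tree_b[-1][0] else []
    for depth in range(len(tree_a) - 2, -1, -1):
        mismatched = [
            child
            for parent in mismatched
            for child in (2 * parent, 2 * parent + 1)
            if tree_a[depth][child] != tree_b[depth][child]
        ]
    return mismatched


replica_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_b = [b"k1=v1", b"k2=v2", b"k3=STALE", b"k4=v4"]

stale = differing_buckets(build_tree(replica_a), build_tree(replica_b))
print(stale)
```

When the roots match, the replicas agree and nothing else is exchanged. When they differ, only the hashes along the mismatched paths need to be compared, so repair traffic scales with the amount of divergence rather than the size of the data set.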
Coordination Services
etcd, leases, TTL, fencing tokens, lock vs job tracking. The primitives that let distributed services elect leaders and claim exclusive work safely.
etcd · Leases · Fencing Tokens · Distributed Lock
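Fencing tokens are easiest to understand from the storage side. In this Python sketch, a hypothetical `LockService` hands out a strictly increasing token with every lock grant, and a hypothetical `FencedStore` rejects any write carrying a token older than the newest one it has seen. Lease expiry is not modeled, and neither class reflects the etcd API.

```python
import itertools


class LockService:
    """Stand-in for a coordination service; lease expiry is not modeled."""

    def __init__(self):
        self._tokens = itertools.count(1)

    def acquire(self) -> int:
        # Each grant returns a larger token than any previous grant.
        return next(self._tokens)


class FencedStore:
    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token: int, key: str, value: str) -> bool:
        if token < self.highest_token:
            return False  # a newer lock holder has already written
        self.highest_token = token
        self.data[key] = value
        return True


locks, store = LockService(), FencedStore()

old_token = locks.acquire()  # client 1 takes the lock, then stalls (e.g. GC pause)
new_token = locks.acquire()  # lease expires; client 2 is granted the lock

accepted = store.write(new_token, "job", "done by client 2")
rejected = store.write(old_token, "job", "done by client 1")  # stale holder wakes up
print(accepted, rejected, store.data["job"])
```

The lock alone cannot stop a paused client from acting after its lease has expired. The check has to happen at the resource being protected, which is why the token is passed along with every write.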