
Concepts

Distributed Systems

The failure modes, trade-offs, and coordination problems that emerge the moment you have more than one node.

Why distributed systems are fundamentally harder than single-node ones — partial failures, network uncertainty, and no shared clock.

Partial Failures · Network Uncertainty · No Global Clock · Fallacies

Network Partitions

What happens when nodes can't talk. Split brain, quorum decisions, and why partition handling defines your entire consistency strategy.

Split Brain · Quorum · Fencing · Partition Recovery
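As a minimal sketch of the quorum decision mentioned in this card, the check below assumes a simple strict-majority rule. The function name `has_quorum` is hypothetical; real systems may use weighted or flexible quorums instead.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Return True if this side of a partition holds a strict majority."""
    return reachable > cluster_size // 2


# A 5-node cluster split 3/2: only the 3-node side may keep accepting writes.
majority_side = has_quorum(3, 5)
minority_side = has_quorum(2, 5)

# A 4-node cluster split 2/2: neither side has a majority, so both must stop.
even_split = has_quorum(2, 4)
```

Because at most one side can ever hold a strict majority, two sides of a partition cannot both elect a leader, which is what rules out split brain under this rule.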

Consistency vs availability when a partition happens. CP vs AP — and why every distributed system is forced to choose during a split.

CAP · CP Systems · AP Systems · Partition Tolerance

Extends CAP to cover normal operation. Even without a partition, you still trade latency against consistency — PACELC names that trade-off.

PACELC · Latency vs Consistency · PA/EL · PC/EC

Consistency Models

Strong, eventual, causal, and monotonic. What "up to date" means in a distributed system and when each model is the right call.

Strong Consistency · Eventual · Causal · Monotonic

Sync vs async replication, replication lag, failover, and multi-primary — how data stays consistent across nodes.

Sync vs Async · Replication Lag · Failover · Multi-Primary

Shard keys, consistent hashing, cross-shard joins, resharding, and the over-sharding trap that kills performance.

Shard Key · Consistent Hashing · Resharding · Cross-Shard Joins
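To make the consistent hashing idea in this card concrete, here is a small illustrative ring. The class name `HashRing`, the virtual-node count, and the choice of SHA-256 are all assumptions made for the sketch, not something this page prescribes.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed at `vnodes` positions to smooth the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # The owner is the first ring position clockwise from the key's hash.
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[i][1]


keys = [f"user:{i}" for i in range(1000)]
before = HashRing(["a", "b", "c"])
after = HashRing(["a", "b", "c", "d"])

# Keys whose owner changed when node "d" joined the ring.
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
```

The property worth noticing is that every key in `moved` lands on the new node `d`; keys never shuffle between the existing nodes, which is what keeps resharding cheap compared with `hash(key) % n`.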

Raft leader election, log replication, Paxos phases, ZooKeeper, and Redis distributed locks — how nodes agree on a single value.

Raft · Paxos · Leader Election · ZooKeeper

Distributed Clocks

Clock drift, NTP, Lamport clocks, vector clocks, and TrueTime — how distributed systems reason about time without a shared clock.

Lamport Clocks · Vector Clocks · NTP · TrueTime
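The Lamport clock named in this card can be sketched in a few lines. The class and method names below are hypothetical; the update rules (increment on a local event, take the maximum plus one on receive) are the standard ones.

```python
class LamportClock:
    """Logical clock: orders events without relying on wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        # Any local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Jump past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()
a.tick()
sent_at = a.send()
received_at = b.receive(sent_at)
```

A Lamport timestamp guarantees that if one event causally precedes another, its timestamp is smaller. The converse does not hold, which is the gap vector clocks are meant to close.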

Conflict-free replicated data types — the math that lets replicas merge without coordination, and why locks fail at global scale.

G-Counter · Operational Transform · OT vs CRDT · Merge Semantics
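A G-Counter is probably the smallest CRDT to illustrate coordination-free merging. The sketch below uses a hypothetical `GCounter` class: each replica increments only its own slot, and merge takes the per-replica maximum.

```python
class GCounter:
    """Grow-only counter CRDT (illustrative sketch)."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> highest count seen from that replica

    def increment(self, n: int = 1) -> None:
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


a, b = GCounter("a"), GCounter("b")
a.increment(3)
b.increment(2)

a.merge(b)
b.merge(a)
a.merge(b)  # merging again changes nothing
```

Because the merge is idempotent and order-independent, replicas can exchange state in any order, any number of times, and still converge on the same value.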

Failure Detection

Heartbeats, gossip protocol, and the phi accrual failure detector — how nodes know when other nodes are dead.

Heartbeats · Gossip Protocol · Phi Accrual · Timeouts
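As a rough illustration of the phi accrual idea, the sketch below computes a suspicion level from heartbeat history instead of a fixed timeout. It assumes heartbeat gaps follow an exponential distribution, which is a simplification (the original detector models them with a normal distribution), and the function name `phi` is hypothetical.

```python
import math


def phi(elapsed: float, intervals: list) -> float:
    """Suspicion level: -log10 of the chance a heartbeat is still on its way.

    Assumes exponentially distributed gaps, so
    P(gap > elapsed) = exp(-elapsed / mean).
    """
    mean = sum(intervals) / len(intervals)
    return elapsed / (mean * math.log(10))


history = [1.0, 1.0, 1.0]         # observed gaps between heartbeats, in seconds
on_time = phi(1.0, history)       # one mean interval of silence
suspicious = phi(10.0, history)   # ten mean intervals of silence
```

The caller then picks a threshold on phi rather than on raw seconds, so the detector adapts automatically when heartbeats slow down on a congested network.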

Hash trees for efficient data comparison across replicas. Anti-entropy repair and how Dynamo-style systems detect divergence.

Hash Trees · Anti-Entropy · Replica Sync · Buckets
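The hash-tree comparison in this card can be sketched as a recursive descent: compare subtree hashes and only drill into halves that differ. The helper names are hypothetical, and the sketch assumes both replicas hold the same non-empty number of buckets. For brevity it recomputes hashes on every call, whereas a real system would cache the tree and exchange only hashes over the network.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def subtree_hash(buckets, lo, hi):
    """Hash of the tree covering buckets[lo:hi]."""
    if hi - lo == 1:
        return _h(buckets[lo])
    mid = (lo + hi) // 2
    return _h(subtree_hash(buckets, lo, mid) + subtree_hash(buckets, mid, hi))


def diverged_buckets(a, b, lo=0, hi=None):
    """Indices of buckets that differ, skipping every subtree whose hashes match."""
    hi = len(a) if hi is None else hi
    if subtree_hash(a, lo, hi) == subtree_hash(b, lo, hi):
        return []  # identical subtree: nothing below needs comparing
    if hi - lo == 1:
        return [lo]
    mid = (lo + hi) // 2
    return diverged_buckets(a, b, lo, mid) + diverged_buckets(a, b, mid, hi)


replica_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_b = [b"k1=v1", b"k2=v2", b"k3=stale", b"k4=v4"]
needs_repair = diverged_buckets(replica_a, replica_b)
```

When replicas are mostly in sync, a matching root hash ends the comparison immediately, and a single stale bucket is located in a logarithmic number of hash comparisons.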

Coordination Services

etcd, TTL-based leases, fencing tokens, and the difference between distributed locks and job tracking.

etcd · Leases · Fencing Tokens · Lock vs Job Tracking
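A fencing token only helps if the protected resource checks it. The sketch below shows that check on a hypothetical `FencedStore`; it assumes the lock service hands out monotonically increasing tokens and does not model etcd itself.

```python
class FencedStore:
    """Storage that rejects writes carrying a token older than one already seen."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token: int, value: str) -> bool:
        if token < self.highest_token:
            # The writer's lease expired and a newer holder has already written.
            return False
        self.highest_token = token
        self.value = value
        return True


store = FencedStore()
first = store.write(33, "from client 1")   # client 1 holds the lock
second = store.write(34, "from client 2")  # lease expired, client 2 took over
late = store.write(33, "stale write")      # client 1 wakes up after a long pause
```

The late write is refused even though client 1 still believes it holds the lock, which is the failure a TTL-based lease alone cannot prevent.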