
Concepts

Distributed Systems

The failure modes, trade-offs, and coordination problems that emerge the moment you have more than one node.

The Problem
Why distributed systems are fundamentally harder than single-node ones — partial failures, network uncertainty, and no shared clock.
Partial Failures, Network Uncertainty, No Global Clock, Fallacies
Network Partitions
What happens when nodes can't talk. Split brain, quorum decisions, and why partition handling defines your entire consistency strategy.
Split Brain, Quorum, Fencing, Partition Recovery
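As a rough illustration of the quorum decision mentioned above, the sketch below assumes a simple majority rule; the function names `majority` and `has_quorum` are hypothetical and not taken from any particular system.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def has_quorum(reachable: int, cluster_size: int) -> bool:
    """True if this side of a partition may keep accepting writes.

    `reachable` counts the local node plus every peer it can still contact.
    """
    return reachable >= majority(cluster_size)
```

With five nodes split 3/2, only the three-node side keeps quorum. With four nodes split 2/2, neither side does, which is one reason odd cluster sizes are usually preferred: two sides can never both hold a majority, so split brain is avoided.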
CAP Theorem
Consistency vs availability when a partition happens. CP vs AP — and why every distributed system is forced to choose during a split.
CAP, CP Systems, AP Systems, Partition Tolerance
PACELC
Extends CAP to cover normal operation. Even without a partition, you still trade latency against consistency — PACELC names that trade-off.
PACELC, Latency vs Consistency, PA/EL, PC/EC
Consistency Models
Strong, eventual, causal, and monotonic. What "up to date" means in a distributed system and when each model is the right call.
Strong Consistency, Eventual, Causal, Monotonic
Replication
Sync vs async replication, replication lag, failover, and multi-primary — how data stays consistent across nodes.
Sync vs Async, Replication Lag, Failover, Multi-Primary
Sharding
Shard keys, consistent hashing, cross-shard joins, resharding, and the over-sharding trap that kills performance.
Shard Key, Consistent Hashing, Resharding, Cross-Shard Joins
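To make the consistent-hashing idea concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the `vnodes` parameter, and the choice of MD5 as the hash are all assumptions made for illustration; real systems differ in the details.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # MD5 is used only to spread keys evenly; it is not a security choice.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, vnodes=64):
        # Each physical node owns `vnodes` points on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # Walk clockwise to the first point after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]
```

The property that matters for resharding is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes.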
Consensus
Raft leader election, log replication, Paxos phases, ZooKeeper, and Redis distributed locks — how nodes agree on a single value.
Raft, Paxos, Leader Election, ZooKeeper
Distributed Clocks
Clock drift, NTP, Lamport clocks, vector clocks, and TrueTime — how distributed systems reason about time without a shared clock.
Lamport Clocks, Vector Clocks, NTP, TrueTime
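A Lamport clock is small enough to sketch in a few lines. The class and method names below are hypothetical; the rule they encode is the standard one: increment on every local event, and on receive jump past the larger of the local and incoming timestamps.

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self) -> int:
        """Record a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending is itself an event; the returned value travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Merge the sender's timestamp, then count the receive as an event."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

If event A causally precedes event B, A's timestamp is smaller than B's. The converse does not hold, which is the gap vector clocks are meant to close.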
CRDTs
Conflict-free replicated data types — the math that lets replicas merge without coordination, and why locks fail at global scale.
G-Counter, Operational Transform, OT vs CRDT, Merge Semantics
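The G-Counter is probably the simplest CRDT to sketch. In the version below (class and attribute names are illustrative assumptions), each replica increments only its own slot, and merge takes the per-replica maximum. Because max is commutative, associative, and idempotent, replicas converge regardless of merge order or repetition.

```python
class GCounter:
    """Grow-only counter: one non-decreasing slot per replica."""

    def __init__(self, replica_id: str):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount: int = 1) -> None:
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> None:
        # Element-wise max; safe to apply in any order, any number of times.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)
```

Two replicas that each increment locally and then exchange state in either direction end up reporting the same total, with no lock or coordinator involved.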
Failure Detection
Heartbeats, gossip protocol, and the phi accrual failure detector — how nodes know when other nodes are dead.
Heartbeats, Gossip Protocol, Phi Accrual, Timeouts
Merkle Trees
Hash trees for efficient data comparison across replicas. Anti-entropy repair and how Dynamo-style systems detect divergence.
Hash Trees, Anti-Entropy, Replica Sync, Buckets
Coordination Services
etcd, TTL-based leases, fencing tokens, and the difference between distributed locks and job tracking.
etcd, Leases, Fencing Tokens, Lock vs Job Tracking
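The role of a fencing token can be sketched from the storage side. This sketch assumes the lock service hands out a monotonically increasing token with each lease grant; `FencedStore` is a hypothetical stand-in for whatever resource the lock protects, not an etcd API.

```python
class FencedStore:
    """Resource that rejects writes carrying a token older than one already seen."""

    def __init__(self):
        self.highest_token = 0
        self.value = None

    def write(self, token: int, value) -> bool:
        if token < self.highest_token:
            # The writer's lease expired and a newer holder has already written.
            return False
        self.highest_token = token
        self.value = value
        return True
```

If a client holding token 1 stalls past its lease and a second client writes with token 2, the first client's late write is refused, so an expired lock holder cannot overwrite newer data.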