Log Replication Failures
The core idea
Raft only tells the client "success" after a majority of nodes have the write. Combined with Raft's election rules, this is what keeps committed data safe across the failure scenarios below.
How a write works in Raft
Every write goes through exactly these steps before the client gets a response:
Step 1 → Leader writes entry to its own WAL (uncommitted)
Step 2 → Leader sends entry to followers via AppendEntries RPC
Followers write entry to their WAL (uncommitted)
Step 3 → Leader receives majority acks → marks entry committed in its WAL
Step 4 → Leader applies entry to state machine
Step 5 → Leader replies "success" to client
Step 6 → Leader sends commit notification to followers (its commit index, carried on the next AppendEntries or heartbeat) → they commit + apply too
The client only hears "success" at Step 5, after a majority has the entry and the leader has committed it. This is the guarantee everything else is built on.
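The ordering of Steps 1 to 5 can be sketched as a toy leader. This is a minimal illustration, not a real Raft implementation: the `Leader` class, its fields, and the `follower_acks` parameter (standing in for AppendEntries responses) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    cluster_size: int
    log: list = field(default_factory=list)    # stands in for the WAL
    commit_index: int = 0                       # 1-based; 0 means nothing committed
    state: dict = field(default_factory=dict)   # toy key-value state machine

    def majority(self) -> int:
        return self.cluster_size // 2 + 1

    def write(self, key, value, follower_acks: int) -> str:
        # Step 1: append to the leader's own log (uncommitted).
        self.log.append((key, value))
        index = len(self.log)
        # Steps 2-3: the leader's own copy counts toward the majority.
        copies = 1 + follower_acks
        if copies < self.majority():
            return "pending"  # the client must not hear "success" yet
        # Step 4: committing an index commits everything before it,
        # so apply all entries that were not yet applied, in order.
        for k, v in self.log[self.commit_index:index]:
            self.state[k] = v
        self.commit_index = index
        # Step 5: only now may the client hear "success".
        return "success"
```

In a 3-node cluster one follower ack is enough (`Leader(3).write("x", 1, follower_acks=1)` returns `"success"`), while in a 5-node cluster the same single ack leaves the write pending.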
Case 1: Leader crashes before replicating (between Step 1 and Step 2)
Leader wrote to its own WAL but crashed before sending to any follower. No follower has any trace of this entry.
sequenceDiagram
participant C as Client
participant A as Leader (A)
participant B as Follower B
participant C2 as Follower C
C->>A: Write request
A->>A: Write to WAL (uncommitted)
Note over A: Crashes here ✗
Note over B,C2: No entry at all
B->>B: Timeout → start election (Term 2)
B-->>C2: Request vote
C2-->>B: Vote granted
Note over B: New leader (Term 2)
A->>A: Comes back → sees Term 2 → steps down as follower
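B->>A: Sync → A's uncommitted entry discarded
Result: Entry discarded. Client got no response, so it retries. Nothing from the first attempt survived, so only the retry takes effect. Idempotency matters in the cases where the first attempt did survive, because the client cannot tell which failure it hit.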
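One common way to make retries idempotent is to tag each request with a client ID and a request number, and have the state machine remember which requests it has already applied. The sketch below assumes that scheme; `DedupStateMachine` and its method names are hypothetical and chosen only for illustration.

```python
class DedupStateMachine:
    """Toy counter store that applies each (client_id, request_id) at most once."""

    def __init__(self):
        self.data = {}
        self.seen = {}  # (client_id, request_id) -> result returned the first time

    def apply(self, client_id, request_id, key, delta):
        request = (client_id, request_id)
        if request in self.seen:
            # Duplicate caused by a retry: return the cached result, change nothing.
            return self.seen[request]
        self.data[key] = self.data.get(key, 0) + delta
        self.seen[request] = self.data[key]
        return self.data[key]
```

If the client retries request 1 after a leader crash, the second `apply("c1", 1, "counter", 5)` returns the same result as the first and the counter is incremented only once.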
Case 2: Leader crashes after replicating but before committing (between Step 2 and Step 3)
Leader sent the entry to followers. Followers wrote it to their WAL as uncommitted. Leader crashes before getting majority acks.
sequenceDiagram
participant C as Client
participant A as Leader (A)
participant B as Follower B
participant C2 as Follower C
C->>A: Write request
A->>A: Write to WAL (uncommitted)
A->>B: AppendEntries (uncommitted)
A->>C2: AppendEntries (uncommitted)
B->>B: Write to WAL (uncommitted)
C2->>C2: Write to WAL (uncommitted)
Note over A: Crashes before getting acks ✗
B->>B: Timeout → start election (Term 2)
B-->>C2: Request vote
C2-->>B: Vote granted
Note over B: New leader (Term 2)
Note over B: Sees uncommitted entry in WAL
Note over B: Replicates its log plus a Term 2 no-op
B->>B: No-op on majority → commits it and the older entry
B->>C2: Commit notification → C2 commits
A->>A: Comes back → B syncs A → A commits too
Result: Entry is saved. The new leader already holds the entry, replicates it, and commits it together with its first Term 2 entry. The client got no response and retries; idempotency handles the duplicate.
Why can the new leader auto-commit?
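Because Raft's election rule only lets a candidate win if its log is at least as up-to-date as the log of every node that votes for it. In this scenario both followers hold the entry, so whichever one wins has it. The new leader does not commit an old-term entry just by counting copies: it replicates its log, and once an entry from its own term (typically a no-op appended when the term starts) is stored on a majority, every earlier entry, including this one, is committed with it.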
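The vote check that keeps a stale node from winning an election compares the last log entry of the candidate and the voter: the higher last term wins, and on equal terms the longer log wins. A minimal sketch of that comparison follows; the function and parameter names are illustrative, not taken from any particular library.

```python
def log_is_up_to_date(candidate_last_term, candidate_last_index,
                      voter_last_term, voter_last_index):
    """Return True if the voter may grant its vote, judged by log freshness alone."""
    if candidate_last_term != voter_last_term:
        # The log whose last entry has the later term is more up-to-date.
        return candidate_last_term > voter_last_term
    # Same last term: the longer log is more up-to-date.
    return candidate_last_index >= voter_last_index
```

In Case 2, Follower B (last entry: term 1, index 1) asks Follower C (also term 1, index 1) for a vote and gets it. A node with an empty log (term 0, index 0) asking the same voter would be refused.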
Case 3: Leader crashes after committing but before notifying followers (between Step 5 and Step 6)
Leader committed the entry and replied "success" to the client. It crashes just before sending the commit notification, so followers still have the entry as uncommitted.
sequenceDiagram
participant C as Client
participant A as Leader (A)
participant B as Follower B
participant C2 as Follower C
C->>A: Write request
A->>A: Write to WAL (uncommitted)
A->>B: AppendEntries
A->>C2: AppendEntries
B->>B: Write to WAL (uncommitted)
C2->>C2: Write to WAL (uncommitted)
B-->>A: ACK
C2-->>A: ACK
A->>A: Majority acks → commits entry
A-->>C: "success"
Note over A: Crashes before sending commit notification ✗
B->>B: Timeout → start election (Term 2)
Note over B: New leader (Term 2)
Note over B: Has the entry in its WAL → commits it with its first Term 2 entry
B->>C2: Commit notification → C2 commits
A->>A: Comes back → B syncs A → A commits too
Result: Identical outcome to Case 2. The new leader commits the pending entry. The client already got "success" and does not retry.
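In Raft, a leader advances its commit index only to an entry from its own term that is stored on a majority; entries inherited from earlier terms become committed along with it. The sketch below shows that rule under simplifying assumptions: `log_terms[i - 1]` holds the term of entry `i`, and `match_index` lists the highest replicated index per node, including the leader itself. Both names are chosen for the example.

```python
def advance_commit_index(log_terms, match_index, current_term, commit_index):
    """Return the new commit index for a leader, or the old one if nothing qualifies."""
    majority = len(match_index) // 2 + 1
    # Try the highest index first; committing it commits everything below it.
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= n)
        if replicated >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index
```

For the new leader B in Term 2 holding only the inherited Term 1 entry, `advance_commit_index([1], [1, 1, 0], 2, 0)` leaves the commit index at 0. Once B appends a Term 2 entry and a majority stores it, `advance_commit_index([1, 2], [2, 2, 0], 2, 0)` moves the commit index to 2, which commits the inherited entry as well.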
The pattern across all three cases
| When leader crashes | Followers have | Outcome |
|---|---|---|
| Before replicating (Step 1→2) | Nothing | Entry discarded. Client retries. |
| After replicating, before commit (Step 2→3) | Uncommitted entry on majority | New leader auto-commits. Client retries harmlessly. |
| After committing, before notifying (Step 5→6) | Uncommitted entry on majority | New leader auto-commits. Client already got success. |
Committed data is never lost
"Committed" in Raft means majority acknowledged the entry. The new leader is always elected from majority — so committed data always survives, even permanent leader failure. The only exception is losing more than ⌊N/2⌋ nodes permanently at the same time.
What if a follower missed some entries?
Say Follower C was down during several writes and just came back. Its log is behind.
Leader: [1, 2, 3, 4, 5, 6]
Follower C: [1, 2, 3, 4]
Leader → sends index 6 (expects C to already hold index 5) → C rejects: "I only have up to index 4"
Leader → backs up → sends index 5 → C appends → sends index 6 → C appends
Follower C: [1, 2, 3, 4, 5, 6] ✓
Followers never skip entries. They always catch up sequentially. No gaps allowed — ever.
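The catch-up above follows from the AppendEntries consistency check: each request names the entry just before the new ones, and the follower rejects the request unless it holds that entry. A simplified sketch, assuming logs are lists of `(term, command)` tuples with 1-based indexes and that the leader backs up one index per rejection (function names are illustrative):

```python
def append_entries(log, prev_index, prev_term, entries):
    """Follower side: accept entries only if the entry at prev_index matches."""
    if prev_index > len(log):
        return False  # gap: the follower is missing the preceding entry
    if prev_index > 0 and log[prev_index - 1][0] != prev_term:
        return False  # the preceding entry comes from a different term
    for offset, entry in enumerate(entries):
        i = prev_index + offset  # 0-based slot for this entry
        if i < len(log) and log[i][0] != entry[0]:
            del log[i:]  # conflicting suffix is discarded
        if i >= len(log):
            log.append(entry)
    return True


def sync_follower(leader_log, follower_log):
    """Leader side: back up until the follower accepts; return the index that worked."""
    next_index = len(leader_log) + 1
    while True:
        prev_index = next_index - 1
        prev_term = leader_log[prev_index - 1][0] if prev_index > 0 else 0
        if append_entries(follower_log, prev_index, prev_term, leader_log[prev_index:]):
            return next_index
        next_index -= 1  # rejected: try one entry earlier
```

With a six-entry leader log and a follower holding the first four entries, `sync_follower` is rejected twice, succeeds with `next_index` 5, and leaves the follower's log identical to the leader's. Real implementations often let the follower report how far back to jump instead of stepping one index at a time.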