Prefetch & Fairness
Prefetch controls how many unacknowledged messages RabbitMQ can send to a consumer at once before waiting for ACKs. Without it, a single fast consumer can hoard the entire queue while slower workers sit idle — or a single slow consumer can hold a huge backlog nobody else can touch.
The hoarding problem
You have 3 inventory workers pulling from the same queue. 1000 order events are waiting.
Without prefetch, RabbitMQ pushes messages as fast as the consumer can accept them. Worker A connects first and immediately buffers hundreds of messages in memory. Workers B and C connect a second later and find almost nothing left to pick up.
No prefetch limit:
Worker A: [msg_1 ... msg_847] ← holding 847 messages
Worker B: [msg_848 ... msg_923]
Worker C: [msg_924 ... msg_1000]
Worker A is slow (each order takes 200ms to process)
Workers B and C finish their batch in 5 seconds
Workers B and C now sit idle while 800+ messages are stuck in Worker A's buffer
Those messages are unacked and in-flight — RabbitMQ won't send them to anyone else.
Setting prefetch
Call channel.basicQos(1) before basicConsume. It tells RabbitMQ not to send this consumer a new message until it has ACKed the current one.
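// Full setup
channel.basicQos(1); // at most one unacked message per consumer
// autoAck = false: the consumer must ACK each message explicitly
channel.basicConsume(queueName, false, (consumerTag, delivery) -> {
    processOrder(delivery.getBody());
    // multiple = false: ACK only this delivery
    channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
}, consumerTag -> {});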
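One way to picture what the prefetch limit does is as a counter of unacked deliveries that the broker checks before each send. The sketch below is a conceptual model only, not RabbitMQ's internal implementation; the class name PrefetchWindow and its methods are hypothetical.

```java
// Conceptual model of a per-consumer prefetch window (hypothetical class,
// not part of the RabbitMQ client library).
final class PrefetchWindow {
    private final int prefetch; // maximum unacked messages allowed
    private int unacked;        // messages delivered but not yet ACKed

    PrefetchWindow(int prefetch) {
        if (prefetch < 1) {
            throw new IllegalArgumentException("this sketch models prefetch >= 1 only");
        }
        this.prefetch = prefetch;
    }

    /** Broker side: returns true if one more message may be delivered now. */
    boolean tryDeliver() {
        if (unacked >= prefetch) {
            return false; // window full, broker must wait for an ACK
        }
        unacked++;
        return true;
    }

    /** Consumer side: an ACK frees one slot in the window. */
    void ack() {
        if (unacked == 0) {
            throw new IllegalStateException("nothing to ACK");
        }
        unacked--;
    }
}
```

With new PrefetchWindow(1), the first tryDeliver() succeeds, the second is refused, and delivery resumes only after ack() is called.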
Prefetch = 1: fair dispatch
Each worker holds at most one message at a time. Whoever finishes first gets the next one.
Queue: [order_1][order_2][order_3][order_4][order_5]
Worker A gets order_1 (takes 500ms — it's a large order)
Worker B gets order_2 (takes 100ms — small order)
Worker C gets order_3 (takes 100ms — small order)
Worker B finishes first → gets order_4
Worker C finishes next → gets order_5
Worker A still processing order_1
Result: work distributed by actual throughput, not by who connected first
This is the right setting for jobs with variable processing time — order fulfilment, payment processing, video transcoding. Slow workers don't starve fast ones.
High prefetch: throughput over fairness
With prefetch = 50, each worker can buffer up to 50 unacknowledged messages at once. That means fewer round trips to the broker and higher throughput.
Worker A: [order_1 ... order_50]
Worker B: [order_51 ... order_100]
Worker C: [order_101 ... order_150]
For small, fast, homogeneous jobs, such as sending push notifications or writing log entries, this is fine: each message takes milliseconds and workers process at similar speeds, so the batching is worth it.
But for variable jobs, fairness breaks:
Worker A is slow, still holding order_15 through order_50
Worker B finished all 50, now idle
Worker C finished all 50, now idle
36 messages stuck in Worker A's buffer that B and C could finish in seconds
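The difference between prefetch = 1 and prefetch = 50 with one slow worker can be sketched with a small simulation. Everything here is an assumption made for illustration: the class PrefetchSimulation is hypothetical, each worker takes a constant time per message (200 ms for the slow one, 20 ms for the fast ones), the initial delivery is round-robin, and network latency is zero, so the round-trip cost of low prefetch is not modelled.

```java
// Hypothetical simulation of prefetch-limited dispatch; not RabbitMQ code.
final class PrefetchSimulation {

    /**
     * Returns the time in ms at which the last message is ACKed.
     * perMessageMs[w] is how long worker w takes per message (assumed constant).
     */
    static long makespanMs(int messages, int prefetch, long[] perMessageMs) {
        if (prefetch < 1) {
            throw new IllegalArgumentException("this sketch models prefetch >= 1 only");
        }
        int workers = perMessageMs.length;
        int[] unacked = new int[workers];    // delivered to each worker, not yet ACKed
        long[] finishAt = new long[workers]; // when each worker's current message completes
        int queued = messages;               // still waiting in the queue

        // Initial delivery: round-robin until every worker hits its prefetch limit.
        boolean progress = true;
        while (queued > 0 && progress) {
            progress = false;
            for (int w = 0; w < workers && queued > 0; w++) {
                if (unacked[w] < prefetch) {
                    unacked[w]++;
                    queued--;
                    progress = true;
                }
            }
        }
        for (int w = 0; w < workers; w++) {
            finishAt[w] = unacked[w] > 0 ? perMessageMs[w] : Long.MAX_VALUE;
        }

        long now = 0;
        while (true) {
            // Next event: the busy worker whose current message finishes earliest.
            int next = -1;
            for (int w = 0; w < workers; w++) {
                if (unacked[w] > 0 && (next == -1 || finishAt[w] < finishAt[next])) {
                    next = w;
                }
            }
            if (next == -1) {
                return now; // nothing left in flight
            }
            now = finishAt[next];
            unacked[next]--;          // the worker ACKs its current message
            if (queued > 0) {         // the ACK frees a slot, so the broker sends one more
                unacked[next]++;
                queued--;
            }
            if (unacked[next] > 0) {
                finishAt[next] = now + perMessageMs[next];
            }
        }
    }

    public static void main(String[] args) {
        long[] speeds = {200, 20, 20}; // Worker A is slow, B and C are fast
        System.out.println(makespanMs(150, 50, speeds)); // prints 10000
        System.out.println(makespanMs(150, 1, speeds));  // prints 1600
    }
}
```

Under these assumptions, 150 messages take 10 seconds to drain with prefetch = 50, because Worker A has to work through its own 50 alone, and 1.6 seconds with prefetch = 1, because the fast workers pick up almost everything.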
The crash burst problem
High prefetch creates a second problem: when a worker crashes, all its unacked messages get redelivered at once.
prefetch = 100
Worker A crashes while holding 94 unacked messages
→ RabbitMQ redelivers all 94 to Workers B and C immediately
→ sudden spike in load on remaining workers
→ if they're already at capacity, queue backlog grows fast
With prefetch = 1, a crashed worker holds at most one unacked message, so the blast radius of a crash is one message, not a hundred.
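The size of the burst can be sketched with a tiny in-memory model of one queue and one worker. This is an illustration under stated assumptions, not broker code: the class CrashBurstModel and its methods are hypothetical, and the model simply requeues every unacked message when the worker crashes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory model of one queue and one consumer; not RabbitMQ code.
final class CrashBurstModel {
    private final Deque<Integer> ready = new ArrayDeque<>();   // waiting in the queue
    private final Deque<Integer> unacked = new ArrayDeque<>(); // in flight to the worker
    private final int prefetch;

    CrashBurstModel(int messageCount, int prefetch) {
        if (prefetch < 1) {
            throw new IllegalArgumentException("this sketch models prefetch >= 1 only");
        }
        this.prefetch = prefetch;
        for (int id = 1; id <= messageCount; id++) {
            ready.addLast(id);
        }
        fill();
    }

    /** Deliver messages until the worker reaches its prefetch limit. */
    private void fill() {
        while (unacked.size() < prefetch && !ready.isEmpty()) {
            unacked.addLast(ready.pollFirst());
        }
    }

    /** The worker finishes and ACKs its oldest message; the freed slot is refilled. */
    void ackOldest() {
        unacked.pollFirst();
        fill();
    }

    /** The worker dies: every unacked message returns to the queue. Returns the burst size. */
    int crash() {
        int burst = unacked.size();
        while (!unacked.isEmpty()) {
            ready.addFirst(unacked.pollLast()); // put back at the front, original order kept
        }
        return burst;
    }

    int readyCount() {
        return ready.size();
    }
}
```

Starting with 100 messages and prefetch = 100, six ackOldest() calls followed by crash() return 94, matching the example above. The same sequence with prefetch = 1 returns 1.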
The trade-off
prefetch = 1
→ fair distribution by processing speed
→ crash redelivers at most 1 in-flight message per worker
→ more broker round trips (each ACK triggers next delivery)
→ lower throughput for fast, tiny jobs
prefetch = 10–50
→ better throughput for fast homogeneous jobs
→ some fairness lost when workers have uneven speed
→ crash redelivers up to N messages at once
prefetch = 0 (unlimited)
→ maximum throughput
→ complete hoarding, zero fairness
→ a single slow or crashed worker can hold the entire queue
There is no universally correct value. For order processing and payment flows, use prefetch = 1. For notification dispatch or lightweight event fanout, use prefetch = 10–50.
Prefetch is per-consumer, not per-queue. With the default global = false, basicQos(1) means each consumer on this channel holds at most 1 unacked message. Consumers on other channels or connections reading from the same queue have their own prefetch limits.
Interview framing: "I set prefetch based on job variability. For long-running or uneven jobs — order processing, payment flows — I use prefetch=1 so work is distributed by actual throughput, not by who connected first. For fast homogeneous jobs I can raise it to reduce broker round trips. The hidden cost of high prefetch is the crash burst — a crashed worker redelivers all its in-flight messages at once."