Batching

Kafka achieves massive scale not just by where it sends data, but how it sends it. By grouping messages into Batches, Kafka reduces the overhead on the CPU, the Network Card (NIC), and the Disk.

The "Postal Service" Problem (Overhead)#

Imagine you have 100 letters to send. - Individual Sending: Driving to the post office 100 times. You spend more time "driving" (network overhead) than "mailing" (data transfer). - Batch Sending: Waiting until you have 100 letters, putting them in one box, and making one trip.

At Google scale (100,000 clicks/sec), sending individual messages would bombard the Network Card with too many "packets per second," causing the system to choke.

How Batching Works in the Producer#

When your code calls producer.send(), the Kafka Library doesn't send it immediately. It places the message in a local memory buffer (RAM).

The Producer sends the batch when: 1. Batch Size is reached: (e.g., the "box" hits 16 KB). 2. Linger Time is reached: (e.g., the producer has waited 5ms, and the box isn't full yet, but we can't wait any longer).

graph LR
    App[Your App Code] -->|send| Buffer[Producer RAM Buffer]
    Buffer -->|Wait 5ms or 16KB| Batch[Full Batch]
    Batch -->|One Network Trip| Broker[Kafka Broker]

The Trade-off: Throughput vs. Durability#

Batching is a "speed vs. safety" decision.

1. High Throughput (Ad Clicks)#

Settings: linger.ms = 5, batch.size = 16KB.
The Risk: If the App Server crashes while 9 clicks are in RAM, they are lost forever.
The Reward: We can handle 100,000 clicks/sec because the Network and CPU are relaxed.

2. High Durability (Bank Transfers)#

Settings: linger.ms = 0.
The Risk: System is slower and harder to scale.
The Reward: Every single cent is sent to Kafka the millisecond it happens. No data is left sitting in volatile RAM.

Why this doesn't "Bombard" your RAM#

A common concern is that batching will eat up all your server's memory. However, the math shows otherwise:

100 clicks × 1 KB each = 100 KB.
Even with 1,000 active batches, that's only 100 MB of RAM.
Most modern servers have 16 GB to 64 GB of RAM. 100 MB is a tiny fraction.

The real bottleneck at scale is almost never RAM—it is the Network Card (NIC) and the CPU interrupts. Batching uses a tiny bit of "Warehouse Space" (RAM) to save the "Warehouse Door" (Network) from being overwhelmed.

Interview framing: "To handle 100k events/sec, I'd enable batching on the producer by tuning linger.ms and batch.size. This trades a few milliseconds of latency and a small amount of RAM for a massive increase in throughput by reducing the number of network round-trips and disk I/O operations on the broker."