Push Worker — Per-Channel Workers#
What Is a Push Notification?#
A push notification is a message delivered to a user's device by the operating system — not by your app directly. Your server never talks to the user's phone. Instead, you send the notification payload to an intermediary: Apple Push Notification Service (APNs) for iOS devices, and Firebase Cloud Messaging (FCM) for Android devices. These platforms maintain a persistent connection to every registered device and deliver the message on your behalf.
flowchart LR
W[Push Worker] --> APNs[APNs / FCM]
APNs --> D1[iPhone]
APNs --> D2[Android] This means your worker's job is simple: take a notification from Kafka, look up the device token for the user, and make an HTTP call to APNs or FCM. The hard part is doing this at 5M/sec.
Deriving the Numbers from APNs#
The bottleneck in the push worker is the network round trip to APNs. Let's derive how many workers and partitions we actually need from first principles.
APNs round trip latency: ~50ms
APNs supports HTTP/2, which allows multiplexing — multiple requests sent over a single connection simultaneously without waiting for previous responses. This is the key to high throughput without needing thousands of threads.
Concurrent requests per connection: HTTP/2 supports up to 1000 concurrent streams per connection. So one connection to APNs can have 1000 in-flight requests at the same time.
Connections per worker instance: A single worker instance can maintain 10 parallel connections to APNs:
Push volume — 70% of 5M/sec:
Worker instances needed at 3.5M/sec:
Partitions needed: Since one Kafka partition maps to one consumer instance, and we need 18 consumer instances:
Batch Size#
18 workers consuming 18 partitions need to drain 3.5M messages/sec total. Each worker handles:
Each worker reads in batches from Kafka. With 10 connections × 1000 concurrent requests = 10,000 in-flight APNs calls at any moment per worker. So the natural batch size is 10,000 messages per poll.
At 50ms per batch round trip:
This confirms the batch size of 10K matches the throughput target. No need for 10K threads — async I/O over HTTP/2 handles 10K concurrent in-flight requests with a small thread pool (200-500 threads), because threads spend most of their time waiting on network responses, not doing CPU work.
Async I/O vs Threads
You don't need 10K threads to process 10K concurrent requests. With async I/O and HTTP/2 multiplexing, one thread can have hundreds of in-flight requests simultaneously — it sends a request, registers a callback, and immediately picks up the next message instead of blocking. A thread pool of 200-500 threads per worker instance is sufficient to keep 10K requests in-flight at all times.
Deduplication — Bloom Filter#
The worker must deduplicate notifications before sending. At-least-once delivery from Kafka means the same message can be consumed twice on crash recovery. Sending a duplicate push notification is annoying but tolerable — but we should make a best-effort attempt to avoid it.
The naive approach is a DB lookup: "has notification_id been processed before?" At 5M/sec that's 5M DB reads/sec — same problem we've seen before.
The fix is a Bloom filter in Redis:
- Before sending, check if
notification_idexists in the bloom filter - If yes — likely a duplicate, skip it
- If no — send the notification, then add
notification_idto the bloom filter
Bloom filters have no false negatives (a processed notification is never missed) but can have false positives (occasionally a fresh notification is incorrectly flagged as duplicate). The false positive rate is tunable — at 1% false positive rate, 1 in 100 notifications might be skipped on a false flag. Acceptable for push notifications.
Why not Redis SET for deduplication?
A Redis SET storing every notification_id would grow unbounded — 5M entries/sec × 90 day retention = billions of keys. A bloom filter uses a fixed-size bit array regardless of how many items are added. Memory-efficient and O(1) lookups.
Full Push Worker Flow#
flowchart TD
K[Kafka - notifications-push] -->|batch of 10K| W[Push Worker]
W --> BF{Bloom Filter - Redis}
BF -->|duplicate| SKIP[Skip - commit offset]
BF -->|new| PREF[Check Preferences - Redis]
PREF -->|opted out| SKIP
PREF -->|opted in| TOKEN[Fetch Device Token]
TOKEN --> APNS[APNs / FCM - HTTP2]
APNS -->|success| DB[Update status DELIVERED - Cassandra]
APNS -->|failure| RETRY[Retry / DLQ]
DB --> ACK[Commit Kafka Offset] Step by step: 1. Worker polls Kafka for a batch of 10K messages 2. For each message, check bloom filter — skip if likely duplicate 3. Check Redis for user preferences — skip if user opted out of push 4. Fetch device token for the user (from a device registry, covered in deep dive) 5. Send async HTTP/2 request to APNs or FCM 6. On success — write DELIVERED status to Cassandra 7. On failure — send to retry queue / DLQ (covered in retry deep dive) 8. After full batch processed — commit Kafka offset
Summary#
| Property | Value |
|---|---|
| APNs latency | ~50ms |
| Concurrent requests per connection | 1000 |
| Throughput per connection | 20K/sec |
| Connections per worker | 10 |
| Throughput per worker | 200K/sec |
| Worker instances needed | 18 |
| Kafka partitions | 18 |
| Batch size | 10K messages |
| Deduplication | Bloom filter in Redis |