
Lambda vs Kappa — Comparison#

Lambda runs a batch pipeline and a stream pipeline in parallel and merges their results: accurate, but operationally expensive because there are two codebases to maintain. Kappa runs a stream pipeline only and replays history when needed: simpler, but it requires a stream processor that can handle replay at scale. The choice comes down to whether batch-level accuracy is a hard requirement or whether operational simplicity is worth more.

Side by Side#

| | Lambda | Kappa |
|---|---|---|
| Pipelines | Two (batch + stream) | One (stream only) |
| Codebases | Two | One |
| Consistency | Batch and stream can drift | Single pipeline, always consistent |
| Accuracy | Batch is exact, stream is approximate | One logic, consistent accuracy |
| Latency | Low (speed layer) + delayed correction (batch) | Low (stream handles both) |
| Historical reprocessing | Spark reruns on S3 | Replay Kafka/S3 through stream processor |
| Operational complexity | High: maintain two pipelines | Low: one pipeline |
| Storage requirement | S3 + Kafka | S3 (source of truth) + Kafka |
| Use when | Batch accuracy non-negotiable | Simplicity matters, stream can handle replay |
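The Lambda behaviour of "low latency plus delayed correction" can be illustrated with a toy serving layer. This is a minimal sketch under stated assumptions, not a production design: the event shape, the `cutoff` parameter (the timestamp up to which the last batch run has processed), and all function names are hypothetical. The batch layer deduplicates by `event_id` and is treated as exact; the speed layer skips deduplication for speed, so it can overcount until the next batch run catches up.

```python
from collections import Counter

# Hypothetical click events; "e1" and "e3" were each delivered twice.
EVENTS = [
    {"event_id": "e1", "ad_id": "ad-1", "ts": 10},
    {"event_id": "e1", "ad_id": "ad-1", "ts": 10},
    {"event_id": "e2", "ad_id": "ad-2", "ts": 20},
    {"event_id": "e3", "ad_id": "ad-1", "ts": 110},
    {"event_id": "e3", "ad_id": "ad-1", "ts": 110},
]


def batch_view(events, cutoff):
    """Batch layer: exact counts for events before `cutoff`, deduplicated by event_id."""
    seen = set()
    counts = Counter()
    for e in events:
        if e["ts"] < cutoff and e["event_id"] not in seen:
            seen.add(e["event_id"])
            counts[e["ad_id"]] += 1
    return counts


def speed_view(events, cutoff):
    """Speed layer: counts for events at or after `cutoff`, with no dedup (approximate)."""
    return Counter(e["ad_id"] for e in events if e["ts"] >= cutoff)


def serving_view(events, cutoff):
    """Serving layer: batch result for old data plus speed result for recent data."""
    return batch_view(events, cutoff) + speed_view(events, cutoff)


before = serving_view(EVENTS, cutoff=100)  # last batch run covered ts < 100
after = serving_view(EVENTS, cutoff=200)   # next batch run covers everything
print(dict(before))  # {'ad-1': 3, 'ad-2': 1}  (duplicate e3 still counted)
print(dict(after))   # {'ad-1': 2, 'ad-2': 1}  (corrected by the batch layer)
```

The two layers implement the same metric with different code, which is where the "batch and stream can drift" row comes from.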

Decision Rule#

Use Lambda when:

- Regulatory or financial accuracy is required (billing, compliance, tax)
- Batch and stream results need to be independently verifiable
- Your stream processor cannot replay years of history at scale

Use Kappa when:

- You want one codebase and one pipeline to maintain
- S3 is your source of truth and you can replay through Kafka
- Your stream processor (Flink, Kafka Streams) is capable enough for both live and replay
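The rule above can be condensed into a small function. This is only an illustrative simplification: the function name and its three boolean parameters are hypothetical, and a real decision would weigh more factors (team size, cost, existing infrastructure).

```python
def choose_architecture(
    exact_accuracy_required: bool,
    independent_verification_required: bool,
    stream_can_replay_history: bool,
) -> str:
    """Return "lambda" or "kappa" following the decision rule in this section."""
    # Any hard accuracy or verifiability requirement points to a separate batch layer.
    if exact_accuracy_required or independent_verification_required:
        return "lambda"
    # Without a stream processor that can replay history, Kappa is not viable.
    if not stream_can_replay_history:
        return "lambda"
    # Otherwise prefer the single-pipeline option.
    return "kappa"


# Monthly billing: exact numbers are required.
print(choose_architecture(True, False, True))    # lambda
# News feed analytics: no hard accuracy requirement, replay is feasible.
print(choose_architecture(False, False, True))   # kappa
```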


Interview Answer Template#

"For the real-time dashboard I'd use a stream processor — Flink consuming from Kafka, results updated every second. For the monthly billing report I need exact numbers, so I'd run a nightly Spark batch job against the raw event log in S3. This is Lambda architecture — speed layer for low latency, batch layer for accuracy, serving layer merges both. If operational simplicity is a priority and we're okay with Kappa, we could drop the batch pipeline and replay historical events from S3 through the same stream processor using a new consumer group from offset 0."


Where These Appear in Interviews#

| System | Architecture |
|---|---|
| Ad Click Aggregation | Lambda: real-time approximate counts + nightly exact reconciliation |
| Billing / Payments | Lambda: batch for exact invoicing, stream for live alerts |
| News Feed Analytics | Kappa: stream handles both live feed and historical replay |
| Fraud Detection | Kappa: single stream processor, replay to retrain on historical patterns |
| Log Analytics | Either: depends on accuracy requirements |