Schema Registry
What It Is#
A Schema Registry is a central store for message schemas. It acts as a gatekeeper — producers must register and validate their schema before publishing. Consumers fetch the schema to deserialize messages.
Incompatible schema changes are rejected at registration time — before they ever reach Kafka.
How It Works#
Instead of embedding the full schema in every message (expensive), the message carries only a tiny schema ID:
Consumer receives the message, looks up schema 101 from the registry, and deserializes.
Full Flow: v1 → v2 Schema Change#
Setup:
Step 1 — Producer registers schema v1:
Registry assigns schema ID = 101.Step 2 — Producer publishes:
Step 3 — Consumer receives:
Producer adds currency field — schema v2:
Step 4 — Producer registers schema v2:
Registry checks compatibility: - Old consumer reads new message → field 3 missing → uses default ✅ - New consumer reads old message → field 3 unknown → ignores ✅
Compatible → schema ID = 102 assigned.
Step 5 — Producer publishes:
Step 6 — Old consumer (not yet updated) receives:
Fetches schema 102 from registry
Reads field 1 → order_id=124 ✅
Reads field 2 → amount=75.0 ✅
Field 3 unknown → ignores ✅
No crash
Registry Rejects Incompatible Changes#
Producer tries to change field 2 type from float to int32:
Producer cannot publish until they fix the schema. Bad changes never reach Kafka.
Key Properties#
| Property | Detail |
|---|---|
| Schema storage | Central registry, not in messages |
| Message overhead | Only schema ID (4 bytes) per message |
| Compatibility check | Enforced at registration, not at runtime |
| Consumer lookup | Fetch schema by ID, cache locally |
| Popular implementations | Confluent Schema Registry (for Kafka + Avro/Protobuf) |