Skip to content

Conversations Table Schema

The conversations table — schema design for the inbox

One query must return Alice's full conversation list, sorted by most recent first. The schema must support that without N+1 lookups.


What the inbox query needs#

Alice opens WhatsApp. Her client needs:

For user_id = alice:
  → all conversation rows
  → sorted by last_message_timestamp, most recent first
  → paginated: top 20 first, more on scroll
  → each row contains: conv_id, last_message_preview, last_ts, unread_count

Everything needed to render the inbox in a single query. No follow-up lookups per row.


Schema design#

Table: conversations

PK (partition key) = user_id
SK (sort key)      = last_message_timestamp

Attributes:
  conv_id
  last_message_preview
  unread_count

The partition key is user_id alone — not user_id + conv_id. If you combined both into the partition key, you'd need to know the conv_id upfront to query the row, which defeats the purpose. You'd be back to N+1.

With PK = user_id, a single query fetches every row for Alice:

GET all rows WHERE PK = alice
ORDER BY SK DESC
LIMIT 20

Returns Alice's 20 most recent conversations in one round trip.


Why the sort key choice matters#

The sort key determines how rows are physically ordered within Alice's partition. DynamoDB (and Cassandra) store rows sorted by SK on disk — so the database can return them in order without scanning and sorting at query time.

If SK is last_message_timestamp, the most recent conversation is always at the top. The inbox query is a simple range scan from the top of Alice's partition.

This is the right shape for the read path. But the SK choice has a significant cost on the write path — covered in the next file.

Interview framing

"PK is user_id — gives us all conversations for a user in one query. SK is last_message_timestamp — gives us sort order for free. Attributes store the denormalized preview data so we never need to join to the messages table."