Skip to content

Entities and Relationships#

Extracting entities#

Entities come from the nouns in your functional requirements. Every noun that needs to be stored, queried, or managed is an entity — it becomes a table (SQL), a collection (MongoDB), or a partition (Cassandra).

For Instagram:

Noun Entity Table
User Person using the app users
Post A photo with caption posts
Comment Text reply on a post comments
Like A user liking a post likes
Follow A user following another follows

Photo is not a separate entity — it's an attribute of Post (stored as a URL pointing to S3). Feed is not a stored entity — it's a derived view (cached in Redis, not stored in SQL).


Mapping relationships#

Verbs in requirements become relationships. There are three types:

1:1 — one entity maps to exactly one of another - User → Profile (if profile is separate from user) - Usually just embed as columns in the same table

1:Many — one entity maps to many of another - User → Posts (one user has many posts) - Implemented with a foreign key on the "many" side: posts.user_id

Many:Many — neither side maps to just one of the other - User → Like → Post: one user can like many posts, and one post can be liked by many users - User → Follow → User: one user can follow many users, and one user can be followed by many users - Implemented with a junction table containing both foreign keys


Junction tables#

A junction table sits between two entities in a many-to-many relationship. It contains the primary keys of both entities as a composite primary key.

likes (
    user_id  UUID NOT NULL,
    post_id  UUID NOT NULL,
    PRIMARY KEY (user_id, post_id)    composite PK enforces uniqueness
)

The composite PK does two things: - Prevents duplicates — a user can't like the same post twice - Acts as an index — "has user_123 liked post_456?" is an O(1) lookup


Self-referential tables#

When an entity has a relationship with itself — like users following other users — you use a self-referential junction table. Both foreign keys reference the same table:

follows (
    follower_id  UUID NOT NULL,   -- the person who follows
    following_id UUID NOT NULL,   -- the person being followed
    created_at   TIMESTAMP,

    PRIMARY KEY (follower_id, following_id),
    FOREIGN KEY (follower_id)  REFERENCES users(user_id),
    FOREIGN KEY (following_id) REFERENCES users(user_id)
)

Both columns point to users. The naming makes the direction of the relationship explicit — follower_id is doing the following, following_id is being followed.

Here's what the table looks like with real data:

follower_id  | following_id | created_at
-------------|--------------|------------
user_1       | user_2       | 2024-01-01    ← user_1 follows user_2
user_1       | user_3       | 2024-01-02    ← user_1 follows user_3
user_2       | user_1       | 2024-01-03    ← user_2 follows user_1
user_3       | user_1       | 2024-01-04    ← user_3 follows user_1

Notice that user_1 appears multiple times in follower_id. This is why user_id alone can never be the primary key — a PK must be unique per row, and no single column here is unique.

But the combination of (follower_id, following_id) is always unique — a user can only follow someone once. That composite PK also enforces this business rule for free:

-- Try to make user_1 follow user_2 again
INSERT INTO follows (follower_id, following_id) VALUES ('user_1', 'user_2');
-- ERROR: duplicate key violates primary key constraint ✗

No extra application code needed — the DB rejects the duplicate automatically.


Always think about both directions of a junction table#

Every junction table has two read directions. Both need to be fast.

For the follows table:

-- Direction 1: who does user_123 follow?
SELECT following_id FROM follows WHERE follower_id = '123';
-- → hits composite PK index (starts with follower_id) ✓

-- Direction 2: who follows user_123?
SELECT follower_id FROM follows WHERE following_id = '123';
-- → composite PK starts with follower_id, not following_id → full scan ✗

Direction 2 can't use the PK index. You need a separate index:

CREATE INDEX idx_follows_following ON follows(following_id);

Junction table rule

Whenever you create a junction table, immediately ask yourself: "do I need to query this in both directions?" If yes — you need an index for each direction. The composite PK covers one direction. Add a separate index for the other.