Interview Cheatsheet

The cost of a fresh connection#

TCP handshake:    ~3ms   (3 round trips: SYN → SYN-ACK → ACK)
TLS handshake:    ~2ms   (agree on encryption algorithm + session key)
DB auth:          ~1ms   (username/password check, permissions lookup)
Memory (Postgres): ~8MB  (dedicated OS process per connection)
─────────────────────────────────────────────────────────────────
Total per connection: 6-10ms + 5-10MB RAM

Why is opening a DB connection on every request a problem at scale?

Success

Each connection requires TCP handshake (~3ms), TLS negotiation (~2ms), DB auth (~1ms), and Postgres allocates ~8MB RAM for a dedicated process. At 10,000 RPS, that's 80GB RAM just for connection overhead, plus constant context-switching between thousands of OS processes — the DB never gets to run actual queries.

Interview framing

"Raw connections are expensive — TCP, TLS, auth, and Postgres spawns a full OS process per connection consuming ~8MB RAM. At 10k RPS that's 80GB RAM just for overhead. A connection pool opens a fixed set of connections at startup and reuses them — the setup cost is paid once, not per request."

How do you decide how many connections to put in the pool?

Success

Pool size ≈ DB CPU cores × 2. A DB with 4 cores can run 4 queries in parallel. Beyond ~8-10 connections, extra processes just context-switch against each other — throughput drops. The ×2 accounts for queries waiting on disk I/O, freeing up a core for another query.

More connections is NOT always better. At 1,000 connections on a 4-core DB, the OS spends more time context-switching between 1,000 processes than actually running queries.

Interview framing

"Pool size ≈ DB CPU cores × 2. Beyond that, extra connections add context-switching overhead without adding throughput — you have 4 cores regardless of how many processes are waiting."

Too small vs too large#

Too small → requests queue → response times spike → timeouts → user errors
Too large → context-switching overhead → DB throughput drops → same result
Sweet spot → CPU cores × 2 per app server

Tools#

PgBouncer   → Postgres, infrastructure-level proxy pooler
HikariCP    → Java apps, library-level pooler
RDS Proxy   → AWS managed, sits in front of RDS/Aurora

When to mention in interviews#

"How do you handle 10,000 concurrent users hitting the DB?"
Any time DB becomes a bottleneck under high concurrency
Alongside: read replicas (offload reads), caching (avoid DB entirely for hot data)

Quick flow#

sequenceDiagram
    participant Request
    participant Pool
    participant DB
    Note over Pool,DB: At startup — connections opened once
    Request->>Pool: I need a connection
    Pool->>Request: Here's conn #4 (already open, warm)
    Request->>DB: SELECT ...
    DB->>Request: Results
    Request->>Pool: Done, returning conn #4
    Note over Pool: conn #4 ready for next request