Performance Metrics: SDE-1 Interview Questions

These are foundational questions testing basic understanding of latency, throughput, bandwidth and percentiles. Every SDE candidate is expected to answer these confidently.

What is latency? Give me a one-line definition and a real-world example.

Answer

Latency is the total time from the moment a request leaves the client to when the response arrives back.

Real-world example: You click "Pay" on Amazon. The time between your click and the "Order Confirmed" page appearing = latency.

Note

Latency can be measured one-way (client → server) or round-trip. In practice, round-trip latency is what matters because the user waits for the full response.
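A minimal sketch of measuring round-trip latency from the caller's side is shown here. The names `measure_round_trip_ms` and `fake_request` are hypothetical, and the 20 ms sleep only stands in for a real network call.

```python
import time


def measure_round_trip_ms(call):
    """Return (result, elapsed milliseconds) for one invocation of `call`."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = call()              # request goes out, response comes back
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_request():
    """Hypothetical stand-in for a network call: sleeps for about 20 ms."""
    time.sleep(0.02)
    return "Order Confirmed"


if __name__ == "__main__":
    response, latency_ms = measure_round_trip_ms(fake_request)
    print(f"{response!r} after {latency_ms:.1f} ms")
```

Timing on the client captures everything the user waits for: network travel in both directions plus server processing.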

What is the difference between throughput and bandwidth?

Answer

Metric	Definition	Unit
Throughput	Number of requests completed per second	RPS / QPS
Bandwidth	Maximum amount of data the network link can carry per second	Mbps / Gbps

Example:

A server handling 10,000 API requests/second = throughput
A video stream consuming 25 Mbps of network capacity = bandwidth

Rule of thumb

Bandwidth matters when data size is large (video, files). Throughput matters when request volume is large (APIs, messaging).
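The two metrics can be computed side by side from the same traffic sample. This is a rough sketch: the 2,000-byte response size and the 60-second window are assumptions chosen for illustration, and the function names are hypothetical.

```python
def throughput_rps(requests_completed, seconds):
    """Requests completed per second."""
    return requests_completed / seconds


def data_rate_mbps(bytes_transferred, seconds):
    """Data moved per second, in megabits (1 Mbps = 1,000,000 bits/s)."""
    return bytes_transferred * 8 / seconds / 1_000_000


if __name__ == "__main__":
    window_s = 60
    requests = 600_000       # 10,000 requests/second over one minute
    response_bytes = 2_000   # assumed average response size

    print(throughput_rps(requests, window_s))                   # 10000.0
    print(data_rate_mbps(requests * response_bytes, window_s))  # 160.0
```

Under these assumptions, 10,000 small API responses per second need only 160 Mbps of network capacity, which is why request volume, not data size, is usually the limit for APIs.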

What is P99 latency and why do we care about it more than average latency?

Answer

P99 = 99% of requests complete within this time. The remaining 1% take longer.

Why averages lie:

9 requests × 10ms = 90ms
1 request  × 500ms = 500ms
──────────────────────────
Average = 59ms  ← looks fine
P99     = 500ms ← the slow request the average hides

At scale: 10M requests/day → the slowest 1% = 100,000 requests every day that take longer than P99. The average hides every one of them.

Averages hide outliers. Percentiles expose them.
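The arithmetic above can be checked with a few lines of Python. This sketch uses the nearest-rank definition of a percentile, which is one common convention; monitoring tools may interpolate and report slightly different values.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


latencies_ms = [10] * 9 + [500]  # the ten requests from the example

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(mean_ms)                       # 59.0
print(percentile(latencies_ms, 50))  # 10
print(percentile(latencies_ms, 99))  # 500
```

The median shows what a typical request experiences, and P99 shows the slow tail. The mean sits between the two and describes neither.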

Which percentile would you use for a payment system vs a social media feed? Why?

Answer

System	Percentile	Reason
Payment	P99.9	Money involved — even 1 in 1000 slow transactions is unacceptable at scale
Social feed	P95	Occasional slow feed refresh is annoying but not damaging

The rule: The more money or trust involved, the higher the percentile you target.
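To make the choice concrete, it helps to count how many requests fall outside each target. The sketch below assumes 10M requests/day purely for illustration; the function name `requests_beyond_percentile` is hypothetical.

```python
from fractions import Fraction


def requests_beyond_percentile(daily_requests, percentile):
    """Expected daily requests slower than the latency at the given percentile."""
    # Fraction avoids float rounding in expressions such as 1 - 99.9 / 100.
    tail_fraction = 1 - Fraction(str(percentile)) / 100
    return int(daily_requests * tail_fraction)


if __name__ == "__main__":
    daily = 10_000_000  # assumed traffic volume
    for target in (95, 99, 99.9):
        print(f"P{target}: {requests_beyond_percentile(daily, target):,} requests/day are slower")
```

With this volume, a P95 target leaves 500,000 requests per day unmeasured, P99 leaves 100,000, and P99.9 leaves 10,000. Each step up is harder to meet but covers more of the tail.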

Interview framing

"For payments I'd target P99.9 — at our scale even 0.1% of failed or slow transactions represents thousands of users losing trust. For a social feed P95 is the right balance between user experience and engineering cost."

What is the latency vs throughput tradeoff? Give a simple example of when optimizing one hurts the other.

Answer

Optimizing for throughput often increases latency for individual requests, and vice versa.

Classic example — batching:

Without batching:
  Each DB write executes immediately
  Latency: 5ms per request ✓
  Throughput: limited by individual write overhead

With batching:
  100 writes collected → executed together
  Latency: each request waits for the batch to fill → e.g. 50-100ms ✗
  Throughput: e.g. 10x improvement, because the per-write overhead is shared across the batch ✓

Batching dramatically improves throughput (more work per second) but each individual user waits longer.
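The tradeoff can be reproduced with a simple cost model. Everything here is an assumption made for illustration: a fixed 4.6 ms overhead per commit, 0.4 ms per row, one request arriving every 0.5 ms, and commits that run one at a time. `WriteCost` and `batch_stats` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteCost:
    fixed_ms: float    # overhead paid once per commit (assumed)
    per_row_ms: float  # incremental cost of each row (assumed)


def batch_stats(cost, batch_size, arrival_gap_ms):
    """Return (throughput_per_s, best_latency_ms, worst_latency_ms) for one batch size."""
    commit_ms = cost.fixed_ms + cost.per_row_ms * batch_size
    throughput = batch_size / commit_ms * 1000.0
    # The last request to join a batch waits only for the commit.
    # The first request also waits for the rest of the batch to arrive.
    fill_wait_ms = (batch_size - 1) * arrival_gap_ms
    return throughput, commit_ms, commit_ms + fill_wait_ms


if __name__ == "__main__":
    cost = WriteCost(fixed_ms=4.6, per_row_ms=0.4)
    for size in (1, 100):
        rps, best, worst = batch_stats(cost, size, arrival_gap_ms=0.5)
        print(f"batch={size:>3}  throughput={rps:7.0f}/s  latency={best:.1f}-{worst:.1f} ms")
```

In this model a batch of 100 pays the fixed overhead once instead of 100 times, so throughput rises roughly elevenfold, while per-request latency grows from 5 ms to somewhere between about 45 and 94 ms.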

Other examples

Compression: reduces bandwidth usage (good) but adds CPU time per request (can increase latency)
Connection pooling: improves throughput by reusing connections, but requests can queue for a free connection under load