Performance Metrics: SDE-1 Interview Questions
These are foundational questions testing basic understanding of latency, throughput, bandwidth and percentiles. Every SDE candidate is expected to answer these confidently.
What is latency? Give me a one-line definition and a real-world example.
Answer
Latency is the total time from the moment a request leaves the client to when the response arrives back.
Real-world example: You click "Pay" on Amazon. The time between your click and the "Order Confirmed" page appearing = latency.
Note
Latency can be measured one-way (client → server) or round-trip. In practice, round-trip latency is what matters because the user waits for the full response.
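As a minimal sketch of measuring round-trip latency from the caller's side, the snippet below times a single call from start to finish. The function `fake_payment_request` is a hypothetical stand-in for a real network request (it only sleeps for about 20 ms), and `measure_latency_ms` is an illustrative helper, not part of any library.

```python
import time


def measure_latency_ms(operation):
    """Run `operation` once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = operation()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_payment_request():
    # Hypothetical stand-in for a network round trip; sleeps about 20 ms.
    time.sleep(0.02)
    return "Order Confirmed"


result, latency_ms = measure_latency_ms(fake_payment_request)
print(f"{result} after {latency_ms:.1f} ms")
```

The timer wraps the whole call, so it captures what the user actually waits for: request, processing, and response together.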
What is the difference between throughput and bandwidth?
Answer
| Metric | Definition | Unit |
|---|---|---|
| Throughput | Number of requests completed per second | RPS / QPS |
| Bandwidth | Amount of data the network link can transfer per second (its capacity) | Mbps / Gbps |
Example:
- A server handling 10,000 API requests/second = throughput
- A video stream consuming 25 Mbps of network capacity = bandwidth
Rule of thumb
Bandwidth matters when data size is large (video, files). Throughput matters when request volume is large (APIs, messaging).
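The two metrics are linked by payload size, which a quick calculation can make concrete. The sketch below estimates how much bandwidth a given request rate consumes. The 2,000-byte response size is an assumed figure for illustration, and `bandwidth_mbps` is a hypothetical helper name.

```python
def bandwidth_mbps(requests_per_second: int, bytes_per_response: int) -> float:
    """Bandwidth consumed (in megabits per second) by a given request rate."""
    bits_per_second = requests_per_second * bytes_per_response * 8
    return bits_per_second / 1_000_000


# Assumed payload: 10,000 small API responses per second at 2,000 bytes each.
api_mbps = bandwidth_mbps(10_000, 2_000)
print(f"API traffic: {api_mbps} Mbps")  # 160.0 Mbps

# How many 25 Mbps video streams would use the same bandwidth?
print(f"Equivalent video streams: {api_mbps / 25}")  # 6.4
```

Under these assumptions, a high-throughput API with small payloads uses about as much bandwidth as a handful of video streams, which is why the two metrics are tracked separately.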
What is P99 latency and why do we care about it more than average latency?
Answer
P99 = 99% of requests complete within this time. The remaining 1% take longer.
Why averages lie:
9 requests × 10ms = 90ms
1 request × 500ms = 500ms
──────────────────────────
Average = 59ms ← looks fine
P99 = 500ms ← the slow request (1 in 10 in this sample) that the average hides
At scale: 10M requests/day → 1% = 100,000 slow requests every day. The average hides every one of them.
Averages hide outliers. Percentiles expose them.
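The numbers in the example above can be reproduced with a short sketch. It uses the nearest-rank definition of a percentile, which is one of several common definitions (monitoring tools may interpolate instead). The helper name `percentile` is illustrative.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# 9 fast requests at 10 ms and 1 slow request at 500 ms.
latencies_ms = [10] * 9 + [500]

average = sum(latencies_ms) / len(latencies_ms)
print(f"Average: {average} ms")                    # 59.0 ms
print(f"P50: {percentile(latencies_ms, 50)} ms")   # 10 ms
print(f"P99: {percentile(latencies_ms, 99)} ms")   # 500 ms
```

The median shows what a typical request sees, the P99 shows the tail, and the average matches neither.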
Which percentile would you use for a payment system vs a social media feed? Why?
Answer
| System | Percentile | Reason |
|---|---|---|
| Payment | P99.9 | Money involved — even 1 in 1000 slow transactions is unacceptable at scale |
| Social feed | P95 | Occasional slow feed refresh is annoying but not damaging |
The rule: The more money or trust involved, the higher the percentile you target.
Interview framing
"For payments I'd target P99.9 — at our scale even 0.1% of failed or slow transactions represents thousands of users losing trust. For a social feed P95 is the right balance between user experience and engineering cost."
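To put numbers behind the choice of percentile, the sketch below counts how many requests per day fall outside a given target. The 10M requests/day volume is reused from the earlier example. `requests_beyond_percentile` is a hypothetical helper, and the percentile is passed as a string so that `Fraction` can do exact arithmetic.

```python
from fractions import Fraction


def requests_beyond_percentile(daily_requests: int, percentile: str) -> int:
    """Requests per day that are NOT covered by the given percentile target."""
    tail_fraction = (100 - Fraction(percentile)) / 100  # exact, no float rounding
    return int(daily_requests * tail_fraction)


daily_requests = 10_000_000  # assumed daily volume

for target in ("95", "99", "99.9"):
    uncovered = requests_beyond_percentile(daily_requests, target)
    print(f"P{target}: {uncovered:,} requests/day outside the target")
# P95: 500,000   P99: 100,000   P99.9: 10,000
```

Moving from P95 to P99.9 shrinks the unmonitored tail from 500,000 to 10,000 requests per day at this volume, which is the cost-versus-trust tradeoff the table describes.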
What is the latency vs throughput tradeoff? Give a simple example of when optimizing one hurts the other.
Answer
Optimizing for throughput often increases latency for individual requests, and vice versa.
Classic example — batching:
Without batching:
Each DB write executes immediately
Latency: 5ms per request ✓
Throughput: limited by individual write overhead
With batching:
100 writes collected → executed together
Latency: each request waits for batch to fill → 50-100ms ✗
Throughput: 10x improvement ✓
Batching dramatically improves throughput (more work per second) but each individual user waits longer.
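The batching tradeoff can be sketched with a simple cost model. Everything here is an assumption chosen to roughly match the figures above: each database call pays a fixed overhead of 4.5 ms plus 0.5 ms per write, writes arrive every 0.5 ms, and a single worker executes calls one at a time. The names `WriteCost`, `throughput_per_second`, and `worst_case_latency_ms` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class WriteCost:
    fixed_overhead_ms: float  # paid once per DB call (round trip, commit)
    per_item_ms: float        # paid for every write inside the call


def throughput_per_second(cost: WriteCost, batch_size: int) -> float:
    """Writes completed per second by one worker running batches back to back."""
    batch_time_ms = cost.fixed_overhead_ms + cost.per_item_ms * batch_size
    return batch_size * 1000.0 / batch_time_ms


def worst_case_latency_ms(cost: WriteCost, batch_size: int, arrival_gap_ms: float) -> float:
    """Latency of the first write in a batch: wait for the batch to fill, then execute."""
    fill_wait_ms = (batch_size - 1) * arrival_gap_ms
    execute_ms = cost.fixed_overhead_ms + cost.per_item_ms * batch_size
    return fill_wait_ms + execute_ms


cost = WriteCost(fixed_overhead_ms=4.5, per_item_ms=0.5)  # assumed numbers

for batch_size in (1, 100):
    tput = throughput_per_second(cost, batch_size)
    latency = worst_case_latency_ms(cost, batch_size, arrival_gap_ms=0.5)
    print(f"batch={batch_size:>3}: {tput:7.1f} writes/s, worst-case latency {latency:.1f} ms")
```

In this model the fixed overhead is amortized across the batch, so throughput rises by roughly 9x, while the first write in each batch waits about 100 ms instead of 5 ms.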
Other examples
- Compression: reduces bandwidth usage (good) but adds CPU time (increases latency)
- Connection pooling: improves throughput by reusing connections, but requests queue for a free connection when the pool is exhausted under load
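The compression tradeoff from the list above can be observed directly with Python's standard `zlib` module. This is a rough sketch: the repeated JSON payload is made up for illustration and compresses unusually well, and the timing is wall-clock time on whatever machine runs it, so no specific numbers are claimed.

```python
import time
import zlib


def compress_with_timing(payload: bytes, level: int = 6):
    """Compress `payload` and return (compressed bytes, elapsed milliseconds)."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return compressed, elapsed_ms


# Hypothetical, highly repetitive API response body.
payload = b'{"user_id": 12345, "status": "ok"}' * 10_000

compressed, elapsed_ms = compress_with_timing(payload)
print(f"Original:   {len(payload):,} bytes")
print(f"Compressed: {len(compressed):,} bytes")
print(f"Time spent compressing: {elapsed_ms:.2f} ms")
```

Fewer bytes on the wire saves bandwidth, while the time spent compressing (and later decompressing) is added to each request's latency.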