Series vs Parallel Availability#
You know the availability of each component. But what's the availability of the whole system?
It depends on whether components are in series or parallel — and the math is completely different.
Components in Series#
Series means a request must go through all components to succeed. If any one fails, the whole request fails.
A typical request flow:
All three must be working. If the database goes down, it doesn't matter that the load balancer and app server are fine — the request fails.
Why you multiply#
For the request to succeed, the load balancer must work AND the app server must work AND the database must work.
AND conditions = multiply probabilities. This is basic probability.
Why not add? Adding 99.9% + 99.9% + 99.9% = 299.7% — over 100%, which makes no sense for a probability.
Overall Availability = A × B × C
Load Balancer: 99.9% → 0.999
App Server: 99.9% → 0.999
Database: 99.9% → 0.999
Overall = 0.999 × 0.999 × 0.999 = 99.7%
Each component added in series makes the system less available
Three components each at 99.9% — combined they give only 99.7%. The chain is only as strong as its weakest link.
Components in Parallel#
Parallel means multiple copies of the same component. The request only needs one of them to succeed.
Example — two app servers. If one dies, the other handles the request.
Why you can't just add#
Adding 99.9% + 99.9% = 199.8% — again over 100%. Adding probabilities only works for mutually exclusive events.
The correct approach — calculate from the failure side#
The only way a parallel system fails is if every single copy fails simultaneously.
Chance Server A fails = 0.001
Chance Server B fails = 0.001
Chance BOTH fail simultaneously = 0.001 × 0.001 = 0.000001
Overall Availability = 1 - 0.000001 = 99.9999%
The 1 - means: "everything except the case where all copies fail."
Two servers each at 99.9% — combined in parallel they give 99.9999%.
This is the mathematical proof of why redundancy works
Adding a second server takes you from 99.9% to 99.9999%. That's the difference between 43 minutes and 26 seconds of downtime per month.
The Real World Nuance — Theoretical vs Observed Availability#
The calculations above are theoretical — they assume the load balancer detects failure instantly and routes perfectly.
In reality, availability is slightly lower because of:
- Detection delay — load balancer health checks run every 5-30 seconds. In that window, some requests get routed to a dead server
- In-flight failures — a server can die mid-request, after already receiving it. That specific request fails even though the load balancer didn't know yet
- Retry overhead — the client retries, the retry succeeds, but the user still experienced a failure
This is why you measure availability from the user's perspective — not from component uptime
"Our servers were up 99.99% of the time" — theoretical, measured from component health
"99.98% of requests succeeded" — observed, what users actually experienced
The SLI (request success rate) is what actually matters in production. The calculations are useful for architecture planning and comparing designs — not for production monitoring.
Combining Both — A Real System#
Most real systems have series AND parallel components together:
Users → Load Balancer (×2, parallel) → App Servers (×3, parallel) → Database (primary + replica, parallel)
Calculate step by step: 1. Two load balancers in parallel → very high availability 2. Three app servers in parallel → very high availability 3. Primary + replica DB in parallel → very high availability 4. All three tiers in series → multiply them together
In an interview
When asked about availability, identify which components are in series and which are parallel. Then explain: "each tier in series reduces overall availability, which is why we add redundancy within each tier — the parallel redundancy compensates for the series penalty."