# 504 Gateway Timeout: upstream connection pool exhausted

- **ID:** `api/http-504-gateway-timeout-upstream-connection-pool-exhausted`
- **Domain:** api
- **Category:** resource_error
- **Verification:** ai_generated
- **Fix Rate:** 85%

## Root Cause

Every connection in the API gateway's pool to the upstream service is in use, so new requests queue and eventually time out with a 504. The usual causes are slow upstream responses, which hold each connection longer, or a pool that is too small for the traffic.
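
As a rough sizing aid, Little's law relates the number of busy upstream connections to request rate and upstream latency. The sketch below is a back-of-the-envelope estimate, not a description of how any particular gateway allocates connections; the function name, the `headroom` multiplier, and the sample numbers are illustrative assumptions.

```python
import math


def required_connections(requests_per_second: float,
                         upstream_latency_s: float,
                         headroom: float = 1.5) -> int:
    """Estimate concurrent upstream connections via Little's law.

    In-flight requests ~= arrival rate * time each request holds a connection.
    `headroom` is an assumed safety multiplier for bursts, not a gateway setting.
    """
    if requests_per_second < 0 or upstream_latency_s < 0 or headroom < 1:
        raise ValueError("rate and latency must be >= 0 and headroom >= 1")
    in_flight = requests_per_second * upstream_latency_s
    return math.ceil(in_flight * headroom)


# 200 req/s at 250 ms each keeps about 50 connections busy.
print(required_connections(200, 0.25))  # 75 with 1.5x headroom
# If upstream latency degrades to 2 s, the same traffic needs 8x as many.
print(required_connections(200, 2.0))   # 600
```

The second call shows why a pool that is adequate under normal latency can be exhausted as soon as the upstream slows down, even when traffic is unchanged.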

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Nginx 1.26+ | active | — | — |
| Envoy 1.30+ | active | — | — |
| AWS ALB (2024) | active | — | — |

## Workarounds

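1. **Increase the number of upstream connections kept open for reuse.** In Nginx, `keepalive` in the `upstream` block sets how many idle upstream connections each worker process keeps cached. The directive has no default value (upstream keepalive is off until it is set), and it does not cap the total number of upstream connections. For HTTP upstreams, the Nginx documentation also calls for `proxy_http_version 1.1;` and `proxy_set_header Connection "";` in the proxying location so that connections are actually reused. (90% success)
   ```
   upstream backend {
       server 10.0.1.5:8080;
       keepalive 100;  # idle connections cached per worker process
   }
   ```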
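2. **Reduce upstream response time by adding caching, reducing database queries, or scaling out upstream instances.** Track upstream latency by logging `$upstream_response_time` in the access log. The third-party `nginx_upstream_check_module` performs active health checks and can help detect failing servers, but it does not measure latency. (85% success)
   ```
   # Example format name and log path; adjust to the local setup.
   log_format upstream_time '$remote_addr "$request" $status '
                            'rt=$request_time urt=$upstream_response_time';
   access_log /var/log/nginx/access.log upstream_time;
   ```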
3. **Limit concurrent connections per client IP, or apply rate limiting at the gateway, so that a single client cannot exhaust the pool.** (80% success)
   ```
   # http context: shared-memory zone keyed by client address
   limit_conn_zone $binary_remote_addr zone=addr:10m;
   # http, server, or location context: at most 10 concurrent connections per address
   limit_conn addr 10;
   ```
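
To judge whether slow upstream responses are what is holding connections, it can help to summarize upstream latency from the gateway's access log. The following is a minimal sketch that assumes each log line carries Nginx's `$upstream_response_time` in a field written as `urt=...`; that field name, the sample lines, and the helper names are hypothetical and should be adapted to the actual log format.

```python
import math
import re

# Assumed log field: urt=<$upstream_response_time>, e.g. urt=0.245
# Nginx writes "-" when no upstream was contacted and separates the times
# of several upstream attempts with commas or colons.
URT_FIELD = re.compile(r"\burt=((?:[\d.]+|-)(?:\s*[,:]\s*(?:[\d.]+|-))*)")


def upstream_times(log_lines):
    """Yield the total upstream time in seconds for each line that has one."""
    for line in log_lines:
        match = URT_FIELD.search(line)
        if not match:
            continue
        parts = re.split(r"\s*[,:]\s*", match.group(1))
        values = [float(p) for p in parts if p != "-"]
        if values:
            yield sum(values)


def percentile(values, pct):
    """Nearest-rank percentile for pct in (0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


sample = [
    '10.0.0.7 "GET /orders" 200 rt=0.051 urt=0.050',
    '10.0.0.8 "GET /orders" 200 rt=0.121 urt=0.120',
    '10.0.0.9 "GET /orders" 504 rt=60.001 urt=60.000',
    '10.0.0.9 "GET /health" 200 rt=0.000 urt=-',
    '10.0.0.7 "GET /orders" 200 rt=0.262 urt=0.010, 0.250',
]
times = list(upstream_times(sample))
print(len(times), percentile(times, 50), percentile(times, 95))  # 4 0.12 60.0
```

A high percentile that sits close to the gateway's timeout while the median stays low suggests a slow tail in the upstream rather than an undersized pool.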

## Dead Ends

- **Restarting or reloading the gateway to clear the pool**: the pool quickly exhausts again if the upstream is slow or the pool is too small, so this is only a temporary fix. (90% fail)
- **Raising gateway timeouts**: longer timeouts only delay the error; the pool remains exhausted and requests still queue. (80% fail)
- **Troubleshooting network connectivity**: the error is about connection pool capacity, not network connectivity. (70% fail)
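
The point that timeouts only delay the error can be illustrated with a simplified queueing estimate. This sketch assumes a fluid model in which the pool serves at most `pool_size / upstream_latency_s` requests per second and excess arrivals wait in a queue; the function name and the numbers are illustrative, not measurements from any gateway.

```python
def seconds_until_timeouts(arrival_rps, pool_size, upstream_latency_s, timeout_s):
    """Return how long overload can last before queued requests exceed timeout_s.

    Returns None when the pool keeps up with the arrival rate.
    """
    if upstream_latency_s <= 0 or pool_size <= 0:
        raise ValueError("pool_size and upstream_latency_s must be positive")
    capacity_rps = pool_size / upstream_latency_s
    excess_rps = arrival_rps - capacity_rps
    if excess_rps <= 0:
        return None
    # After t seconds the backlog is excess_rps * t, and a new arrival waits
    # backlog / capacity_rps. Solve wait == timeout_s for t.
    return timeout_s * capacity_rps / excess_rps


# 150 req/s against a pool of 100 connections with 1 s upstream latency.
print(seconds_until_timeouts(150, 100, 1.0, 60))   # 120.0
print(seconds_until_timeouts(150, 100, 1.0, 120))  # 240.0
print(seconds_until_timeouts(150, 200, 1.0, 60))   # None
```

Under these assumptions, doubling the timeout only doubles the time before 504s appear, whereas enlarging the pool (or shortening upstream latency) removes the backlog entirely.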
