# grpc::UNAVAILABLE: No healthy upstream endpoints

- **ID:** `communication/grpc-unavailable-no-healthy-upstream`
- **Domain:** communication
- **Category:** network_error
- **Error Code:** `UNAVAILABLE`
- **Verification:** ai_generated
- **Fix Rate:** 82%

## Root Cause

The gRPC call fails with `UNAVAILABLE` because the load balancer or service registry in front of the target service reports zero healthy backends, so there is no endpoint to route the request to.

## Version Compatibility

| Component and version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| gRPC 1.48 | active | — | — |
| Envoy 1.26 | active | — | — |
| Kubernetes 1.28 | active | — | — |
| Istio 1.18 | active | — | — |

## Workarounds

1. **Verify backend health, then restart unhealthy pods.** (75% success)
   List the service's endpoints with `kubectl` (or run the equivalent service registry query) to confirm whether any backend is ready, then trigger a rolling restart of the deployment:
   ```
   kubectl get endpoints -n <namespace> <service-name>
   kubectl rollout restart deployment/<deployment-name> -n <namespace>
   ```
2. **Add retries with backoff to the gRPC client.** (70% success)
   In Go, a middleware such as `grpc_retry` can do this, here with `WithMax(3)` and a 100 ms wait between attempts:
   ```
   import (
       "time"

       grpc_retry "github.com/grpc-ecosystem/go-grpc-middleware/retry"
   )

   opts := []grpc_retry.CallOption{
       grpc_retry.WithMax(3),
       grpc_retry.WithBackoff(grpc_retry.BackoffLinear(100 * time.Millisecond)),
   }
   ```
3. **Increase the readiness probe threshold in the Kubernetes deployment spec.** (65% success)
   Set `periodSeconds: 10` and `failureThreshold: 5` under `readinessProbe` to allow slower-starting backends more time to become healthy:
   ```
   # Merge into the container's existing readinessProbe definition.
   readinessProbe:
     periodSeconds: 10
     failureThreshold: 5
   ```

## Dead Ends

- **Restart the client application to force a new connection** — Restarting the gRPC client does not fix the root cause of unhealthy backends; the client will re-encounter the same error until the backend pool recovers. (95% fail)
- **Disable TLS/SSL on the gRPC channel** — Disabling TLS removes encryption but does not address backend health; the error stems from upstream unavailability, not protocol negotiation. (85% fail)
- **Change the target port to 443 or another arbitrary number** — Changing to a random port bypasses the correct service endpoint, making the situation worse by connecting to a non-existent service. (90% fail)
