UNAVAILABLE communication network_error ai_generated partial

grpc::UNAVAILABLE: No healthy upstream endpoints

ID: communication/grpc-unavailable-no-healthy-upstream

Fix rate: 82%
Confidence: 88%
Evidence count: 1
First seen: 2023-06-15

Version compatibility

Version            Status    Introduced    Deprecated    Notes
gRPC 1.48          active
Envoy 1.26         active
Kubernetes 1.28    active
Istio 1.18         active

Root cause analysis

gRPC client fails to connect because the load balancer or service registry reports zero healthy backends for the target service.

generic

Official documentation

https://grpc.io/docs/guides/error-handling/

Solutions

  1. Verify backend health via `kubectl get endpoints -n <namespace> <service-name>` or equivalent service registry query. Then restart unhealthy pods: `kubectl rollout restart deployment/<deployment-name> -n <namespace>`.
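  2. Add retries with backoff in the gRPC client using an interceptor such as `grpc_retry` in Go: `import "github.com/grpc-ecosystem/go-grpc-middleware/retry"; opts := []grpc_retry.CallOption{grpc_retry.WithMax(3), grpc_retry.WithBackoff(grpc_retry.BackoffLinear(100 * time.Millisecond))}`, then pass `grpc_retry.UnaryClientInterceptor(opts...)` to `grpc.WithUnaryInterceptor` when creating the connection. Retries only bridge short gaps; they do not help while no backend is healthy.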
  3. Relax the readiness probe in the Kubernetes deployment spec, for example `readinessProbe.periodSeconds: 10` and `readinessProbe.failureThreshold: 5`, so that slower backends tolerate more failed checks before being removed from the endpoint list.

Ineffective attempts

Common but ineffective approaches:

  1. Restart the client application to force a new connection (95% failure rate)

    Restarting the gRPC client does not fix the root cause of unhealthy backends; the client will re-encounter the same error until the backend pool recovers.

  2. Disable TLS/SSL on the gRPC channel (85% failure rate)

    Disabling TLS removes encryption but does not address backend health; the error stems from upstream unavailability, not protocol negotiation.

  3. Change the target port to 443 or another arbitrary number (90% failure rate)

    Pointing the client at an arbitrary port bypasses the correct service endpoint and makes things worse: the client now targets a port where the service is not listening.