RESOURCE_EXHAUSTED api resource_error ai_generated partial

gRPC error: RESOURCE_EXHAUSTED: rate limit exceeded

ID: api/grpc-resource-exhausted-rate-limit

Fix Rate: 82%
Confidence: 88%
Evidence: 1
First Seen: 2024-06-10

Version Compatibility

Version          Status   Introduced   Deprecated   Notes
gRPC 1.62.0      active
Envoy 1.29.0     active
Istio 1.20.0     active
gRPC-Go 1.62.0   active

Root Cause

The gRPC server rejected the request because the client exceeded the configured rate limit, often due to aggressive retries or high concurrency.

Official Documentation

https://grpc.github.io/grpc/core/md_doc_statuscodes.html

Workarounds

  1. 85% success Implement exponential backoff with jitter for retried RPCs in the gRPC client. In Go, note that `grpc.WithBackoffMaxDelay` is deprecated and only tunes reconnection backoff (now configured through `grpc.WithConnectParams`), not per-RPC retries. Retry the call itself, either with a `retryPolicy` in the service config that lists `RESOURCE_EXHAUSTED` among its `retryableStatusCodes`, or with an explicit retry loop around the call.
  2. 80% success Reduce concurrency by limiting the number of in-flight gRPC calls and streams with a semaphore or a fixed-size worker pool. In Go, a buffered channel works as a simple counting semaphore.
  3. 70% success Ask the API provider for a higher rate limit, and check whether the server returns a retry hint. `Retry-After` is an HTTP header rather than a standard gRPC one; servers that provide a hint usually do so through the `grpc-retry-pushback-ms` trailing metadata or a `google.rpc.RetryInfo` error detail.

Dead Ends

Common approaches that don't work:

  1. 80% fail Retrying immediately without backoff. Retries sent back-to-back consume more quota and can cause cascading failures.

  2. 90% fail Removing or disabling the rate limit on the server. This exposes the server to abuse and performance degradation.

  3. 60% fail Switching from gRPC to a REST endpoint to escape the limit. Rate limits are often enforced at the service level regardless of protocol, and REST endpoints may have similar or stricter limits.