RESOURCE_EXHAUSTED api resource_error ai_generated partial

gRPC error: RESOURCE_EXHAUSTED: rate limit exceeded

ID: api/grpc-resource-exhausted-rate-limit

Fix Rate: 82%

Confidence: 88%

Evidence: 1

First Seen: 2024-06-10

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
gRPC 1.62.0	active	—	—	—
Envoy 1.29.0	active	—	—	—
Istio 1.20.0	active	—	—	—
gRPC-Go 1.62.0	active	—	—	—

Root Cause

The gRPC server rejected the request because the client exceeded the configured rate limit, often due to aggressive retries or high concurrency.

generic

Official Documentation

https://grpc.github.io/grpc/core/md_doc_statuscodes.html

Workarounds

85% success Implement exponential backoff with jitter when retrying RPCs that fail with RESOURCE_EXHAUSTED. In gRPC-Go, note that `grpc.WithBackoffMaxDelay(5 * time.Second)` is deprecated in favor of `grpc.WithConnectParams` and only controls reconnection backoff, not RPC retries; rate-limited calls need a retry loop in the client or a retry policy in the service config.
80% success Reduce concurrency by capping the number of in-flight gRPC calls or streams with a semaphore or a bounded worker pool. In Go, a buffered channel works as a simple semaphore.
70% success Contact the API provider to request a higher rate limit, or check whether the server sends a suggested wait time with the error. gRPC does not define a standard Retry-After header, so the hint, if present, is provider-specific: it may appear as a key in the trailing metadata or as a `google.rpc.RetryInfo` detail attached to the status.

Dead Ends

Common approaches that don't work:

80% fail
Retrying immediately without exponential backoff: the retries go out at once, consume more quota, and can cause cascading failures.
90% fail
Removing the rate limits on the server: this exposes the server to abuse and performance degradation.
60% fail
Switching from gRPC to a REST endpoint: rate limits are often enforced at the service level regardless of protocol, and REST endpoints may have similar or stricter limits.