networking protocol_error ai_generated partial

HTTP/2: GOAWAY frame received from server with error code NO_ERROR, but connection closed before graceful shutdown completed

ID: networking/http2-graceful-shutdown-timeout

Fix Rate: 80%
Confidence: 85%
Evidence: 1
First Seen: 2024-04-05

Version Compatibility

Version                      Status   Introduced   Deprecated   Notes
nginx 1.25+                  active
Apache HTTP Server 2.4.58+   active
gRPC 1.60+                   active

Root Cause

An HTTP/2 graceful shutdown timeout occurs when the client receives a GOAWAY frame from the server but fails to complete pending streams within the allowed time, resulting in abrupt connection termination.
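The GOAWAY frame carries a last stream ID that tells the client which streams the server may still process. The sketch below illustrates that rule; the `GoAway` class and `classify_streams` helper are hypothetical names used only for illustration and do not belong to any HTTP/2 library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GoAway:
    """Fields of a GOAWAY frame that matter for shutdown handling."""
    last_stream_id: int
    error_code: int  # 0x0 is NO_ERROR


def classify_streams(open_stream_ids, goaway):
    """Split open client-initiated streams by what GOAWAY promises.

    Streams with an ID above last_stream_id were not processed by the
    server, so they are safe to retry on a new connection. Streams at or
    below it may still complete on the current connection.
    """
    in_flight = sorted(s for s in open_stream_ids if s <= goaway.last_stream_id)
    retryable = sorted(s for s in open_stream_ids if s > goaway.last_stream_id)
    return in_flight, retryable


in_flight, retryable = classify_streams(
    {1, 3, 5, 7, 9}, GoAway(last_stream_id=5, error_code=0)
)
print(in_flight, retryable)  # [1, 3, 5] [7, 9]
```

The error in this entry concerns the first group: streams the server had accepted but that were cut off when the connection closed before they finished.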

generic

Official Documentation

https://httpwg.org/specs/rfc7540.html#GOAWAY

Workarounds

  1. 85% success Give in-flight streams time to finish after GOAWAY. The original suggestion is to add 'http2_recv_timeout 30s;' to the nginx server block, but that directive has been obsolete since nginx 1.19.7 (replaced by 'client_header_timeout'). On nginx 1.25+, the related bound on graceful shutdown is 'worker_shutdown_timeout' in the main context; if it is set, make sure it is long enough (for example 30s) for pending streams to complete.
  2. 80% success Implement retry logic with backoff in the client for requests that fail after GOAWAY, limited to idempotent requests or streams above the last stream ID in the GOAWAY frame. The original suggestion names 'WithWaitForHandshake' for gRPC; in grpc-go that dial option was deprecated once waiting for the handshake became the default behavior, so on gRPC 1.60+ rely on a retry policy with backoff instead.
  3. 75% success Reduce the number of concurrent streams per connection to limit pending work during shutdown: in nginx, set 'http2_max_concurrent_streams 64;' (the default is 128).
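The retry-with-backoff workaround can be sketched as follows. This is a minimal illustration, not any library's retry API: `ConnectionClosedError`, `backoff_delays`, and `call_with_retry` are hypothetical names, and the base delay, cap, and attempt count are assumed values. It should only wrap requests that are safe to repeat.

```python
import random
import time


class ConnectionClosedError(Exception):
    """Hypothetical error raised when a request dies with its connection."""


def backoff_delays(retries, base=0.1, cap=5.0):
    """Return the un-jittered delay in seconds before each retry."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def call_with_retry(send, attempts=4, base=0.1, cap=5.0, sleep=time.sleep):
    """Call send(), retrying on ConnectionClosedError with jittered backoff."""
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return send()
        except ConnectionClosedError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the failure
            # Full jitter: sleep a random amount up to the capped delay so
            # many clients do not reconnect at the same instant.
            sleep(random.uniform(0, delays[attempt]))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` applies.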

Dead Ends

Common approaches that don't work:

  1. 80% fail

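    Increasing the HTTP/2 stream timeout on the server (for example, http2_max_streams) does not help when the client cannot finish its pending requests quickly enough.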

  2. 90% fail

    Disabling HTTP/2 entirely and falling back to HTTP/1.1 avoids the problem, but gives up the performance benefits and is not a long-term solution.

  3. 85% fail

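    Restarting the load balancer or proxy may flush the connection pool, but if the application does not handle GOAWAY correctly, the timeout will recur.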