elasticsearch
resource_error
ai_generated
true
ElasticsearchException: [index][0] recovery throttled due to max_bytes_per_sec [40mb]
ID: elasticsearch/max-bytes-per-sec-throttle
Fix Rate: 78%
Confidence: 85%
Evidence: 1
First Seen: 2024-06-15
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| elasticsearch 7.17.0 | active | — | — | — |
| elasticsearch 8.10.0 | active | — | — | — |
| elasticsearch 8.6.2 | active | — | — | — |
Root Cause
Shard recovery is throttled because the node's max_bytes_per_sec setting is too low for the current recovery load.
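To get a feel for what "too low" means, the throttle can be turned into a rough lower bound on copy time. The sketch below is only back-of-the-envelope arithmetic: the 50 GB shard size is a hypothetical figure, the function name is made up for illustration, and it assumes a single recovery has the node's whole budget to itself (concurrent recoveries on the same node would share it).

```python
MB = 1024 ** 2  # Elasticsearch byte-size units are binary
GB = 1024 ** 3


def estimate_recovery_seconds(shard_bytes: int, max_bytes_per_sec: int) -> float:
    """Lower bound on the time to copy one shard at the throttled rate."""
    if max_bytes_per_sec <= 0:
        raise ValueError("rate must be positive")
    return shard_bytes / max_bytes_per_sec


shard = 50 * GB  # hypothetical shard size, not taken from the error report
for rate_mb in (40, 100):
    seconds = estimate_recovery_seconds(shard, rate_mb * MB)
    print(f"{rate_mb}mb/s -> {seconds / 60:.1f} min")
# 40mb/s -> 21.3 min
# 100mb/s -> 8.5 min
```

Real recoveries will usually take longer than this bound, since disk, CPU, and translog replay also contribute.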
Official Documentation
https://www.elastic.co/guide/en/elasticsearch/reference/current/recovery.html
Workarounds
- 85% success: Temporarily raise max_bytes_per_sec through the dynamic cluster settings API: PUT _cluster/settings { "transient": { "indices.recovery.max_bytes_per_sec": "100mb" } }
- 75% success: Reduce recovery parallelism by lowering cluster.routing.allocation.node_concurrent_recoveries (default 2), so fewer shard recoveries share each node's bandwidth budget. The older indices.recovery.concurrent_streams setting was removed in Elasticsearch 5.0 and does not apply to the versions listed above.
- 90% success: Add more dedicated hot nodes to spread the recovery load, then rebalance replicas.
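The first workaround can be scripted so the higher limit is applied for the duration of a recovery and then reset. The following is a minimal sketch using only the Python standard library; the helper names and the base URL passed by the caller are assumptions, and it presumes an endpoint reachable without authentication or TLS, which will not hold on a default 8.x install with security enabled. Sending null for a setting is how the cluster settings API resets it to its default.

```python
import json
import urllib.request

SETTING = "indices.recovery.max_bytes_per_sec"


def recovery_rate_body(rate, scope="transient"):
    """Build the body for PUT _cluster/settings; rate=None resets the setting."""
    if scope not in ("transient", "persistent"):
        raise ValueError(f"unknown scope: {scope!r}")
    return {scope: {SETTING: rate}}


def put_cluster_settings(base_url, body, timeout=10.0):
    """Send the body to the cluster (hypothetical helper; no auth or TLS handling)."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Bodies only; nothing is sent until put_cluster_settings() is called.
print(json.dumps(recovery_rate_body("100mb")))  # raise the limit
print(json.dumps(recovery_rate_body(None)))     # reset to the default afterwards
```

A call such as put_cluster_settings("http://localhost:9200", recovery_rate_body("100mb")) would apply the change, assuming that address is where the cluster listens.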
Dead Ends
Common approaches that don't work:
- 45% fail: Raising max_bytes_per_sec too aggressively can saturate network bandwidth and cause other nodes to fail, especially in clusters with many concurrent recoveries.
- 70% fail: Restarting only resets the recovery process temporarily; throttling reapplies once recovery resumes.
- 65% fail: Setting max_bytes_per_sec to 0 disables throttling but can overwhelm the node's I/O, leading to OOM or disk saturation.
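One way to avoid the first and last dead ends is to check a proposed rate before applying it. The sketch below is illustrative only: the function names are made up, the 250mb ceiling is a hypothetical placeholder that should be replaced with a figure derived from the cluster's actual network and disk capacity, and the parser handles just a simplified subset of Elasticsearch byte-size strings.

```python
import re

_UNITS = {"b": 1, "kb": 1024, "mb": 1024 ** 2, "gb": 1024 ** 3}


def parse_rate(value: str) -> int:
    """Parse a simplified byte-size string such as '100mb' into bytes."""
    match = re.fullmatch(r"(\d+)\s*(b|kb|mb|gb)?", value.strip().lower())
    if not match:
        raise ValueError(f"unrecognized rate: {value!r}")
    unit = match.group(2) or "b"  # a bare number is treated as bytes
    return int(match.group(1)) * _UNITS[unit]


def check_rate(value: str, ceiling: str = "250mb") -> int:
    """Reject rates that disable throttling or exceed a chosen ceiling."""
    rate = parse_rate(value)
    if rate == 0:
        raise ValueError("a rate of 0 would disable throttling")
    if rate > parse_rate(ceiling):
        raise ValueError(f"{value} exceeds the configured ceiling of {ceiling}")
    return rate


print(check_rate("100mb"))  # 104857600
```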