elasticsearch resource_error ai_generated true

ES 拒绝执行异常:拒绝执行协调操作 [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, source=bulk]

EsRejectedExecutionException: rejected execution of coordinating operation [coordinating_and_primary_bytes=0, replica_bytes=0, all_bytes=0, source=bulk]

ID: elasticsearch/too-many-requests-bulk-queue

其他格式: JSON · Markdown 中文 · English
80%修复率
90%置信度
1证据数
2024-11-05首次发现

版本兼容性

版本状态引入弃用备注
elasticsearch 7.17 active
elasticsearch 8.10 active
elasticsearch 8.12 active

根因分析

由于高索引吞吐量,协调节点上的批量队列已满,导致新的批量请求被拒绝。

English

The bulk queue on the coordinating node is full due to high indexing throughput, causing new bulk requests to be rejected.

generic

官方文档

https://www.elastic.co/guide/en/elasticsearch/reference/current/rejected-execution.html

解决方案

  1. 在客户端实现指数退避重试:例如,使用 elasticsearch-py:from elasticsearch import Elasticsearch; from time import sleep; es = Elasticsearch(); for attempt in range(5): try: es.bulk(body=docs); break except Exception as e: sleep(2 ** attempt)
  2. 临时增加批量队列大小:PUT _cluster/settings { "transient": { "thread_pool.bulk.queue_size": 2000 } }
  3. 通过添加更多节点或增加堆大小来扩展协调节点,以处理更高的吞吐量。

无效尝试

常见但无效的做法:

  1. 55% 失败

    Large queue sizes can lead to high memory usage and increased latency, potentially causing OOM or degraded performance.

  2. 75% 失败

    Without retries, bulk requests are lost permanently, leading to data loss and incomplete indexing.

  3. 80% 失败

    Fewer nodes can increase per-node load and worsen the queue pressure, making the error more frequent.