PersistentTaskException: task [cluster:admin/persistent/assignment] failed to assign task [task_id_123] to node [node-1] after [5] attempts
ID: elasticsearch/persistent-task-assignment-failure
Version compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| 7.17.0 | active | — | — | — |
| 8.11.0 | active | — | — | — |
| 8.12.0 | active | — | — | — |
Root cause analysis
A persistent task (e.g., ILM, Rollup, Watcher) cannot be assigned to any available node. Typical causes are node attribute mismatches, resource constraints, or cluster topology changes during a rolling restart.
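To make the diagnosis concrete, the Python sketch below picks out persistent tasks that currently have no executor node. It assumes the response of `GET _cluster/state/metadata` lists persistent tasks under `metadata.persistent_tasks.tasks`, each with an `assignment` object holding `executor_node` and `explanation`. That layout, the helper name `unassigned_persistent_tasks`, and the sample data are assumptions for illustration and should be checked against a real cluster.

```python
from typing import Any, Dict, List


def unassigned_persistent_tasks(cluster_state: Dict[str, Any]) -> List[Dict[str, str]]:
    """Return id and explanation for every persistent task without an executor node."""
    tasks = (
        cluster_state.get("metadata", {})
        .get("persistent_tasks", {})
        .get("tasks", [])
    )
    result = []
    for task in tasks:
        assignment = task.get("assignment") or {}
        # A task with no executor node is waiting for assignment.
        if assignment.get("executor_node") is None:
            result.append(
                {
                    "id": task.get("id", ""),
                    "explanation": assignment.get("explanation", ""),
                }
            )
    return result


# Hypothetical excerpt of a cluster state response; all values are made up.
sample_state = {
    "metadata": {
        "persistent_tasks": {
            "tasks": [
                {
                    "id": "task_id_123",
                    "assignment": {
                        "executor_node": None,
                        "explanation": "no node matches the required attributes",
                    },
                },
                {
                    "id": "task_id_456",
                    "assignment": {"executor_node": "node-1", "explanation": ""},
                },
            ]
        }
    }
}

for entry in unassigned_persistent_tasks(sample_state):
    print(f"{entry['id']}: {entry['explanation']}")
```

The `explanation` field, where present, is usually the quickest pointer to which of the causes above applies.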
Official documentation
https://www.elastic.co/guide/en/elasticsearch/reference/current/tasks.html
Solutions
- Ensure every node has the required attributes set in `elasticsearch.yml` (e.g., `node.attr.rack: r1`), then restart the nodes one at a time, waiting for shard recovery to finish after each restart.
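- Use the `_tasks` API to cancel the stuck task: `POST _tasks/task_id_123/_cancel`. The documented task management API has no `_retry` endpoint, so `POST _tasks/task_id_123/_retry` should not be relied on; the master node re-evaluates unassigned persistent tasks itself once the blocking condition is resolved.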
- Check node resource availability (CPU, memory); scale up existing nodes or add nodes to the cluster to free up capacity.
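The attribute check in the first solution can be done before any node is restarted. The following is a minimal sketch, assuming the `GET _nodes` response exposes each node's custom attributes under `nodes.<id>.attributes` (so `node.attr.rack: r1` shows up as `"rack": "r1"`). The helper name `nodes_missing_attribute` and the sample response are hypothetical.

```python
from typing import Any, Dict, List, Optional


def nodes_missing_attribute(
    nodes_info: Dict[str, Any], name: str, expected: Optional[str] = None
) -> List[str]:
    """Return the names of nodes that lack attribute `name`.

    If `expected` is given, nodes whose value differs are reported too.
    """
    offenders = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        value = node.get("attributes", {}).get(name)
        if value is None or (expected is not None and value != expected):
            # Fall back to the node id when the node has no name field.
            offenders.append(node.get("name", node_id))
    return sorted(offenders)


# Hypothetical excerpt of a `GET _nodes` response; all values are made up.
sample_nodes = {
    "nodes": {
        "abc": {"name": "node-1", "attributes": {"rack": "r1"}},
        "def": {"name": "node-2", "attributes": {}},
        "ghi": {"name": "node-3", "attributes": {"rack": "r2"}},
    }
}

print(nodes_missing_attribute(sample_nodes, "rack"))
print(nodes_missing_attribute(sample_nodes, "rack", expected="r1"))
```

Nodes reported by such a check are the ones whose `elasticsearch.yml` needs correcting before the rolling restart.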
Ineffective attempts
Common but ineffective approaches:
- 85% failure rate: Forcefully restarting nodes causes more assignment failures, because tasks lose their target nodes and cannot be reassigned mid-restart.
- 75% failure rate: Retrying does not fix the underlying node attribute or resource issue; it only delays the eventual failure.
- 90% failure rate: Removing the task loses its progress, and the system (e.g., ILM) may recreate the task, causing the same error again.