elasticsearch data_error ai_generated true

Translog corruption: translog corruption detected at position 12345

TranslogCorruptedException: translog corruption detected at position 12345

ID: elasticsearch/translog-corruption-on-flush

Fix rate: 85%
Confidence: 88%
Evidence count: 1
First seen: 2024-06-20

Version compatibility

Version   Status   Introduced   Deprecated   Notes
7.17.15   active
8.7.0     active
8.13.2    active

Root cause analysis

The translog file was corrupted by a sudden node crash, a disk I/O error, or a file system inconsistency during a flush operation.

generic

Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/troubleshooting.html

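Solutions

  1. Use the Elasticsearch 'elasticsearch-shard' CLI tool to truncate the translog: ./bin/elasticsearch-shard remove-corrupted-data --index my_index --shard-id 0. Stop the node before running the tool. It removes only the corrupted portion so that the shard can be recovered; operations stored in the removed portion are lost.
  2. If the shard is a replica, allocate a new replica from the primary: POST /_cluster/reroute { "commands": [{ "allocate_replica": { "index": "my_index", "shard": 0, "node": "my_node" } }] }, then delete the corrupted shard copy.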
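
The two fixes above can be sketched as small Python helpers that only build the command line and the request body, without contacting a cluster. This is a minimal sketch, not an official client: the function names and the es_home parameter are hypothetical, my_index, 0, and my_node are the placeholder values from the steps above, and the flag is written as --shard-id on the assumption that this is how the elasticsearch-shard tool spells it.

```python
import json
import shlex


def shard_tool_command(index, shard_id, es_home="."):
    """Build the argv for `elasticsearch-shard remove-corrupted-data`.

    es_home is a hypothetical parameter: the Elasticsearch install directory.
    The node is assumed to be stopped before the command is run.
    """
    return [
        f"{es_home}/bin/elasticsearch-shard",
        "remove-corrupted-data",
        "--index", index,
        "--shard-id", str(shard_id),
    ]


def allocate_replica_body(index, shard, node):
    """Build the JSON body for POST /_cluster/reroute with allocate_replica."""
    return {
        "commands": [
            {"allocate_replica": {"index": index, "shard": shard, "node": node}}
        ]
    }


if __name__ == "__main__":
    # Solution 1: print the CLI invocation so it can be reviewed before running.
    print(shlex.join(shard_tool_command("my_index", 0)))

    # Solution 2: print the reroute request for a corrupted replica.
    print("POST /_cluster/reroute")
    print(json.dumps(allocate_replica_body("my_index", 0, "my_node"), indent=2))
```

Printing the command and request rather than executing them keeps the sketch safe to run anywhere; both operations change shard data, so they are better reviewed by hand before being applied to a real cluster.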

Ineffective attempts

Common but ineffective approaches:

  1. 95% failure rate

    Deleting the translog directly causes data loss and may leave the index in an inconsistent state that cannot be recovered.

  2. 90% failure rate

    A corrupted translog cannot be replayed; Elasticsearch will fail to open the shard and the error persists.