ERR redis runtime_error ai_generated true

节点纪元冲突：节点 127.0.0.1:7001 的纪元为 100，但另一个节点 127.0.0.1:7002 声称具有相同纪元

ERR Node epoch conflict: node 127.0.0.1:7001 has epoch 100, but another node 127.0.0.1:7002 claims same epoch

ID: redis/cluster-node-epoch-conflict

其他格式: JSON · Markdown 中文 · English

80%修复率

83%置信度

1证据数

2023-12-11首次发现

版本兼容性

版本	状态	引入	弃用	备注
6.2	active	—	—	—
7.0	active	—	—	—
7.2	active	—	—	—

根因分析

两个 Redis 集群节点具有相同的纪元号，导致集群状态同步冲突，阻止了正确的故障转移或配置更新。

English

Two Redis cluster nodes have the same epoch number, causing a conflict in cluster state synchronization and preventing proper failover or configuration updates.

generic

官方文档

https://redis.io/docs/latest/operate/oss_admin/cluster-spec/

解决方案

Use CLUSTER RESET on one of the conflicting nodes to reset its epoch to 0, then rejoin the cluster. Example: redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD

Force the cluster to rebalance epochs by failing over the master on the node with the conflicting epoch. Example: CLUSTER FAILOVER FORCE on a replica of that master.

Manually update the nodes.conf file on one node to assign a unique epoch (e.g., increment by 1) and restart the node.

无效尝试

常见但无效的做法:

Manually set the epoch on one node using CLUSTER SET-CONFIG-EPOCH with a random value. 60% 失败
This can cause further conflicts if not coordinated; the cluster may reject the change if the epoch is lower than the current one.
Restart both conflicting nodes simultaneously. 80% 失败
The epochs are stored in the nodes.conf file; restarting does not change them, so the conflict persists.
Delete the nodes.conf file on one node and let it rejoin the cluster. 50% 失败
This can cause data loss and disrupt cluster topology; the rejoining node may still get a conflicting epoch.