ERR redis runtime_error ai_generated true

Node epoch conflict: node 127.0.0.1:7001 has epoch 100, but another node 127.0.0.1:7002 claims the same epoch

ERR Node epoch conflict: node 127.0.0.1:7001 has epoch 100, but another node 127.0.0.1:7002 claims same epoch

ID: redis/cluster-node-epoch-conflict

Fix rate: 80%
Confidence: 83%
Evidence count: 1
First seen: 2023-12-11

Version compatibility

Version  Status  Introduced  Deprecated  Notes
6.2      active  -           -           -
7.0      active  -           -           -
7.2      active  -           -           -

Root cause analysis

Two Redis Cluster nodes advertise the same configuration epoch (configEpoch). The cluster uses the config epoch to decide whose view of slot ownership wins, so a tie breaks cluster state synchronization and prevents proper failover or configuration updates.
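
To confirm which nodes actually share an epoch, the output of CLUSTER NODES can be scanned for masters with identical config epochs. The following is a minimal sketch, assuming the documented CLUSTER NODES field layout (config epoch in the seventh field). The function name and the sample data, including the shortened node IDs, are illustrative and not taken from a real cluster.

```python
from collections import defaultdict
from typing import Dict, List


def find_epoch_conflicts(cluster_nodes_output: str) -> Dict[int, List[str]]:
    """Return config epochs claimed by two or more masters, with their addresses."""
    masters_by_epoch = defaultdict(list)
    for line in cluster_nodes_output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        address, flags, config_epoch = fields[1], fields[2].split(","), fields[6]
        if "master" not in flags:
            continue  # a replica reports its master's epoch, so it is not a conflict
        host_port = address.split("@")[0]  # drop the cluster bus port suffix
        masters_by_epoch[int(config_epoch)].append(host_port)
    return {epoch: addrs for epoch, addrs in masters_by_epoch.items() if len(addrs) > 1}


# Illustrative sample only: node IDs are shortened placeholders.
SAMPLE = """\
a1 127.0.0.1:7001@17001 myself,master - 0 0 100 connected 0-5460
b2 127.0.0.1:7002@17002 master - 0 1700000000000 100 connected 5461-10922
c3 127.0.0.1:7003@17003 master - 0 1700000000000 3 connected 10923-16383
d4 127.0.0.1:7004@17004 slave a1 0 1700000000000 100 connected
"""

print(find_epoch_conflicts(SAMPLE))
# {100: ['127.0.0.1:7001', '127.0.0.1:7002']}
```

In practice the input would come from something like redis-cli -h 127.0.0.1 -p 7001 CLUSTER NODES. Replicas are skipped because they report their master's epoch rather than one of their own.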

generic

Official documentation

https://redis.io/docs/latest/operate/oss_admin/cluster-spec/

Solutions

  1. Run CLUSTER RESET HARD on one of the conflicting nodes to reset its epochs to 0 and give it a new node ID, then rejoin it to the cluster. Redis rejects the command on a master that still holds keys. Example: redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD
  2. Trigger a manual failover of the master whose epoch conflicts, so that the promoted replica takes a new, higher config epoch. Example: run CLUSTER FAILOVER FORCE on a replica of that master.
  3. Stop one node, edit its nodes.conf file to assign a unique epoch (e.g., increment by 1), and start the node again. Edits made while the node is running may be overwritten when it next saves its configuration.
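
The reset-and-rejoin path in step 1 can be sketched as a dry-run plan that only builds the redis-cli command lines without running them. This is an illustration under assumptions: the helper name is hypothetical, rejoining is done with CLUSTER MEET against a healthy seed node, and slot reassignment after the reset is left out.

```python
import shlex
from typing import List


def reset_and_rejoin_plan(node: str, seed: str) -> List[List[str]]:
    """Build redis-cli argument lists to hard-reset `node` and reintroduce it via `seed`."""
    node_host, node_port = node.rsplit(":", 1)
    seed_host, seed_port = seed.rsplit(":", 1)
    cli = ["redis-cli", "-h", node_host, "-p", node_port]
    return [
        # Resets the node's epochs to 0 and assigns a new node ID.
        cli + ["CLUSTER", "RESET", "HARD"],
        # Asks the reset node to handshake with a node that is still in the cluster.
        cli + ["CLUSTER", "MEET", seed_host, seed_port],
    ]


for argv in reset_and_rejoin_plan("127.0.0.1:7001", "127.0.0.1:7002"):
    print(shlex.join(argv))
# redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD
# redis-cli -h 127.0.0.1 -p 7001 CLUSTER MEET 127.0.0.1 7002
```

Printing the commands first makes it possible to review them before anything touches the cluster. A hard reset drops the node's slot assignments, so the slots it served have to be reassigned afterwards.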

Ineffective attempts

Common but ineffective approaches:

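  1. Manually set the epoch on one node using CLUSTER SET-CONFIG-EPOCH with a random value. Failure rate: 60%

    Redis accepts this command only on a node whose config epoch is still 0 and that knows no other nodes, so a node that is already part of a cluster rejects it. An uncoordinated value can also collide with another node's epoch and cause further conflicts.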

  2. Restart both conflicting nodes simultaneously. Failure rate: 80%

    Each node persists its epochs in the nodes.conf file and reloads them at startup; restarting does not change them, so the conflict persists.

  3. Delete the nodes.conf file on one node and let it rejoin the cluster. Failure rate: 50%

    The node loses its identity and slot assignments, which can cause data loss and disrupt cluster topology; the rejoining node may still end up with a conflicting epoch.
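
To see why a restart does not help, the persisted epochs can be read straight from nodes.conf. The following is a minimal sketch, assuming the file ends with a line of the form "vars currentEpoch <n> lastVoteEpoch <n>". The function name and sample text are illustrative.

```python
from typing import Dict


def read_persisted_epochs(nodes_conf_text: str) -> Dict[str, int]:
    """Extract the name/value pairs from the `vars` line of a nodes.conf file."""
    for line in nodes_conf_text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "vars":
            # Tokens after "vars" alternate between a name and its integer value.
            return {name: int(value) for name, value in zip(tokens[1::2], tokens[2::2])}
    return {}  # no vars line found


# Illustrative sample only: the node ID is a shortened placeholder.
SAMPLE_CONF = """\
a1 127.0.0.1:7001@17001 myself,master - 0 0 100 connected 0-5460
vars currentEpoch 100 lastVoteEpoch 0
"""

print(read_persisted_epochs(SAMPLE_CONF))
# {'currentEpoch': 100, 'lastVoteEpoch': 0}
```

Comparing the values before and after a restart shows that they are reloaded unchanged.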