ERR redis runtime_error ai_generated true

ERR Node epoch conflict: node 127.0.0.1:7001 has epoch 100, but another node 127.0.0.1:7002 claims same epoch

ID: redis/cluster-node-epoch-conflict

Also available as: JSON · Markdown · 中文

80%Fix Rate

83%Confidence

1Evidence

2023-12-11First Seen

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
6.2	active	—	—	—
7.0	active	—	—	—
7.2	active	—	—	—

Root Cause

Two Redis cluster nodes have the same epoch number, causing a conflict in cluster state synchronization and preventing proper failover or configuration updates.

generic

中文

两个 Redis 集群节点具有相同的纪元号，导致集群状态同步冲突，阻止了正确的故障转移或配置更新。

Official Documentation

https://redis.io/docs/latest/operate/oss_admin/cluster-spec/

Workarounds

85% success Use CLUSTER RESET on one of the conflicting nodes to reset its epoch to 0, then rejoin the cluster. Example: redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD
```
Use CLUSTER RESET on one of the conflicting nodes to reset its epoch to 0, then rejoin the cluster. Example: redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD
```
80% success Force the cluster to rebalance epochs by failing over the master on the node with the conflicting epoch. Example: CLUSTER FAILOVER FORCE on a replica of that master.
```
Force the cluster to rebalance epochs by failing over the master on the node with the conflicting epoch. Example: CLUSTER FAILOVER FORCE on a replica of that master.
```
75% success Manually update the nodes.conf file on one node to assign a unique epoch (e.g., increment by 1) and restart the node.
```
Manually update the nodes.conf file on one node to assign a unique epoch (e.g., increment by 1) and restart the node.
```

中文步骤

Use CLUSTER RESET on one of the conflicting nodes to reset its epoch to 0, then rejoin the cluster. Example: redis-cli -h 127.0.0.1 -p 7001 CLUSTER RESET HARD

Force the cluster to rebalance epochs by failing over the master on the node with the conflicting epoch. Example: CLUSTER FAILOVER FORCE on a replica of that master.

Manually update the nodes.conf file on one node to assign a unique epoch (e.g., increment by 1) and restart the node.

Dead Ends

Common approaches that don't work:

Manually set the epoch on one node using CLUSTER SET-CONFIG-EPOCH with a random value. 60% fail
This can cause further conflicts if not coordinated; the cluster may reject the change if the epoch is lower than the current one.
Restart both conflicting nodes simultaneously. 80% fail
The epochs are stored in the nodes.conf file; restarting does not change them, so the conflict persists.
Delete the nodes.conf file on one node and let it rejoin the cluster. 50% fail
This can cause data loss and disrupt cluster topology; the rejoining node may still get a conflicting epoch.