ERR redis network_error ai_generated partial

CLUSTERDOWN 集群已关闭 - 节点超时,副本未同步

ERR CLUSTERDOWN The cluster is down - node timeout, replica not synced

ID: redis/cluster-node-timeout-replica

其他格式: JSON · Markdown 中文 · English
82%修复率
87%置信度
1证据数
2024-01-18首次发现

版本兼容性

版本状态引入弃用备注
7.2.0 active
7.4.0 active
8.0.0 active

根因分析

集群节点超时且其副本未完全同步,导致集群失去法定人数并将自身标记为关闭。

English

A cluster node timed out and its replica is not fully synced, causing the cluster to lose quorum and mark itself as down.

generic

官方文档

https://redis.io/docs/manual/scaling/

解决方案

  1. 强制副本同步:CLUSTER REPLICATE <主节点ID>。然后使用 CLUSTER INFO 等待复制完成。
  2. 如果主节点永久关闭,将副本提升为主节点:在副本节点上执行 CLUSTER FAILOVER FORCE。
  3. 增加 cluster-node-timeout 到更高值(例如 30000ms)以容忍临时网络问题:CONFIG SET cluster-node-timeout 30000。

无效尝试

常见但无效的做法:

  1. 70% 失败

    Restarting the failed node without fixing the replica sync will cause the same timeout again because the replica is still behind.

  2. 50% 失败

    Increasing cluster-node-timeout alone without addressing network issues or replica sync lag will not prevent future timeouts.