{
  "id": "redis/cluster-node-timeout-replica",
  "signature": "ERR CLUSTERDOWN The cluster is down - node timeout, replica not synced",
  "signature_zh": "CLUSTERDOWN 集群已关闭 - 节点超时，副本未同步",
  "regex": "ERR CLUSTERDOWN The cluster is down - node timeout, replica not synced",
  "domain": "redis",
  "category": "network_error",
  "subcategory": null,
  "root_cause": "A cluster node failed to respond within cluster-node-timeout and its replica is not fully synced, so the replica cannot take over and the cluster marks itself as down.",
  "root_cause_type": "generic",
  "root_cause_zh": "集群节点超时且其副本未完全同步，导致集群失去法定人数并将自身标记为关闭。",
  "versions": [
    {
      "version": "7.2.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "7.4.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    },
    {
      "version": "8.0.0",
      "introduced": null,
      "deprecated": null,
      "removed": null,
      "behavior_change": null,
      "status": "active"
    }
  ],
  "os_specific": {},
  "dead_ends": [
    {
      "action": "Restart the failed node without fixing the replica sync.",
      "why_fails": "The replica is still behind after the restart, so the same timeout recurs.",
      "fail_rate": 0.7,
      "condition": "",
      "sources": []
    },
    {
      "action": "Increase cluster-node-timeout alone, without addressing network issues or replica sync lag.",
      "why_fails": "A longer timeout only delays failure detection; the underlying network issues or replica sync lag remain, so timeouts will recur.",
      "fail_rate": 0.5,
      "condition": "",
      "sources": []
    }
  ],
  "workarounds": [
    {
      "action": "Force the replica to resync with its master.",
      "success_rate": 0.8,
      "how": "On the replica, run CLUSTER REPLICATE <master-node-id>. Wait until INFO replication reports master_link_status:up, then confirm cluster_state:ok with CLUSTER INFO.",
      "condition": "",
      "sources": []
    },
    {
      "action": "If the master is permanently down, promote its replica to master.",
      "success_rate": 0.85,
      "how": "On the replica node, run CLUSTER FAILOVER FORCE, which starts the failover without a handshake with the unreachable master. Check CLUSTER NODES afterwards to confirm the node is now listed as a master.",
      "condition": "",
      "sources": []
    },
    {
      "action": "Increase cluster-node-timeout to a higher value (e.g., 30000 ms) to tolerate transient network issues.",
      "success_rate": 0.7,
      "how": "Run CONFIG SET cluster-node-timeout 30000 on every node in the cluster so the setting is consistent. If the nodes were started with a config file, follow with CONFIG REWRITE to persist the value across restarts.",
      "condition": "",
      "sources": []
    }
  ],
  "workarounds_zh": [
    "强制副本同步：CLUSTER REPLICATE <主节点ID>。然后使用 CLUSTER INFO 等待复制完成。",
    "如果主节点永久关闭，将副本提升为主节点：在副本节点上执行 CLUSTER FAILOVER FORCE。",
    "增加 cluster-node-timeout 到更高值（例如 30000ms）以容忍临时网络问题：CONFIG SET cluster-node-timeout 30000。"
  ],
  "transition_graph": {
    "leads_to": [],
    "preceded_by": [],
    "frequently_confused_with": []
  },
  "official_doc_url": "https://redis.io/docs/manual/scaling/",
  "official_doc_section": null,
  "error_code": "ERR",
  "verification_tier": "ai_generated",
  "confidence": 0.87,
  "fix_success_rate": 0.82,
  "resolvable": "partial",
  "first_seen": "2024-01-18",
  "last_confirmed": "2024-06-01",
  "last_updated": "2024-06-01",
  "evidence_count": 1,
  "tags": [],
  "locale": "en",
  "aliases": []
}