database network_error ai_generated partial

redis.exceptions.MasterDownError: Error: Master is down or unreachable

ID: database/redis-master-link-down


Fix Rate: 85%

Confidence: 86%

Evidence: 1

First Seen: 2024-03-15

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
Redis 6.2.x	active	—	—	—
Redis 7.0.x	active	—	—	—
Redis 7.2.x	active	—	—	—

Root Cause

The Redis client cannot reach the master node in a replication setup. Common causes are a network partition, a crashed master, or a misconfigured replicaof directive on the replica.

generic
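
To make the three causes easier to tell apart, here is a minimal Python sketch that reads the text returned by `INFO replication` on a replica and suggests which cause is most likely. The function names (`parse_info`, `diagnose`), the `expected_master` argument, and the `master_answers_ping` flag are assumptions made for illustration; the field names (`role`, `master_host`, `master_port`, `master_link_status`) are the ones Redis reports for a replica. Treat the returned strings as hints, not a definitive diagnosis.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse `INFO replication` output ("key:value" lines) into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def diagnose(info_text: str, expected_master: str, master_answers_ping: bool) -> str:
    """Suggest a likely cause for a broken master link.

    expected_master is "host:port" as the operator intends it (hypothetical input).
    master_answers_ping is the result of a separate PING sent to the master.
    """
    info = parse_info(info_text)
    if info.get("role") == "master":
        return "node is a master, not a replica"
    configured = f"{info.get('master_host')}:{info.get('master_port')}"
    if configured != expected_master:
        return f"misconfigured replicaof: points at {configured}"
    if info.get("master_link_status") == "up":
        return "link is up"
    if master_answers_ping:
        return "master answers PING: suspect the network path from the replica"
    return "master does not answer: suspect a crash or a wider outage"


# Example input shaped like the output of `redis-cli INFO replication`.
sample = (
    "# Replication\r\n"
    "role:slave\r\n"
    "master_host:10.0.0.5\r\n"
    "master_port:6379\r\n"
    "master_link_status:down\r\n"
)
print(diagnose(sample, "10.0.0.5:6379", master_answers_ping=False))
```

The checks run from cheapest to most disruptive: a wrong `replicaof` target is ruled out first, since neither a restart nor a failover would fix it.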

Official Documentation

https://redis.io/docs/latest/operate/oss_and_stack/management/replication/

Workarounds

85% success: Check the replication link from the replica. If master_link_status is down, ping the master directly, and restart it only if it does not respond.
```
# On the replica: is the link to the master up?
redis-cli -h replica_host INFO replication | grep master_link_status
# If it reports down, check the master itself
redis-cli -h master_host PING
# If the master does not answer, restart it (run on the master host)
systemctl restart redis-server
```
90% success: In a failover scenario, promote a replica to master, then reconfigure the other replicas to point to the new master. REPLICAOF replaces the deprecated SLAVEOF on the Redis versions listed above.
```
# Promote the chosen replica
redis-cli -h replica_host REPLICAOF NO ONE
# On each remaining replica (other_replica_host and port 6379 are placeholders)
redis-cli -h other_replica_host REPLICAOF replica_host 6379
```
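
The failover workaround can be sketched as a small Python helper that builds the `redis-cli` invocations in the order they should run: promote one replica first, then repoint the rest. The function name `failover_plan`, the host names, and the single shared port are assumptions for illustration. The sketch only builds argument lists and does not run anything, so the plan can be reviewed before it touches a live node.

```python
def failover_plan(
    new_master: str, other_replicas: list[str], port: int = 6379
) -> list[list[str]]:
    """Return redis-cli argument lists for a manual failover.

    Assumes every node listens on the same port (a simplification).
    """
    # Step 1: stop replication on the chosen replica so it becomes a master.
    plan = [["redis-cli", "-h", new_master, "-p", str(port), "REPLICAOF", "NO", "ONE"]]
    # Step 2: point each remaining replica at the newly promoted master.
    for replica in other_replicas:
        if replica == new_master:
            continue  # never make the new master replicate from itself
        plan.append(
            ["redis-cli", "-h", replica, "-p", str(port),
             "REPLICAOF", new_master, str(port)]
        )
    return plan


# Hypothetical host names, used only to show the shape of the plan.
for argv in failover_plan("replica1", ["replica2", "replica3"]):
    print(" ".join(argv))
```

Each entry could then be passed to `subprocess.run(argv, check=True)` one at a time, so that a failed promotion stops the sequence before any other replica is repointed.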

Dead Ends

Common approaches that don't work:

Restarting only the replica node (90% fail)
If the master is down, restarting the replica does not restore the connection; the replica still cannot sync.
Increasing the replica timeout (repl-timeout) in redis.conf without checking master health (70% fail)
A longer timeout only delays the error; it does not address why the master is unavailable.