SENTINEL redis network_error ai_generated partial

SENTINEL: arbiter node timeout, failover aborted for master 'mymaster'

ID: redis/sentinel-arbiter-timeout-failover

Fix Rate: 78%
Confidence: 84%
Evidence: 1
First Seen: 2023-08-22

Version Compatibility

Version  Status  Introduced  Deprecated  Notes
6.2      active
7.0      active
7.2      active

Root Cause

A Sentinel node acting as the tie-breaking "arbiter" did not respond within the configured timeout during a failover vote, so the vote did not gather enough Sentinels and the failover for master 'mymaster' was aborted.
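
To check how many Sentinels a node actually knows about, and which quorum and failover timeout are in effect, the reply to SENTINEL MASTER mymaster can be inspected. The sketch below assumes the reply arrives as a flat list of alternating field names and values; the helper names and the sample values are illustrative and were not captured from a real deployment.

```python
def parse_flat_reply(flat):
    """Turn a flat [field, value, field, value, ...] list into a dict."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of items")
    return dict(zip(flat[0::2], flat[1::2]))


def summarize_master(info):
    """Pick out the fields that matter for a failover vote."""
    # +1 counts the Sentinel that answered the command.
    known_sentinels = int(info["num-other-sentinels"]) + 1
    return {
        "known_sentinels": known_sentinels,
        "quorum": int(info["quorum"]),
        "failover_timeout_ms": int(info["failover-timeout"]),
    }


# Made-up sample values (assumption), trimmed to the relevant fields.
sample_reply = [
    "name", "mymaster",
    "num-other-sentinels", "2",
    "quorum", "2",
    "failover-timeout", "180000",
]
summary = summarize_master(parse_flat_reply(sample_reply))
print(summary)
```

If known_sentinels differs between nodes, the Sentinels do not agree on the group membership, which is worth resolving before tuning any timeout.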

generic

Official Documentation

https://redis.io/docs/latest/operate/oss_admin/sentinel/

Workarounds

  1. 80% success Increase the Sentinel failover-timeout so a failover attempt has more time to complete before it is abandoned. Check the current value first with SENTINEL MASTER mymaster, then raise it. Example: SENTINEL SET mymaster failover-timeout 360000 (the stock sentinel.conf value is 180000 ms, not 30000 ms; 30000 ms is the default for down-after-milliseconds).
  2. 85% success Check network connectivity between the Sentinel nodes and the arbiter. Reduce latency by placing them in the same datacenter or by using a faster network path.
  3. 75% success If the arbiter is unreliable, replace it with a more stable Sentinel instance or remove it. Sentinels are discovered automatically rather than listed in the sentinel monitor directive, so removal means stopping the process and running SENTINEL RESET mymaster on each remaining Sentinel so that they forget it.
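
A minimal sketch of the connectivity check in workaround 2, using only the Python standard library. It measures TCP connect time to each Sentinel, which is a rough proxy for network latency and says nothing about how quickly the Sentinel process itself answers. The addresses are hypothetical; 26379 is the conventional Sentinel port.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout_s=1.0):
    """Return TCP connect time in milliseconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return None


def probe_sentinels(addresses, timeout_s=1.0):
    """Map each (host, port) pair to its connect time in ms (None = unreachable)."""
    return {addr: tcp_connect_ms(addr[0], addr[1], timeout_s) for addr in addresses}


if __name__ == "__main__":
    # Hypothetical addresses (assumption); replace with the real Sentinel hosts.
    sentinels = [("127.0.0.1", 26379), ("127.0.0.1", 26380), ("127.0.0.1", 26381)]
    for addr, ms in probe_sentinels(sentinels).items():
        status = "unreachable" if ms is None else f"{ms:.1f} ms"
        print(f"{addr[0]}:{addr[1]} {status}")
```

Running the probe from each Sentinel host in turn shows whether one node is consistently slower or unreachable from its peers.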

Dead Ends

Common approaches that don't work:

  1. Increase the sentinel monitor quorum to a higher value to require more votes. 60% fail

    Raising the quorum only increases the number of Sentinels that must agree. It does nothing about the arbiter timeout and makes a failover harder to trigger.

  2. Set all Sentinels to have equal weight by removing arbiter designation. 70% fail

    Redis Sentinel has no dedicated arbiter role: every Sentinel votes with equal weight, so there is no designation to remove, and the change does nothing about the timeout.

  3. Restart the arbiter Sentinel node to clear its state. 50% fail

    If the timeout is due to network or resource issues, restarting provides only temporary relief.
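
The first dead end can be checked with a little arithmetic. The sketch below is a simplified model, assuming the rule described in the Sentinel documentation: a failover must be authorized by a majority of the known Sentinels, and by at least quorum of them when the quorum is set higher than the majority. It ignores epochs and vote timing, and the function names are illustrative.

```python
def votes_needed(total_sentinels, quorum):
    """Votes a Sentinel needs before it may run the failover (simplified model)."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def can_fail_over(total_sentinels, quorum, responsive):
    """True if the responsive Sentinels are enough to authorize a failover."""
    return responsive >= votes_needed(total_sentinels, quorum)


# Three Sentinels, one of which is timing out, so two are responsive.
print(can_fail_over(3, 2, 2))  # True: quorum 2 is still reachable
print(can_fail_over(3, 3, 2))  # False: raising the quorum to 3 blocks the failover
```

With one slow node out of three, a quorum of 2 still permits the failover, while a quorum of 3 makes every vote depend on the node that is timing out.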