mongodb runtime_error ai_generated true

MongoServerError: ReplicaSetMonitor: election failed for set rs0

ID: mongodb/replica-set-election-failed

Fix Rate: 85%
Confidence: 85%
Evidence: 1
First Seen: 2023-08-15

Version Compatibility

Version        Status    Introduced    Deprecated    Notes
MongoDB 5.0    active    -             -             -
MongoDB 6.0    active    -             -             -
MongoDB 7.0    active    -             -             -

Root Cause

The replica set election failed due to a tie or network partition preventing a majority of nodes from agreeing on a primary.
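
The majority condition described above can be sketched as a small check. This is only an illustration: the combined member shape (name, votes, health) is an assumption that loosely mirrors fields from rs.conf() and rs.status(), and the host names are placeholders.

```javascript
// Sketch: decide whether enough voting members are reachable to elect a primary.
// The member shape below is an assumption made for this example.

// Number of votes needed for a strict majority of the voting members.
function majorityNeeded(votingCount) {
  return Math.floor(votingCount / 2) + 1;
}

// Summarise how many voters exist, how many are reachable, and whether
// the reachable ones form a majority.
function canElectPrimary(members) {
  const voters = members.filter((m) => m.votes > 0);
  const reachable = voters.filter((m) => m.health === 1);
  const needed = majorityNeeded(voters.length);
  return {
    voters: voters.length,
    reachable: reachable.length,
    needed,
    ok: reachable.length >= needed,
  };
}

// Hypothetical three-member set with one member partitioned away.
const partitionedSet = [
  { name: "db0.example:27017", votes: 1, health: 1 },
  { name: "db1.example:27017", votes: 1, health: 1 },
  { name: "db2.example:27017", votes: 1, health: 0 },
];
console.log(canElectPrimary(partitionedSet));
```

If fewer members are reachable than the majority requires, no candidate can collect enough votes, which matches the partition case in the root cause.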

generic

Official Documentation

https://www.mongodb.com/docs/manual/reference/replica-configuration/

Workarounds

  1. 85% success: Force a new election by stepping down the current primary for 60 seconds. rs.stepDown() takes numeric arguments (stepDownSecs, secondaryCatchUpPeriodSecs) rather than an options object, so a forced step-down goes through the underlying command: db.adminCommand({ replSetStepDown: 60, force: true })
  2. 90% success: Temporarily increase electionTimeoutMillis to allow more time for consensus. This is a replica-set-wide setting (default 10000 ms), so a single reconfig on the primary applies it to every member: cfg = rs.conf(); cfg.settings.electionTimeoutMillis = 15000; rs.reconfig(cfg)
  3. 80% success: If a tie persists, set one member's priority to 0 so it can no longer be elected primary, then reconfig: cfg = rs.conf(); cfg.members[1].priority = 0; rs.reconfig(cfg)
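
The two config changes in the workarounds can be sketched as pure helpers that return a modified copy instead of mutating the document from rs.conf(). The function names withElectionTimeout and withPriorityZero, the example config, and its host names are hypothetical; the sketch makes no call to MongoDB.

```javascript
// Sketch only: build a modified copy of a replica set config document.
// In mongosh the input would come from rs.conf() and the result would be
// passed to rs.reconfig(); neither call is made here.

// Return a copy of cfg with settings.electionTimeoutMillis set to millis.
function withElectionTimeout(cfg, millis) {
  if (!Number.isInteger(millis) || millis <= 0) {
    throw new RangeError("millis must be a positive integer");
  }
  return {
    ...cfg,
    settings: { ...(cfg.settings || {}), electionTimeoutMillis: millis },
  };
}

// Return a copy of cfg in which the member at memberIndex has priority 0,
// which keeps that member from being elected primary.
function withPriorityZero(cfg, memberIndex) {
  if (
    !Number.isInteger(memberIndex) ||
    memberIndex < 0 ||
    memberIndex >= cfg.members.length
  ) {
    throw new RangeError("memberIndex is out of range");
  }
  const members = cfg.members.map((m, i) =>
    i === memberIndex ? { ...m, priority: 0 } : m
  );
  return { ...cfg, members };
}

// Hypothetical config; host names are placeholders.
const exampleCfg = {
  _id: "rs0",
  members: [
    { _id: 0, host: "db0.example:27017", priority: 1 },
    { _id: 1, host: "db1.example:27017", priority: 1 },
    { _id: 2, host: "db2.example:27017", priority: 1 },
  ],
  settings: { electionTimeoutMillis: 10000 },
};

const relaxedCfg = withElectionTimeout(exampleCfg, 15000);
const demotedCfg = withPriorityZero(relaxedCfg, 1);
console.log(
  demotedCfg.settings.electionTimeoutMillis,
  demotedCfg.members.map((m) => m.priority)
);
```

Working on a copy leaves the original config available for comparison or rollback. In a mongosh session the result could be handed to rs.reconfig() on the primary, for example rs.reconfig(withPriorityZero(rs.conf(), 1)).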

Dead Ends

Common approaches that don't work:

  1. 70% fail: Restarting nodes doesn't resolve the underlying voting deadlock; it may reset the election timers, but the same tie condition persists.
  2. 60% fail: Calling rs.stepDown() on its own only triggers another election; it doesn't address the root cause of the tie.
  3. 80% fail: Adding more nodes doesn't automatically break ties; electionTimeoutMillis or member priority settings still need adjusting.
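
The arithmetic behind the last dead end can be sketched as follows. It is a simplified model that assumes every member casts one vote; the function names are hypothetical.

```javascript
// Sketch: how many votes a majority needs and how many voting members can be
// lost before a majority becomes impossible, for several set sizes.

// Votes required for a strict majority.
function votesRequired(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Voting members that can be unavailable while a majority is still reachable.
function tolerableFailures(votingMembers) {
  return votingMembers - votesRequired(votingMembers);
}

for (const n of [3, 4, 5, 6, 7]) {
  console.log(
    `${n} voting members: majority ${votesRequired(n)}, tolerates ${tolerableFailures(n)} down`
  );
}
```

Under this model, growing a set from three to four voting members raises the required majority from two to three while the number of tolerable failures stays at one, and an even split of two against two leaves neither side with a majority.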