K8S-LEADER-001 kubernetes system_error ai_generated true

Election: leader election lost

ID: kubernetes/leader-election-lost

Fix rate: 80%

Confidence: 85%

Evidence count: 1

First seen: 2023-06-15

Version compatibility

A controller or operator pod lost its lease lock because of a network partition, a pod restart, or an etcd timeout, leaving a temporary gap in leadership.
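To confirm that the lease really has lapsed, it can help to compare the Lease object's `renewTime` against its `leaseDurationSeconds`. The following is a minimal sketch, assuming the Lease has been exported as JSON (for example with `kubectl get lease <name> -n <namespace> -o json`) and loaded into a dict. The holder name and timestamps in the usage comment are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_micro_time(value: str) -> datetime:
    """Parse a Kubernetes MicroTime string such as 2023-06-15T12:00:00.000000Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def lease_expired(lease: dict, now: datetime) -> bool:
    """Return True when the holder has not renewed within leaseDurationSeconds."""
    spec = lease["spec"]
    renewed = parse_micro_time(spec["renewTime"])
    duration = timedelta(seconds=spec["leaseDurationSeconds"])
    return now > renewed + duration


# Hypothetical usage with a hand-written Lease fragment:
# lease = {"spec": {"holderIdentity": "my-controller-abc12",
#                   "leaseDurationSeconds": 15,
#                   "renewTime": "2023-06-15T12:00:00.000000Z"}}
# lease_expired(lease, datetime.now(timezone.utc))
```

An expired lease with no new holder suggests that no replica is currently able to renew or acquire the lock, which is the leadership gap described above.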

generic

Scale the controller Deployment down to 0, wait 30 seconds, then scale it back up to 1 to force a clean leader election.
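The scale-down, wait, scale-up sequence can be scripted. This is a rough sketch that shells out to `kubectl scale`. The Deployment name and namespace in the usage comment are hypothetical, and the `run` and `sleep` parameters exist only so the sequence can be exercised without a cluster.

```python
import subprocess
import time


def scale_command(deployment: str, namespace: str, replicas: int) -> list:
    """Build the kubectl invocation that sets the Deployment's replica count."""
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}", "--namespace", namespace,
    ]


def force_clean_election(deployment: str, namespace: str, wait_seconds: int = 30,
                         run=subprocess.run, sleep=time.sleep) -> None:
    """Scale to 0, pause so the old lease can lapse, then scale back to 1."""
    run(scale_command(deployment, namespace, 0), check=True)
    sleep(wait_seconds)
    run(scale_command(deployment, namespace, 1), check=True)


# Hypothetical usage (requires kubectl and cluster access):
# force_clean_election("my-controller", "my-operator-system")
```

Passing `check=True` makes the script stop if the first `kubectl` call fails, so it does not sleep and scale up after an unsuccessful scale-down.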

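Check for network policies or firewall rules that may block traffic on port 2380 (the etcd peer port, used between etcd members), or that may prevent the controller pods from reaching the Kubernetes API server to renew their lease.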
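A plain TCP probe is one quick way to see whether a policy or firewall is dropping traffic to a given port. The sketch below only checks that a TCP connection can be opened; it does not speak the etcd or Kubernetes API protocols. The host name in the usage comment is hypothetical.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Hypothetical usage, run from a pod or node on the affected network path:
# can_connect("etcd-1.example.internal", 2380)
```

A refused connection and a timeout both return `False` here; a timeout is more typical of a rule that silently drops packets, so timing the call can give a hint about which case applies.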

Common but ineffective approaches:

Restart all replicas of the controller simultaneously. Failure rate: 65%
Restarting all replicas at once can trigger a prolonged leader election storm and make the problem worse.
Delete the lease object in etcd manually. Failure rate: 80%
Manually deleting the lease may cause data inconsistency and is not recommended; the leader election mechanism should self-heal.