# etcdserver: request timed out, possibly due to a leader election in progress

- **ID:** `kubernetes/etcd-leader-election-failure`
- **Domain:** kubernetes
- **Category:** system_error
- **Verification level:** ai_generated
- **Fix rate:** 70%

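## Root cause

The etcd cluster is experiencing a network partition or high disk I/O latency. As a result, leader election fails or takes too long, and Kubernetes API requests time out.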
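One way to tell whether leadership is actually unstable, and whether slow disk writes are a plausible cause, is to read etcd's Prometheus metrics. The sketch below is a rough check, not a definitive procedure. It assumes the metrics endpoint is served over plain HTTP at `http://127.0.0.1:2381/metrics` (a common kubeadm setting, so adjust `METRICS_URL` for your deployment). The helper names `leader_changes` and `wal_fsync_mean` are made up for this example. The mean fsync latency is only a coarse signal; percentile-based checks need the histogram buckets.

```shell
#!/usr/bin/env bash
# Assumption: etcd exposes metrics over plain HTTP at this URL; override as needed.
METRICS_URL="${METRICS_URL:-http://127.0.0.1:2381/metrics}"

# Read Prometheus text from stdin and print the leader-change counter value.
leader_changes() {
  awk '$1 == "etcd_server_leader_changes_seen_total" { print $2 }'
}

# Read Prometheus text from stdin and print the mean WAL fsync latency in seconds.
wal_fsync_mean() {
  awk '
    $1 == "etcd_disk_wal_fsync_duration_seconds_sum"   { sum = $2 }
    $1 == "etcd_disk_wal_fsync_duration_seconds_count" { n = $2 }
    END { if (n > 0) printf "%.4f\n", sum / n }
  '
}

# Only report when the endpoint answers; otherwise print a hint and move on.
if metrics="$(curl -s --max-time 5 "$METRICS_URL" 2>/dev/null)" && [ -n "$metrics" ]; then
  echo "leader changes seen: $(printf '%s\n' "$metrics" | leader_changes)"
  echo "mean WAL fsync (s):  $(printf '%s\n' "$metrics" | wal_fsync_mean)"
else
  echo "could not fetch metrics from $METRICS_URL" >&2
fi
```

A leader-change counter that keeps growing between runs, together with rising fsync latency, points toward the disk. Frequent leader changes with low fsync latency make the network the more likely suspect.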

## Version compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| etcd 3.5.7 | active | — | — |
| etcd 3.5.9 | active | — | — |
| Kubernetes 1.27 | active | — | — |
| Kubernetes 1.29 | active | — | — |

## Solution

1. Check etcd cluster health with `etcdctl endpoint health --cluster`. Identify unhealthy members, then check their disk I/O with `iostat -x 1` and the network latency between etcd nodes with `ping`.
2. If disk I/O is high, move the etcd data directory to a faster disk (e.g., an SSD) by updating the `hostPath` in the etcd pod spec or by using a dedicated volume, and point etcd at it with `--data-dir=/var/lib/etcd-ssd`.
3. If a network partition is suspected, ensure that all etcd members can reach each other on port 2380 (peer communication). Check firewall rules and network policies.
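The health and connectivity checks from the steps above can be scripted roughly as follows. This is a minimal sketch under several assumptions: `etcdctl` is already configured through its usual environment variables or flags (TLS clusters typically need `--cacert`, `--cert`, and `--key`), the health output uses the plain-text form `<endpoint> is unhealthy: ...`, and `ETCD_PEERS` is a placeholder variable that you fill with your own member addresses. The helper names `unhealthy_endpoints` and `peer_port_open` are illustrative only.

```shell
#!/usr/bin/env bash
# Placeholder (assumption): space-separated peer hosts, e.g. ETCD_PEERS="10.0.0.1 10.0.0.2".
ETCD_PEERS="${ETCD_PEERS:-}"

# Read `etcdctl endpoint health` output from stdin and print unhealthy endpoints.
# Assumes the plain-text format "<endpoint> is unhealthy: <reason>".
unhealthy_endpoints() {
  awk '/ is unhealthy/ { print $1 }'
}

# Return 0 if a TCP connection to host:port (default 2380) opens within 3 seconds.
peer_port_open() {
  host="$1"
  port="${2:-2380}"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Step 1: list unhealthy members (skipped when etcdctl is not installed).
if command -v etcdctl >/dev/null 2>&1; then
  echo "Unhealthy members:"
  etcdctl endpoint health --cluster 2>&1 | unhealthy_endpoints
fi

# Step 3: check that each peer accepts connections on the peer port.
for peer in $ETCD_PEERS; do
  if peer_port_open "$peer" 2380; then
    echo "$peer:2380 reachable"
  else
    echo "$peer:2380 NOT reachable"
  fi
done
```

Run the port check from every etcd node, not just one, since a partition can be one-directional or affect only some pairs of members. For the disk side of step 1, `iostat -x 1` on each unhealthy member remains the quickest manual check.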

## Ineffective attempts

- Simply restarting one etcd member may worsen the situation by triggering another leader election. (70% failure rate)
- Increasing the etcd request timeout without fixing the underlying disk or network issues only masks the problem temporarily. (60% failure rate)
