kubernetes
system_error
ai_generated
true
Error: node 'worker-node-1' not found; kubelet is not posting node status
ID: kubernetes/kubelet-node-status-notfound
Fix rate: 82%
Confidence: 88%
Evidence count: 1
First seen: 2023-04-05
Version compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| Kubernetes 1.24 | active | — | — | — |
| Kubernetes 1.25 | active | — | — | — |
| Kubernetes 1.26 | active | — | — | — |
| kubeadm 1.25.0 | active | — | — | — |
Root cause analysis
The kubelet on the node has stopped reporting its status to the API server, often because the kubelet crashed, the network connection was lost, or a certificate expired. As a result, the node is marked NotReady or removed from the cluster.
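To tell these situations apart from the cluster side, one option is to read the node's `Ready` condition. The following is a minimal sketch, not a definitive check: the function names `describe_ready_status` and `check_node` are hypothetical, and it assumes `kubectl` is already configured for the affected cluster.

```shell
#!/bin/sh
# Hypothetical helpers: map the Ready condition stored in the API server
# to the situations described in the root cause analysis.

describe_ready_status() {
  case "$1" in
    True)    echo "kubelet is posting status" ;;
    False)   echo "kubelet reports the node as not ready" ;;
    Unknown) echo "kubelet has stopped posting status" ;;
    "")      echo "node object not found or has no Ready condition" ;;
    *)       echo "unexpected status: $1" ;;
  esac
}

check_node() {
  node="${1:-worker-node-1}"
  # Prints True, False, or Unknown; kubectl fails if the Node object is missing,
  # in which case the status is treated as empty.
  status=$(kubectl get node "$node" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' 2>/dev/null) || status=""
  describe_ready_status "$status"
}
```

Calling `check_node worker-node-1` prints one line describing the node. An `Unknown` status usually means the control plane has not heard from the kubelet recently, while an empty result suggests the Node object itself is gone.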
Official documentation
https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/
Solutions
-
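SSH into the node and check the kubelet service: `systemctl status kubelet`. If it has stopped, restart it: `systemctl restart kubelet`. Then inspect the logs: `journalctl -u kubelet -n 50`. A common cause is an expired client certificate; check the `Not After` field in the output of `openssl x509 -in /var/lib/kubelet/pki/kubelet-client-current.pem -text -noout`. Note that `kubeadm certs renew` covers control-plane certificates and has no `kubelet` target; the kubelet normally rotates its own client certificate. If that certificate has already expired, the kubelet needs new credentials (for example by re-joining the node with `kubeadm join`), followed by a kubelet restart.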
-
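If the node is unreachable, delete the Node object from the cluster: `kubectl delete node worker-node-1`. Then re-join the node with `kubeadm join` and a valid bootstrap token, which forces a fresh registration. On a node that was joined before, `kubeadm reset` is typically needed first, and `kubeadm token create --print-join-command` on a control-plane node prints a join command with a new token.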
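The on-node checks from the first solution can be sketched as a small script. This is an illustration under stated assumptions, not an official procedure: the function names `next_step` and `triage` are hypothetical, the certificate path is the one quoted above, and the script is assumed to run as root on the affected node.

```shell
#!/bin/sh
# Sketch of the on-node triage order; function names are hypothetical.
CERT="${CERT:-/var/lib/kubelet/pki/kubelet-client-current.pem}"

# Pure decision helper.
#   $1 = output of `systemctl is-active kubelet` (active, inactive, failed, ...)
#   $2 = "valid" or "expired"
next_step() {
  if [ "$2" = "expired" ]; then
    # A restart alone cannot help while the credentials are rejected.
    echo "client certificate expired: renew it before restarting the kubelet"
  elif [ "$1" != "active" ]; then
    echo "kubelet not running: systemctl restart kubelet, then read the logs"
  else
    echo "kubelet running with a valid certificate: check network access to the API server"
  fi
}

triage() {
  state=$(systemctl is-active kubelet 2>/dev/null)
  # -checkend 0 exits non-zero if the certificate has already expired.
  # A missing or unreadable file also lands in the "expired" branch here.
  if openssl x509 -checkend 0 -noout -in "$CERT" >/dev/null 2>&1; then
    cert=valid
  else
    cert=expired
  fi
  next_step "$state" "$cert"
  # Show the most recent kubelet log lines for context.
  journalctl -u kubelet -n 50 --no-pager
}
```

Running `triage` on the node prints a one-line suggestion followed by the last 50 kubelet log lines. The certificate is checked before the service state because a kubelet with expired credentials will keep failing after a restart.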
Ineffective attempts
Common but ineffective approaches:
-
90% failure rate
Restarting the API server does not fix the node issue; the kubelet must be fixed on the node itself.
-
70% failure rate
Deleting and re-creating the node object in Kubernetes without fixing the kubelet will result in the same error because the new node will also fail to report status.