kafka
system_error
ai_generated
partial
LogDirOfflineException: One or more log directories are offline.
ID: kafka/log-dir-offline
70% Fix Rate
83% Confidence
1 Evidence
2024-04-02 First Seen
Root Cause
A disk failure or filesystem problem has made one or more Kafka data directories (the paths listed in `log.dirs`) inaccessible, so the broker can no longer serve the partitions stored in those directories.
Official Documentation
https://kafka.apache.org/documentation/#log_dirs
Workarounds
-
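70% success
Identify the offline directory from the broker logs: `grep 'offline' /var/log/kafka/server.log`. Stop the broker before touching the disk, since a filesystem that is still in use cannot be unmounted cleanly. Then unmount the filesystem and check it with `fsck`, or replace the disk. After the repair, remount the filesystem and restart the broker. Example, where the mount point `/data/kafka` and the device `/dev/sdb1` are placeholders for your own layout: `sudo umount /data/kafka && sudo fsck -y /dev/sdb1 && sudo mount /data/kafka && kafka-server-start.sh config/server.properties`. Note that `fsck -y` accepts every proposed repair automatically, so run it only if you are prepared to lose data on a badly damaged filesystem.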
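Before restarting the broker after a repair, it can help to confirm that every configured data directory is reachable again. The following is a minimal sketch, not part of Kafka: the function name `check_log_dirs` and the `OK`/`SUSPECT` labels are made up for illustration, and it assumes a simple `log.dirs=/a,/b` line with no spaces around `=`. It only tests that each path is a readable, writable directory for the current user, so it cannot detect every kind of disk fault.

```shell
#!/bin/sh
# Rough pre-check (hypothetical helper): report each directory listed in
# log.dirs as OK or SUSPECT. Assumes "log.dirs=/a,/b" with no spaces around "=".
check_log_dirs() {
  props_file="$1"
  # Take the last log.dirs definition in the file and strip the key.
  dirs=$(grep '^log\.dirs=' "$props_file" | tail -n 1 | cut -d= -f2-)
  # Split the comma-separated list into one path per line.
  printf '%s\n' "$dirs" | tr ',' '\n' | while IFS= read -r dir; do
    # Skip empty entries (for example when log.dirs is not set).
    [ -n "$dir" ] || continue
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -w "$dir" ]; then
      echo "OK $dir"
    else
      echo "SUSPECT $dir"
    fi
  done
}
```

A typical call would be `check_log_dirs config/server.properties`, run as the same user that runs the broker so that the permission checks are meaningful. Any directory reported as `SUSPECT` is worth inspecting before the restart.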
Dead Ends
Common approaches that don't work:
-
90% fail
Restarting the broker without addressing the disk failure reproduces the same error, because the broker detects the offline directory again on startup.
-
95% fail
Changing log retention or cleanup policies does not fix a hardware failure; the underlying disk must be repaired or replaced.