kafka runtime_error ai_generated partial

org.apache.kafka.common.errors.UnknownServerException: The server experienced an unexpected error.

ID: kafka/unknown-server-exception

Fix Rate: 75%
Confidence: 85%
Evidence: 1
First Seen: 2024-03-15

Root Cause

The broker hit an internal error that it could not map to a more specific error code, often because of a corrupted message or a disk I/O failure.

generic


Official Documentation

https://kafka.apache.org/documentation/#error_unknown_server_exception

Workarounds

  1. 70% success: Check the broker logs for the full stack trace: `grep -i 'UnknownServerException' /var/log/kafka/server.log`. If the trace points to disk trouble, stop the broker, unmount the filesystem that holds the data directory, and run `fsck` on it; replace the disk if bad sectors are detected.
  2. 80% success: If corruption is isolated to a topic partition, use `kafka-reassign-partitions.sh` to move its replicas to a healthy broker, then delete the corrupt data directory once the reassignment has completed.
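
The reassignment in workaround 2 can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive procedure: the topic name `orders`, partition `3`, broker IDs `2,4,5`, the bootstrap address `localhost:9092`, and the helper name `write_reassignment_plan` are all hypothetical placeholders. It also assumes a Kafka release whose `kafka-reassign-partitions.sh` accepts `--bootstrap-server` (older releases used `--zookeeper`). The script writes the JSON plan that the tool reads via `--reassignment-json-file`, then prints the execute and verify commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a reassignment plan that places one partition's
# replicas on the given broker IDs (comma-separated, healthy brokers only).
write_reassignment_plan() {
  local topic="$1" partition="$2" replicas="$3" out_file="$4"
  printf '{"version":1,"partitions":[{"topic":"%s","partition":%s,"replicas":[%s]}]}\n' \
    "$topic" "$partition" "$replicas" > "$out_file"
}

# Placeholder values; substitute the real topic, partition, and broker IDs.
plan_file="${PLAN_FILE:-reassign.json}"
write_reassignment_plan "orders" 3 "2,4,5" "$plan_file"
cat "$plan_file"

# Printed, not executed, so the plan can be reviewed before it is applied.
echo "kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file $plan_file --execute"
echo "kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file $plan_file --verify"
```

Running the `--verify` form after `--execute` reports whether the move has finished, which is a reasonable point to wait for before removing any data on the affected broker.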


Dead Ends

Common approaches that don't work:

  1. 80% fail: Restarting the broker without checking logs or disk health only delays the error; the underlying corruption persists and the error will recur.
  2. 90% fail: Increasing the replication factor doesn't fix the corrupt data on the affected broker; it just adds replicas that may also encounter the issue.
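
To illustrate the first dead end, a restart can be gated on a quick look at the evidence. The sketch below is only one possible check: it assumes the broker log is at `/var/log/kafka/server.log` and that kernel messages have been saved to a text file beforehand (for example with `dmesg > /tmp/kernel.log`). The function name `safe_to_restart` and the `I/O error` search string are illustrative choices, not part of Kafka.

```shell
#!/usr/bin/env bash
# Hypothetical pre-restart gate: succeeds only when both logs are readable
# and neither contains the error strings searched for below.
safe_to_restart() {
  local broker_log="$1" kernel_log="$2"
  if [ ! -r "$broker_log" ] || [ ! -r "$kernel_log" ]; then
    echo "cannot read logs; treating restart as unsafe" >&2
    return 2
  fi
  local broker_hits kernel_hits
  # grep -c exits non-zero when the count is 0, hence the `|| true`.
  broker_hits="$(grep -i -c 'UnknownServerException' "$broker_log" || true)"
  kernel_hits="$(grep -i -c 'I/O error' "$kernel_log" || true)"
  echo "broker_hits=$broker_hits kernel_hits=$kernel_hits"
  [ "$broker_hits" -eq 0 ] && [ "$kernel_hits" -eq 0 ]
}

# Example call; both paths are assumptions about the host layout.
if safe_to_restart /var/log/kafka/server.log /tmp/kernel.log; then
  echo "no recorded errors found in either log"
else
  echo "investigate logs and disk health before restarting"
fi
```

A clean result here does not prove the disk is healthy; it only confirms that the two logs contain none of the searched strings.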