kafka runtime_error ai_generated true

org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /brokers/ids

ID: kafka/zk-timeout-on-startup

Also available as: JSON · Markdown · 中文
78%Fix Rate
85%Confidence
1Evidence
2023-08-15First Seen

Version Compatibility

VersionStatusIntroducedDeprecatedNotes
Kafka 3.4.0 active
ZooKeeper 3.8.1 active
Kafka 3.5.1 active

Root Cause

Kafka broker fails to register with ZooKeeper due to network timeout or ZooKeeper cluster unavailability during startup.

generic

中文

Kafka代理在启动时因网络超时或ZooKeeper集群不可用而无法向ZooKeeper注册。

Official Documentation

https://kafka.apache.org/documentation/#zk_connection_loss

Workarounds

  1. 85% success Check ZooKeeper connectivity: `echo stat | nc zookeeper1 2181` and ensure ZooKeeper ensemble is healthy (quorum formed). Then verify Kafka broker can resolve ZooKeeper hostnames via DNS.
    Check ZooKeeper connectivity: `echo stat | nc zookeeper1 2181` and ensure ZooKeeper ensemble is healthy (quorum formed). Then verify Kafka broker can resolve ZooKeeper hostnames via DNS.
  2. 75% success Increase `zookeeper.connection.timeout.ms` in broker config to 30000 (30 seconds) and reduce `zookeeper.session.timeout.ms` to 18000 to allow more time for initial handshake.
    Increase `zookeeper.connection.timeout.ms` in broker config to 30000 (30 seconds) and reduce `zookeeper.session.timeout.ms` to 18000 to allow more time for initial handshake.
  3. 90% success Example fix: set `zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka` and ensure `zookeeper.connection.timeout.ms=30000` in server.properties.
    Example fix: set `zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka` and ensure `zookeeper.connection.timeout.ms=30000` in server.properties.

中文步骤

  1. Check ZooKeeper connectivity: `echo stat | nc zookeeper1 2181` and ensure ZooKeeper ensemble is healthy (quorum formed). Then verify Kafka broker can resolve ZooKeeper hostnames via DNS.
  2. Increase `zookeeper.connection.timeout.ms` in broker config to 30000 (30 seconds) and reduce `zookeeper.session.timeout.ms` to 18000 to allow more time for initial handshake.
  3. Example fix: set `zookeeper.connect=zk1:2181,zk2:2181,zk3:2181/kafka` and ensure `zookeeper.connection.timeout.ms=30000` in server.properties.

Dead Ends

Common approaches that don't work:

  1. 65% fail

    It masks the symptom but does not fix the root cause of intermittent connectivity.

  2. 80% fail

    The broker will still fail to connect if ZooKeeper is down or misconfigured.