kafka runtime_error ai_generated true

org.apache.kafka.common.errors.CoordinatorLoadInProgressException: The coordinator is loading and cannot process requests

ID: kafka/coordinator-load-in-progress

Fix Rate: 85%
Confidence: 88%
Evidence: 1
First Seen: 2023-08-20

Version Compatibility

Version      Status  Introduced  Deprecated  Notes
Kafka 3.3.0  active
Kafka 3.4.0  active
Kafka 3.5.0  active

Root Cause

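The group or transaction coordinator is still loading its state from an internal topic (e.g., __consumer_offsets) after a leader election or broker restart. Until loading finishes, the coordinator rejects requests with this error. The error is retriable, so the request can succeed once the load completes.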
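
To see which __consumer_offsets partition (and therefore which partition leader) a group depends on, the mapping can be sketched as below. This is a minimal sketch, assuming the classic group coordinator, which derives the partition from the group ID's hash modulo offsets.topic.num.partitions (50 by default). The class and method names are hypothetical and are not part of the Kafka API.

```java
// Hypothetical helper: estimates which __consumer_offsets partition holds a group's state.
// Assumption: the classic group coordinator's mapping, abs(groupId.hashCode()) % partitionCount.
class CoordinatorPartition {

    /**
     * @param groupId                the consumer group ID
     * @param offsetsTopicPartitions value of offsets.topic.num.partitions (50 by default)
     * @return the partition index in the range [0, offsetsTopicPartitions)
     */
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        int hash = groupId.hashCode();
        // Math.abs(Integer.MIN_VALUE) is still negative, so map that case to 0.
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % offsetsTopicPartitions;
    }
}
```

Calling CoordinatorPartition.partitionFor("my-group", 50) returns the partition index to look for in the broker's loading messages; the broker that leads that partition acts as the group's coordinator.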

generic


Official Documentation

https://kafka.apache.org/documentation/#coordinator

Workarounds

  1. 85% success
    Wait for the coordinator to finish loading; check the broker logs for 'Finished loading offsets from __consumer_offsets'.
    Command to monitor the group:
    kafka-consumer-groups --bootstrap-server localhost:9092 --group my-group --describe --members --verbose
    If loading takes too long, increase offsets.load.buffer.size in the broker config and restart the broker so the change takes effect:
    echo "offsets.load.buffer.size=10485760" >> config/server.properties
    kafka-server-start.sh config/server.properties
  2. 80% success
    Use exponential backoff in the consumer retry logic to avoid overwhelming the coordinator while it loads.
    Code example:
    import java.time.Duration;
    import org.apache.kafka.common.errors.CoordinatorLoadInProgressException;

    int retries = 0;
    while (retries < 5) {
      try {
        consumer.poll(Duration.ofMillis(1000));
        break; // poll succeeded, stop retrying
      } catch (CoordinatorLoadInProgressException e) {
        // Back off 1 s, 2 s, 4 s, 8 s, 16 s before the next attempt.
        // Thread.sleep throws InterruptedException; the enclosing method must declare or handle it.
        Thread.sleep(1000L << retries);
        retries++;
      }
    }
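
The retry loop in workaround 2 can also be factored into a reusable helper. The following is a minimal sketch that has no Kafka dependency, so it runs on its own; BackoffRetry, delayMillis, and callWithBackoff are hypothetical names introduced for illustration. With a Kafka client, the predicate would be something like e -> e instanceof CoordinatorLoadInProgressException. Because the error is retriable, Kafka clients may already retry it internally in many code paths, so an explicit loop like this matters mainly where the exception actually reaches application code.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper: retries an action with capped exponential backoff.
class BackoffRetry {

    /** Delay before retry number `attempt` (0-based): baseMillis * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        // Limit the shift so ordinary base delays cannot overflow a long.
        int shift = Math.min(attempt, 20);
        return Math.min(maxMillis, baseMillis << shift);
    }

    /**
     * Runs `action`, retrying when `isRetriable` accepts the thrown exception.
     * Rethrows the last exception once `maxAttempts` calls have failed.
     */
    static <T> T callWithBackoff(Callable<T> action, Predicate<Exception> isRetriable,
                                 int maxAttempts, long baseMillis, long maxMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                // Give up on non-retriable errors or when the attempt budget is spent.
                if (!isRetriable.test(e) || attempt + 1 >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(delayMillis(attempt, baseMillis, maxMillis));
            }
        }
    }
}
```

Capping the delay keeps a long coordinator load from pushing individual waits to unreasonable lengths, while the attempt limit ensures a persistent failure still surfaces to the caller.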


Dead Ends

Common approaches that don't work:

  1. Restart the consumer application immediately 90% fail

    Restarting the consumer does not speed up coordinator loading; the restarted consumer only sends the same request to a coordinator that is still loading.

  2. Delete __consumer_offsets topic to reset state 99% fail

    Deleting __consumer_offsets discards the committed offsets of every consumer group and can leave the cluster in an inconsistent state.

  3. Reduce offsets.topic.num.partitions to 1 80% fail

    offsets.topic.num.partitions only takes effect when the topic is first created and should not be changed after deployment; changing it afterwards does not make loading faster.