kafka runtime_error ai_generated true

org.apache.kafka.common.errors.CoordinatorLoadInProgressException: The coordinator is loading and cannot process requests

ID: kafka/coordinator-load-in-progress

Fix rate: 85%
Confidence: 88%
Evidence count: 1
First seen: 2023-08-20

Version compatibility

Version       Status   Introduced   Deprecated   Notes
Kafka 3.3.0   active
Kafka 3.4.0   active
Kafka 3.5.0   active

Root cause analysis

After a leader election or broker restart, the group or transaction coordinator is still loading its state from internal topics (e.g., __consumer_offsets) and rejects requests until loading completes.
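
To find which __consumer_offsets partition (and therefore which broker's log) is relevant for a given group, it can help to compute the partition the group maps to. The sketch below assumes the broker maps a group to abs(groupId.hashCode()) modulo offsets.topic.num.partitions, and that the topic has the default 50 partitions. The class and method names are hypothetical, and the result should be cross-checked against the cluster's actual partition count.

```java
public class OffsetsPartition {
    // Assumed broker-side mapping: non-negative hash of the group id
    // modulo the partition count of __consumer_offsets.
    public static int partitionFor(String groupId, int numPartitions) {
        int h = groupId.hashCode();
        int nonNegative = (h == Integer.MIN_VALUE) ? 0 : Math.abs(h);
        return nonNegative % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 50; // default offsets.topic.num.partitions; adjust to the cluster
        String groupId = args.length > 0 ? args[0] : "my-group";
        System.out.println("__consumer_offsets-" + partitionFor(groupId, numPartitions));
    }
}
```

The broker that leads the printed partition acts as the coordinator for that group, so its log is the one to watch while loading is in progress.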


Official documentation

https://kafka.apache.org/documentation/#coordinator

Solutions

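  1. Wait for the coordinator to finish loading. The error is transient; in the broker logs, look for a line beginning 'Finished loading offsets' for the relevant __consumer_offsets partition.
    Command to check whether the group can be described again:
    kafka-consumer-groups --bootstrap-server localhost:9092 --group my-group --describe --members --verbose
    If loading takes too long, increase offsets.load.buffer.size (default 5242880 bytes) in the broker config, then restart the broker so the setting takes effect. The restart itself triggers another coordinator load.
    echo "offsets.load.buffer.size=10485760" >> config/server.properties
    kafka-server-stop.sh
    kafka-server-start.sh config/server.properties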
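  2. Use exponential backoff in consumer retry logic to avoid overwhelming the coordinator. The Java client already retries this error internally in most code paths, so handle it explicitly only where it surfaces to application code.
    Code example:
    // Assumes an existing KafkaConsumer<String, String> named consumer, imports for
    // java.time.Duration, ConsumerRecords and CoordinatorLoadInProgressException,
    // and an enclosing method that declares InterruptedException.
    ConsumerRecords<String, String> records = ConsumerRecords.empty();
    int maxAttempts = 5;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        records = consumer.poll(Duration.ofMillis(1000));
        break;
      } catch (CoordinatorLoadInProgressException e) {
        if (attempt == maxAttempts - 1) throw e;   // give up after the last attempt
        Thread.sleep((1L << attempt) * 1000);      // wait 1 s, 2 s, 4 s, 8 s
      }
    }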
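
The backoff idea in solution 2 can also be factored into a small reusable helper, which caps the delay and adds random jitter so that many clients do not retry in lockstep. This is a minimal sketch with no Kafka dependency. BackoffRetry, delayMillis and call are hypothetical names, and the caller supplies the predicate that decides which exceptions count as retriable.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Predicate;

public class BackoffRetry {
    // Upper bound of the delay for a 0-based attempt: base * 2^attempt, capped at maxMillis.
    public static long delayMillis(int attempt, long baseMillis, long maxMillis) {
        long exp = baseMillis << Math.min(attempt, 20); // bound the shift to limit overflow risk
        return Math.min(exp, maxMillis);
    }

    // Runs the action, retrying up to maxAttempts times while the predicate accepts the exception.
    public static <T> T call(Callable<T> action, Predicate<Exception> retriable,
                             int maxAttempts, long baseMillis, long maxMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!retriable.test(e) || attempt + 1 >= maxAttempts) {
                    throw e; // not retriable, or attempts exhausted
                }
                long cap = delayMillis(attempt, baseMillis, maxMillis);
                // Full jitter: sleep a random time between 0 and the capped delay.
                Thread.sleep(ThreadLocalRandom.current().nextLong(cap + 1));
            }
        }
    }
}
```

With the Kafka client on the classpath, a call might look like BackoffRetry.call(() -> consumer.poll(Duration.ofMillis(1000)), e -> e instanceof CoordinatorLoadInProgressException, 5, 1000, 30000), where consumer is an existing KafkaConsumer.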

Ineffective attempts

Common approaches that do not work:

  1. Restart the consumer application immediately (90% failure rate)

    Restarting the consumer does not speed up coordinator loading; the restarted consumer just repeats the same request.

  2. Delete the __consumer_offsets topic to reset state (99% failure rate)

    Deleting the internal topic corrupts the cluster and destroys the committed offsets and group metadata of every consumer group.

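  3. Reduce offsets.topic.num.partitions to 1 (80% failure rate)

    This setting only applies when __consumer_offsets is first created and should not be changed after deployment. Kafka cannot reduce the partition count of an existing topic, and fewer partitions would not make loading faster.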