kafka runtime_error ai_generated true

org.apache.kafka.common.errors.RebalanceInProgressException: The group is rebalancing, so a rejoin is needed.

ID: kafka/group-rebalance-timeout

Fix Rate: 75%

Confidence: 82%

Evidence: 1

First Seen: 2023-03-10

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
Kafka 2.8.0	active	—	—	—
Kafka 3.0.0	active	—	—	—
Kafka 3.4.0	active	—	—	—
Kafka 3.6.0	active	—	—	—

Root Cause

The group coordinator rejected a consumer request (such as an offset commit, heartbeat, or sync group) because it arrived while a rebalance of the consumer group was already in progress. The consumer must rejoin the group before the request can succeed.

Official Documentation

https://kafka.apache.org/documentation/#consumerconfigs_max.poll.interval.ms

Workarounds

80% success: Increase max.poll.interval.ms so the consumer has more time to process each batch between polls, and lower max.poll.records so each batch is smaller. Both reduce the chance that a slow poll loop triggers a rebalance. Also keep per-record processing fast, or hand records off to asynchronous workers.
```
import java.util.Properties;

Properties props = new Properties();
props.put("max.poll.interval.ms", 600000);  // 10 minutes (default is 5 minutes)
props.put("max.poll.records", 100);  // fewer records per poll (default is 500)
```
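To choose values for these two settings, it can help to estimate whether a full batch fits inside the poll interval. The following is a minimal sketch using only the JDK; the class `PollBudget`, its method names, and the `safetyFactor` parameter are hypothetical, and it assumes records are processed sequentially at a roughly constant per-record cost.

```java
import java.time.Duration;
import java.util.Properties;

class PollBudget {
    /** Worst-case time to process one full batch, assuming sequential handling. */
    static Duration worstCaseBatchTime(int maxPollRecords, Duration perRecord) {
        return perRecord.multipliedBy(maxPollRecords);
    }

    /**
     * Returns true when a full batch, scaled by safetyFactor, fits inside
     * max.poll.interval.ms. Missing keys fall back to the Kafka defaults
     * (500 records, 300000 ms). Values must be stored as strings.
     */
    static boolean fitsPollInterval(Properties props, Duration perRecord, double safetyFactor) {
        int maxPollRecords = Integer.parseInt(props.getProperty("max.poll.records", "500"));
        long intervalMs = Long.parseLong(props.getProperty("max.poll.interval.ms", "300000"));
        long neededMs = (long) (worstCaseBatchTime(maxPollRecords, perRecord).toMillis() * safetyFactor);
        return neededMs <= intervalMs;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("max.poll.interval.ms", "600000");
        props.setProperty("max.poll.records", "500");
        // 500 records * 2 s = 1000 s, which exceeds the 600 s interval.
        System.out.println(fitsPollInterval(props, Duration.ofSeconds(2), 1.0)); // false
        // 500 records * 1 s = 500 s, which fits.
        System.out.println(fitsPollInterval(props, Duration.ofSeconds(1), 1.0)); // true
    }
}
```

When the check fails, either raise max.poll.interval.ms or lower max.poll.records until the worst case fits with some headroom.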
75% success: Handle RebalanceInProgressException in the consumer loop. Catch the exception, call poll() so the consumer can complete the rebalance, process whatever poll() returns, and then commit again. No sleep is needed, because poll() itself drives the rejoin. Note that the assignment and fetch positions may have changed after the rebalance, and the second commit can fail again if another rebalance starts.
```
import java.time.Duration;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.common.errors.RebalanceInProgressException;

try {
    consumer.commitSync();
} catch (RebalanceInProgressException e) {
    // The commit was rejected mid-rebalance; poll() completes the rejoin.
    // Key/value types are assumed to be String here.
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
    // Process these records first: commitSync() commits the positions
    // that this poll() advanced. process() is a hypothetical handler.
    process(records);
    consumer.commitSync();
}
```
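Because a retried commit can itself hit another rebalance, it may be worth bounding the number of attempts. The sketch below shows one way to do that with only the JDK; `BoundedCommitRetry` and `commitWithRetry` are hypothetical names, and the commit and rejoin steps are passed in as callbacks so the example does not depend on the kafka-clients library. In a real consumer, `commit` would wrap `consumer.commitSync()`, `recover` would call `poll()` and process the returned records, and `retriable` would be `RebalanceInProgressException.class`.

```java
class BoundedCommitRetry {
    /**
     * Runs commit; when it throws an exception of type retriable, runs
     * recover and tries again, up to maxAttempts commits in total.
     * Returns the number of commit attempts used. Any other exception,
     * or a retriable one on the last attempt, is rethrown.
     */
    static int commitWithRetry(Runnable commit, Runnable recover,
                               Class<? extends RuntimeException> retriable,
                               int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                commit.run();
                return attempt;
            } catch (RuntimeException e) {
                if (!retriable.isInstance(e) || attempt == maxAttempts) {
                    throw e;
                }
                recover.run();
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a commit that is rejected twice, then succeeds.
        int[] failuresLeft = {2};
        int attempts = commitWithRetry(
            () -> { if (failuresLeft[0]-- > 0) throw new IllegalStateException("rebalancing"); },
            () -> System.out.println("rejoin via poll()"),
            IllegalStateException.class,
            5);
        System.out.println("committed after " + attempts + " attempts");
    }
}
```

Rethrowing after the last attempt keeps a persistent rebalance problem visible instead of looping indefinitely.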

Dead Ends

Common approaches that don't work:

Increase session.timeout.ms to a very high value (80% fail)
The session timeout controls how long the coordinator waits for heartbeats before declaring a member dead; it does not control how long a rebalance takes. A high value delays failure detection but does not prevent requests from colliding with a rebalance.
Set the consumer to use static group membership (70% fail)
Static membership reduces rebalance frequency but does not eliminate it; a rebalance can still be triggered when a member's session times out, a new member joins, or the subscription or partition count changes.
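For reference, the two settings discussed in this section look like this in a consumer configuration. This is a minimal sketch using only the JDK; the class `StaticMemberConfig`, the group name, and the instance name are hypothetical, and the timeout value shown is illustrative rather than a recommendation.

```java
import java.util.Properties;

class StaticMemberConfig {
    /** Builds consumer properties that enable static group membership. */
    static Properties forInstance(String groupId, String instanceId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        // Static membership: must be unique per consumer instance and
        // stay the same across restarts of that instance.
        props.setProperty("group.instance.id", instanceId);
        // How long the coordinator waits for heartbeats before it
        // removes the member and starts a rebalance.
        props.setProperty("session.timeout.ms", "45000");
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical group and instance names.
        Properties props = forInstance("orders-group", "orders-consumer-1");
        System.out.println(props.getProperty("group.instance.id"));
    }
}
```

Neither setting removes the need to handle RebalanceInProgressException; they only change how often a rebalance starts.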