kafka
runtime_error
ai_generated: true
org.apache.kafka.common.errors.RebalanceInProgressException: The group is rebalancing, so a rebalance is already in progress.
ID: kafka/group-rebalance-timeout
Fix Rate: 75%
Confidence: 82%
Evidence: 1
First Seen: 2023-03-10
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| Kafka 2.8.0 | active | — | — | — |
| Kafka 3.0.0 | active | — | — | — |
| Kafka 3.4.0 | active | — | — | — |
| Kafka 3.6.0 | active | — | — | — |
Root Cause
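A consumer sent a group request (such as an offset commit or a join-group request) while the group coordinator was already rebalancing the consumer group, so the coordinator rejected the request. The consumer has to rejoin the group, typically by calling poll(), before the request can succeed.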
Official Documentation
https://kafka.apache.org/documentation/#consumerconfigs_max.poll.interval.ms
Workarounds
- 80% success: Increase max.poll.interval.ms (default 300000, i.e. 5 minutes) to allow more time for processing between polls, which reduces the chance that the consumer is removed from the group and a rebalance is triggered. Lowering max.poll.records below its default of 500 shortens each batch:

      Properties props = new Properties();
      props.put("max.poll.interval.ms", 600000); // 10 minutes
      props.put("max.poll.records", 100);        // fewer records per poll (default is 500)

  Also make sure the consumer processes records quickly or hands them off for asynchronous processing.
- 75% success: Handle RebalanceInProgressException in the consumer loop by catching it, calling poll() so the consumer can complete the rebalance, and then reconsidering the commit. Partition assignments and fetch positions may have changed after the rebalance, so process the records returned by poll() rather than discarding them:

      try {
          consumer.commitSync();
      } catch (RebalanceInProgressException e) {
          // poll() lets the consumer rejoin the group and complete the rebalance
          ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1)); // key/value types assumed
          process(records);      // hypothetical handler; do not drop these records
          consumer.commitSync(); // may still fail if another rebalance starts
      }
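The catch-and-retry workaround above is usually safer with a bound on the number of attempts. The following is a minimal sketch of that control flow only, written so it compiles without the Kafka client library: `RebalanceInProgress` is a hypothetical stand-in for `org.apache.kafka.common.errors.RebalanceInProgressException`, and `CommitRetry`, `commit`, and `completeRebalance` are illustrative names, not part of any Kafka API.

```java
/** Hypothetical stand-in for Kafka's RebalanceInProgressException, so the sketch has no dependencies. */
class RebalanceInProgress extends RuntimeException {}

class CommitRetry {
    /**
     * Runs {@code commit}; if it is rejected because a rebalance is in progress,
     * runs {@code completeRebalance} and tries again, up to {@code maxAttempts} times.
     *
     * In a real consumer, {@code commit} would wrap consumer.commitSync() and
     * {@code completeRebalance} would call consumer.poll(...) and process the
     * returned records (assumption: the caller supplies both).
     *
     * @return the attempt number that succeeded, or -1 if every attempt was rejected
     */
    static int commitWithRetry(Runnable commit, Runnable completeRebalance, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                commit.run();
                return attempt;
            } catch (RebalanceInProgress e) {
                // Give the consumer a chance to rejoin the group before the next attempt.
                completeRebalance.run();
            }
        }
        return -1;
    }
}
```

Returning -1 instead of throwing leaves the decision to the caller, for example skipping this commit and letting the next successful commit cover the same offsets.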
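For the max.poll.interval.ms workaround, one rough way to choose values is to budget for the slowest batch: the interval should exceed max.poll.records multiplied by the worst-case processing time per record, with some headroom. The sketch below shows that arithmetic; `PollBudget`, its method names, and the safety factor are assumptions made for illustration, and the per-record time has to come from your own measurements.

```java
class PollBudget {
    /**
     * Smallest max.poll.interval.ms (in milliseconds) that still lets a full batch
     * finish, given a worst-case per-record time and a safety factor such as 1.5.
     */
    static long minPollIntervalMs(int maxPollRecords, long worstCaseMsPerRecord, double safetyFactor) {
        return (long) Math.ceil(maxPollRecords * worstCaseMsPerRecord * safetyFactor);
    }

    /**
     * Largest max.poll.records that fits inside a given max.poll.interval.ms
     * under the same worst-case assumptions.
     */
    static int maxRecordsFor(long maxPollIntervalMs, long worstCaseMsPerRecord, double safetyFactor) {
        return (int) Math.floor(maxPollIntervalMs / (worstCaseMsPerRecord * safetyFactor));
    }
}
```

With 500 records per poll, a worst case of 800 ms per record, and a factor of 1.5, `minPollIntervalMs` gives 600000 ms, which is the 10-minute value used in the workaround. Keeping the default interval of 300000 ms under the same assumptions, `maxRecordsFor` gives 250 records per poll.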
Dead Ends
Common approaches that don't work:
- Increase session.timeout.ms to a very high value (80% fail): The session timeout controls how long the coordinator waits for heartbeats before declaring a consumer dead, not how long a rebalance takes. A high value delays failure detection but does not prevent rebalance conflicts.
- Set the consumer to use static group membership (70% fail): Static membership reduces rebalance frequency but does not eliminate it. A rebalance can still be triggered by coordinator changes or partition reassignments.