org.apache.kafka.common.errors.StaleMemberEpochException: The member epoch 5 has been fenced by the group coordinator
ID: kafka/consumer-group-stale-metadata
Root cause analysis
The consumer member's epoch (generation) is stale: either the group coordinator has rebalanced the group or the consumer's session has timed out, so the coordinator has fenced the member.
Official documentation
https://kafka.apache.org/documentation/#consumer_group_rebalancing
Solutions
- Implement a consumer rebalance listener to detect rebalances and reset state. For Java consumers: `consumer.subscribe(Collections.singletonList(topic), new ConsumerRebalanceListener() { ... })`
- Set 'heartbeat.interval.ms' to a value well below 'session.timeout.ms' (e.g., heartbeat at 1000 ms and session timeout at 10000 ms) so that heartbeats arrive in time and the session is less likely to expire.
- If using static group membership, ensure 'group.instance.id' is unique per consumer and stable across restarts. Example: `props.put(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG, "consumer-1");`
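The timeout and static-membership settings above can be sketched as plain consumer properties. This is a minimal illustration, not a definitive implementation: `ConsumerTimeoutConfig`, `isHeartbeatSafe`, and `build` are hypothetical names introduced here, and the "heartbeat at most one third of the session timeout" check follows the general guidance in the Kafka configuration documentation rather than anything stated in this entry. Only the standard library is used, and the config keys are written as the literal strings the Kafka consumer reads.

```java
import java.util.Properties;

/** Hypothetical helper (not part of Kafka): assembles consumer settings as plain strings. */
final class ConsumerTimeoutConfig {

    private ConsumerTimeoutConfig() {}

    /**
     * Returns true when the heartbeat interval is positive and no more than
     * one third of the session timeout (assumed rule of thumb).
     */
    static boolean isHeartbeatSafe(int heartbeatMs, int sessionTimeoutMs) {
        return heartbeatMs > 0 && (long) heartbeatMs * 3 <= sessionTimeoutMs;
    }

    /**
     * Builds the subset of consumer properties discussed above.
     * Throws IllegalArgumentException if the heartbeat interval is too close
     * to the session timeout.
     */
    static Properties build(String groupId, String instanceId,
                            int heartbeatMs, int sessionTimeoutMs) {
        if (!isHeartbeatSafe(heartbeatMs, sessionTimeoutMs)) {
            throw new IllegalArgumentException(
                "heartbeat.interval.ms (" + heartbeatMs + ") should be at most one third of "
                + "session.timeout.ms (" + sessionTimeoutMs + ")");
        }
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        // Static membership: must be unique per consumer and stable across restarts.
        props.setProperty("group.instance.id", instanceId);
        props.setProperty("heartbeat.interval.ms", Integer.toString(heartbeatMs));
        props.setProperty("session.timeout.ms", Integer.toString(sessionTimeoutMs));
        return props;
    }
}
```

For example, `ConsumerTimeoutConfig.build("orders-group", "consumer-1", 1000, 10000)` yields properties that could be merged with the bootstrap servers and deserializer settings before constructing the consumer; the group name `orders-group` is a placeholder.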
Ineffective attempts
Common but ineffective approaches:
- 50% failure: This can make the consumer unresponsive for longer periods, worsening rebalance issues and causing the group to stall; it does not prevent epoch fencing due to rebalances.
- 40% failure: Static membership only prevents unnecessary rebalances during brief disconnections; if the consumer actually fails or is fenced, the epoch will still be stale.
- 70% failure: This causes a full rebalance and may temporarily resolve the error, but the underlying cause (e.g., network issues, slow processing) remains, so the error will recur.