kafka runtime_error ai_generated true

org.apache.kafka.common.errors.ReplicaNotAvailableException: Replica for partition my_topic-0 is not available on broker 2

ID: kafka/replica-not-available-on-fetch

Fix rate: 85%
Confidence: 87%
Evidence count: 1
First seen: 2023-08-01

Version compatibility

Version            Status   Introduced   Deprecated   Notes
kafka_2.13-3.4.0   active
kafka_2.13-3.5.1   active
kafka_2.13-3.6.0   active

Root cause analysis

A follower replica is not fully caught up with the leader and cannot serve fetch requests, often due to replication lag or the replica being offline.
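One way to see which follower has fallen behind is to compare the `Replicas` and `Isr` columns printed by `kafka-topics.sh --describe`. The following is a minimal sketch, assuming the tab-separated line layout shown in `SAMPLE`; the sample text and the helper name `out_of_sync_replicas` are illustrative and not taken from a real cluster.

```python
import re

# Illustrative text in the shape printed by `kafka-topics.sh --describe`.
# These lines are an assumption for the example, not captured output.
SAMPLE = """\
Topic: my_topic\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,3
Topic: my_topic\tPartition: 1\tLeader: 2\tReplicas: 2,3,1\tIsr: 2,3,1
"""

# Matches per-partition lines only; the topic summary line has no "Partition:" field.
LINE_RE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>\S+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def out_of_sync_replicas(describe_output):
    """Map (topic, partition) to the broker ids listed in Replicas but absent from Isr."""
    result = {}
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        replicas = [int(b) for b in match.group("replicas").split(",") if b]
        isr = {int(b) for b in match.group("isr").split(",") if b}
        missing = [b for b in replicas if b not in isr]
        if missing:
            key = (match.group("topic"), int(match.group("partition")))
            result[key] = missing
    return result


if __name__ == "__main__":
    for (topic, partition), brokers in out_of_sync_replicas(SAMPLE).items():
        print(f"{topic}-{partition}: replicas not in ISR on brokers {brokers}")
```

In practice the input would come from running `kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic` and passing its output to the function. A broker that appears here is a likely candidate for the one named in the exception.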

generic

Official documentation

https://kafka.apache.org/documentation/#replication

Solutions

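  1. Check whether the replica is in sync using `kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic` and compare the `Replicas` and `Isr` columns (`kafka-consumer-groups.sh --describe` reports consumer lag, not replication lag). If the follower is lagging, consider increasing `num.replica.fetchers` and, for large messages, `replica.fetch.max.bytes` on the broker.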
  2. Restart the broker hosting the unavailable replica (broker 2) to force a re-sync with the leader. Example, assuming Kafka runs as a systemd unit named `kafka`: `systemctl restart kafka` on broker 2.
  3. If the replica is permanently stuck, reassign the partition to a different broker using `kafka-reassign-partitions.sh` with a custom reassignment JSON (passed via `--reassignment-json-file` together with `--execute`).
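
The reassignment JSON from step 3 can be written by hand or generated. Below is a minimal sketch that builds it, assuming the target replica set is brokers 1 and 3; those broker ids and the helper name `build_reassignment` are hypothetical and should be adapted to the actual cluster.

```python
import json


def build_reassignment(topic, partition, replicas):
    """Build the JSON text expected by kafka-reassign-partitions.sh."""
    if len(set(replicas)) != len(replicas):
        raise ValueError("replica list must not contain duplicate broker ids")
    plan = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
        ],
    }
    return json.dumps(plan, indent=2)


if __name__ == "__main__":
    # Hypothetical target: move my_topic-0 off broker 2 onto brokers 1 and 3.
    print(build_reassignment("my_topic", 0, [1, 3]))
```

Saving the output as, say, `reassign.json`, the reassignment could then be started with `kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file reassign.json --execute` and checked afterwards with the same command using `--verify` instead of `--execute`.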

Ineffective attempts

Common but ineffective approaches:

  1. 85% failure rate

    This reduces durability but does not make the replica available; the follower still lags and cannot serve fetches.

  2. 80% failure rate

    This controls fetch size, not lag; the replica is not available due to being out of sync, not due to fetch limits.

  3. 95% failure rate

    This removes the replica entirely, causing data loss and requiring re-replication, which may make it even less available.