# org.apache.kafka.common.errors.ReplicaNotAvailableException: Replica for partition my_topic-0 is not available on broker 2

- **ID:** `kafka/replica-not-available-on-fetch`
- **Domain:** kafka
- **Category:** runtime_error
- **Verification:** ai_generated
- **Fix Rate:** 85%

## Root Cause

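The broker that received the fetch request (broker 2 in this example) has no usable replica of the partition. Typically the follower has fallen out of sync with the leader because of replication lag, is offline, or is still being moved during a partition reassignment, in which case the error is transient.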
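
One way to confirm this is to compare each partition's assigned replicas with its in-sync replica (ISR) set. The sketch below is a hypothetical helper (`out_of_sync_replicas` is not a Kafka tool) that reads the partition lines printed by `kafka-topics.sh --describe` and lists replicas missing from the ISR. It assumes the usual `Partition: N ... Replicas: a,b,c  Isr: a,c` line layout, which may vary between Kafka versions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin and
# prints, per partition, the replica IDs that are assigned but not in the ISR.
# Assumes the "Partition: N ... Replicas: a,b,c  Isr: a,c" line layout.
out_of_sync_replicas() {
  awk '
    /Partition:/ && /Replicas:/ && /Isr:/ {
      part = ""; replicas = ""; isr = ""
      # Pick each value by its label so extra fields do not shift positions.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
        if ($i == "Isr:")       isr = $(i + 1)
      }
      n = split(replicas, r, ",")
      m = split(isr, s, ",")
      split("", in_isr)                      # portable way to clear the array
      for (j = 1; j <= m; j++) in_isr[s[j]] = 1
      missing = ""
      for (j = 1; j <= n; j++)
        if (!(r[j] in in_isr))
          missing = missing (missing == "" ? "" : ",") r[j]
      if (missing != "")
        print "partition " part ": replicas not in ISR: " missing
    }
  '
}
```

Typical use would be `kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic | out_of_sync_replicas`. If broker 2 is reported for partition 0, the replica named in the exception is out of sync.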

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| kafka_2.13-3.4.0 | active | — | — |
| kafka_2.13-3.5.1 | active | — | — |
| kafka_2.13-3.6.0 | active | — | — |

## Workarounds

1. **Check replica status with `kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic` and verify that broker 2 appears in the partition's ISR list. If the follower is lagging, increase `replica.fetch.max.bytes` and `num.replica.fetchers` on the broker.** (85% success)
   ```
   kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my_topic
   kafka-topics.sh --bootstrap-server localhost:9092 --describe --under-replicated-partitions
   ```
2. **Restart the broker hosting the unavailable replica (broker 2) to force it to re-sync with the leader. Example: `systemctl restart kafka` on broker 2.** (80% success)
   ```
   # Run on broker 2; the service unit name may differ per installation.
   systemctl restart kafka
   ```
3. **If the replica is permanently stuck, reassign the partition to a different broker using `kafka-reassign-partitions.sh` with a custom reassignment JSON.** (90% success)
   ```
   # reassign.json is a placeholder name for your reassignment plan file.
   kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file reassign.json --execute
   kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file reassign.json --verify
   ```
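
The custom reassignment JSON for the third workaround can be generated rather than written by hand. The following is a minimal sketch, assuming a single partition and a caller-supplied list of target brokers. The function name `make_reassignment_json` and the broker IDs used in the usage note are illustrative, not part of Kafka.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a reassignment plan for one partition in the
# JSON layout that kafka-reassign-partitions.sh reads through
# --reassignment-json-file.
# Arguments: topic name, partition number, comma-separated target broker IDs.
make_reassignment_json() {
  local topic="$1" partition="$2" replicas="$3"
  printf '{"version":1,"partitions":[{"topic":"%s","partition":%d,"replicas":[%s]}]}\n' \
    "$topic" "$partition" "$replicas"
}
```

For example, `make_reassignment_json my_topic 0 1,3 > reassign.json` writes a plan that places `my_topic-0` on brokers 1 and 3 (assumed here to be healthy brokers) and drops broker 2 from the replica set. Pass the resulting file to `kafka-reassign-partitions.sh` with `--execute`, then re-run with `--verify` to confirm completion.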

## Dead Ends

- **** — This reduces durability but does not make the replica available; the follower still lags and cannot serve fetches. (85% fail)
- **** — This controls fetch size, not lag; the replica is not available due to being out of sync, not due to fetch limits. (80% fail)
- **** — This removes the replica entirely, causing data loss and requiring re-replication, which may make it even less available. (95% fail)
