kafka runtime_error ai_generated true

org.apache.kafka.common.errors.TransactionCoordinatorFencedException: The transactional coordinator with epoch 5 has been fenced

ID: kafka/transaction-coordinator-fenced

Fix Rate: 82%
Confidence: 88%
Evidence: 1
First Seen: 2024-01-10

Version Compatibility

Version        Status    Introduced    Deprecated    Notes
kafka 3.5.0    active
kafka 3.6.2    active
kafka 3.7.0    active

Root Cause

A new transaction coordinator has taken over after a leader election or broker failure. The old coordinator, which still carries the lower epoch, is fenced, and requests it sends are rejected.
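
The fencing rule described above can be illustrated with a small model. This is a simplified sketch and not Kafka's actual implementation: the class name EpochFencingSketch, the accept method, and the string key are hypothetical. It only shows the general idea that a receiver remembers the highest coordinator epoch it has seen and rejects anything older.

```java
import java.util.HashMap;
import java.util.Map;

public class EpochFencingSketch {

    /** Highest coordinator epoch seen so far, per key (simplified, hypothetical model). */
    private final Map<String, Integer> highestEpoch = new HashMap<>();

    /**
     * Accepts a request only if its coordinator epoch is at least the highest
     * epoch already seen for this key. A lower epoch means the sender is a
     * stale coordinator that has been fenced.
     */
    boolean accept(String key, int coordinatorEpoch) {
        int current = highestEpoch.getOrDefault(key, -1);
        if (coordinatorEpoch < current) {
            return false; // stale coordinator: reject the request
        }
        highestEpoch.put(key, coordinatorEpoch);
        return true;
    }

    public static void main(String[] args) {
        EpochFencingSketch receiver = new EpochFencingSketch();
        System.out.println(receiver.accept("orders-tx", 5)); // old coordinator, still current
        System.out.println(receiver.accept("orders-tx", 6)); // new coordinator after failover
        System.out.println(receiver.accept("orders-tx", 5)); // old coordinator is now fenced
    }
}
```

In this model the third call is the situation the exception reports: the sender with epoch 5 keeps working after a coordinator with epoch 6 has already been seen.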

generic

Official Documentation

https://kafka.apache.org/documentation/#transaction_config

Workarounds

  1. 90% success Restart the transactional producer to re-initialize the coordinator connection: close the existing producer, create a new instance, and call 'producer.initTransactions();' on the new instance, or restart the application.
  2. 75% success Ensure all brokers have consistent transaction.state.log.replication.factor and min.insync.replicas settings (the transaction state log has its own transaction.state.log.min.isr setting), then restart the broker with the highest epoch: 'bin/kafka-server-stop.sh && bin/kafka-server-start.sh config/server.properties'
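
Workaround 1 can be sketched as a retry loop that discards the failed producer and builds a fresh one. To keep the example runnable without a Kafka dependency, it uses a hypothetical TxProducer interface whose method names mirror the transactional methods on KafkaProducer (initTransactions, beginTransaction, commitTransaction, close). FatalTxException and runWithRecovery are also hypothetical names. In real code the factory would construct a KafkaProducer, and which exceptions count as fatal is an assumption to check against the client version in use.

```java
import java.util.function.Supplier;

public class FencedRecoverySketch {

    /** Hypothetical stand-in for the transactional subset of KafkaProducer. */
    interface TxProducer extends AutoCloseable {
        void initTransactions();
        void beginTransaction();
        void commitTransaction();
        @Override void close();
    }

    /** Hypothetical marker for an error after which the producer cannot be reused. */
    static class FatalTxException extends RuntimeException {
        FatalTxException(String message) { super(message); }
    }

    /**
     * Runs one transaction, replacing the producer after a fatal error.
     * Returns the number of producer instances that were created.
     */
    static int runWithRecovery(Supplier<TxProducer> factory, Runnable sendRecords, int maxProducers) {
        FatalTxException last = null;
        for (int created = 1; created <= maxProducers; created++) {
            TxProducer producer = factory.get();
            try {
                producer.initTransactions();   // called once per producer instance
                producer.beginTransaction();
                sendRecords.run();
                producer.commitTransaction();
                return created;
            } catch (FatalTxException e) {
                last = e;                      // discard this instance and try a fresh one
            } finally {
                producer.close();              // always release the current instance
            }
        }
        throw new IllegalStateException("gave up after " + maxProducers + " producers", last);
    }

    public static void main(String[] args) {
        int[] built = {0};
        // Fake factory: the first producer fails on commit, later ones succeed.
        Supplier<TxProducer> factory = () -> {
            final int id = ++built[0];
            return new TxProducer() {
                public void initTransactions() {}
                public void beginTransaction() {}
                public void commitTransaction() {
                    if (id == 1) throw new FatalTxException("coordinator fenced");
                }
                public void close() {}
            };
        };
        System.out.println(runWithRecovery(factory, () -> {}, 3)); // prints 2
    }
}
```

The sketch bounds the number of replacement producers so that a persistent broker-side problem surfaces as a failure instead of an endless restart loop. It does not model abortTransaction for recoverable errors.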

Dead Ends

Common approaches that don't work:

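  1. 85% fail Manually reassigning partitions without first checking coordinator health causes repeated fencing.
  2. 70% fail Disabling the idempotent producer to avoid fencing breaks exactly-once semantics and may cause duplicates. A transactional producer also requires enable.idempotence=true, so this setting cannot be combined with a transactional.id.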