data serialization_error ai_generated true

Avro deserialization fails when a union field lists null in a different position than the reader expects

ID: data/avro-union-null-ordering

Fix Rate: 80%
Confidence: 83%
Evidence: 1
First Seen: 2023-11-05

Version Compatibility

Version                          Status   Introduced   Deprecated   Notes
Apache Avro 1.11.0               active
Confluent Schema Registry 7.4.0  active
Kafka 3.5.0                      active

Root Cause

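Avro's binary encoding writes a union value as the zero-based index of the selected branch, followed by the value itself, so the order of the branches in the schema is part of the wire format. Some Avro libraries and tools expect null to be the first element of a union (e.g., ["null", "string"]), while others expect it last (["string", "null"]). When the writer's schema and the schema used for decoding disagree on this order, a reader that does not resolve against the writer's schema selects the wrong branch, which surfaces as a deserialization failure or a schema compatibility error. The 1.11.0 specification also requires the default value of a union field to match the first branch, which is why null is conventionally listed first.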
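
The effect of a branch-order mismatch can be sketched without the Avro library. The class below is a hand-rolled illustration, not Avro's implementation: `UnionOrderingDemo`, `encode`, and `decode` are hypothetical names, and a union is modelled as a plain list of type names limited to "null" and "string". It assumes the wire format described in the specification, a zig-zag varint branch index followed by the value.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hand-rolled illustration of Avro's union wire format; NOT the Avro library. */
final class UnionOrderingDemo {

    /** Writes n as a zig-zag varint, the encoding Avro uses for int and long. */
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    /** Reads a zig-zag varint; throws BufferUnderflowException on truncated input. */
    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    /** Encodes value (null or a String) against a union given as branch type names. */
    static byte[] encode(List<String> union, String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String branch = (value == null) ? "null" : "string";
        writeLong(out, union.indexOf(branch));        // branch index comes first
        if (value != null) {
            byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
            writeLong(out, utf8.length);              // string length in bytes
            out.write(utf8, 0, utf8.length);          // then the UTF-8 bytes
        }
        return out.toByteArray();
    }

    /** Decodes using only the given branch order, i.e. without schema resolution. */
    static String decode(List<String> union, byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        String branch = union.get((int) readLong(in));
        if (branch.equals("null")) {
            return null;                              // any remaining bytes are left unread
        }
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> nullFirst = Arrays.asList("null", "string");
        List<String> nullLast = Arrays.asList("string", "null");
        byte[] wire = encode(nullFirst, "a");         // bytes: 02 02 61
        System.out.println(decode(nullFirst, wire));  // a
        System.out.println(decode(nullLast, wire));   // null (wrong branch chosen)
    }
}
```

In this sketch the mismatch fails in two different ways: a string written with null first is silently read back as null, while a null written with null first makes the mismatched reader try to read a string length from bytes that are not there.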

Official Documentation

https://avro.apache.org/docs/1.11.0/spec.html#Unions

Workarounds

  1. (95% success) Use one union ordering convention in every Avro schema, with null first: {"type": ["null", "string"]}
  2. (70% success) Decode with both the writer's schema and the reader's schema, so that Avro's schema resolution matches union branches by type rather than by position: GenericDatumReader<GenericRecord> reader = new GenericDatumReader<>(writerSchema, readerSchema); No subclass or read() override is needed; an override that only calls super.read(old, in) changes nothing.
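
Why supplying the writer's schema helps can be sketched as follows. This is a conceptual, stdlib-only illustration under stated assumptions, not Avro's resolution code: `UnionResolutionSketch` and `decodeResolved` are hypothetical names, unions are modelled as lists of type names limited to "null" and "string", and the input bytes are assumed to have been written with ["null", "string"].

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Conceptual sketch of resolving a union branch by type; NOT the Avro library. */
final class UnionResolutionSketch {

    /** Reads a zig-zag varint, the encoding Avro uses for int and long. */
    static long readLong(ByteBuffer in) {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = in.get() & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);
    }

    /** Interprets the branch index in the writer's order, then checks the reader's union for that type. */
    static String decodeResolved(List<String> writerUnion, List<String> readerUnion, byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);
        // The index on the wire is relative to the WRITER's branch order.
        String writtenType = writerUnion.get((int) readLong(in));
        if (!readerUnion.contains(writtenType)) {
            throw new IllegalStateException("reader union has no branch for " + writtenType);
        }
        if (writtenType.equals("null")) {
            return null;
        }
        byte[] utf8 = new byte[(int) readLong(in)];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        List<String> writer = Arrays.asList("null", "string");
        List<String> reader = Arrays.asList("string", "null");
        byte[] encodedA = {0x02, 0x02, 0x61};   // "a" written with ["null", "string"]
        byte[] encodedNull = {0x00};            // null written with ["null", "string"]
        System.out.println(decodeResolved(writer, reader, encodedA));    // a
        System.out.println(decodeResolved(writer, reader, encodedNull)); // null
    }
}
```

Because the index is looked up in the writer's branch list first, the reader's own ordering no longer affects which branch is chosen. The practical consequence is that the writer's schema has to be available at read time, for example from a schema registry or a container file header.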

Dead Ends

Common approaches that don't work:

  1. Setting compatibility to NONE in Schema Registry (80% fail)

    The compatibility level only controls which schema versions the registry accepts at registration time. It does not change how already-written bytes are decoded, so the union ordering mismatch remains.

  2. Modifying the data to include null values in a different order (75% fail)

    The position of null in the union is defined by the schema, not by the data payload.