org.apache.avro.AvroTypeException: Field email type:STRING pos:12 not set and has no default value
data · schema_error · ai_generated: true

Avro deserialization fails: reader field 'email' has no default and is missing from the writer schema

ID: data/avro-schema-field-missing-default

Fix Rate: 85%
Confidence: 90%
Evidence: 1
First Seen: 2023-05-22

Version Compatibility

Version                          Status   Introduced   Deprecated   Notes
Apache Avro 1.11.0               active   -            -            -
Confluent Schema Registry 7.5.0  active   -            -            -

Root Cause

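The Avro reader schema declares a field with no default value, and that field is absent from the writer schema. During schema resolution Avro has neither a written value nor a default to supply, so the read fails. This breaks backward compatibility when the field was newly added to the reader schema, and forward compatibility when it was removed from the writer schema.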
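
The resolution rule for record fields can be illustrated with a simplified model in plain Java. This is a sketch of the rule only, not Avro's implementation: the class ResolutionRuleSketch, the resolve method, and the use of Optional to stand for "has a default" are all hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Simplified model of how record fields are resolved (NOT Avro's real code).
class ResolutionRuleSketch {

    /**
     * writtenValues: field name -> value actually present in the written data.
     * readerFields:  field name -> default value, or Optional.empty() if the
     *                reader schema declares no default for that field.
     */
    static Map<String, Object> resolve(Map<String, Object> writtenValues,
                                       Map<String, Optional<Object>> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Optional<Object>> field : readerFields.entrySet()) {
            String name = field.getKey();
            if (writtenValues.containsKey(name)) {
                // The writer supplied the field: use the written value.
                result.put(name, writtenValues.get(name));
            } else if (field.getValue().isPresent()) {
                // The writer lacks the field, but the reader has a default.
                result.put(name, field.getValue().get());
            } else {
                // No written value and no default: resolution cannot succeed.
                throw new IllegalStateException("missing required field " + name);
            }
        }
        // Written fields the reader does not declare are simply ignored.
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> written = Map.of("id", 42L);

        Map<String, Optional<Object>> withDefault = new LinkedHashMap<>();
        withDefault.put("id", Optional.empty());
        withDefault.put("email", Optional.of(""));
        System.out.println(resolve(written, withDefault));

        Map<String, Optional<Object>> noDefault = new LinkedHashMap<>();
        noDefault.put("id", Optional.empty());
        noDefault.put("email", Optional.empty());
        try {
            resolve(written, noDefault);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the only failing case is a reader field that has neither a written value nor a default, which matches the situation described for 'email'.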

generic

Official Documentation

https://avro.apache.org/docs/current/spec.html#Schema+Resolution

Workarounds

  1. 95% success Add a default value to the field in the reader schema, e.g. {"name": "email", "type": "string", "default": ""}. Avro then fills in the default whenever the writer schema lacks the field.
  2. 80% success Read with a projection: pass the writer schema together with a reader schema that omits 'email' (or gives it a default). Schema resolution skips writer fields the reader does not declare and no longer requires the missing one: SpecificDatumReader<MyClass> reader = new SpecificDatumReader<>(writerSchema, readerSchema);
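
The first workaround can be sketched end to end with Avro's generic API. This is a minimal sketch under stated assumptions: it assumes the Avro Java library (for example org.apache.avro:avro 1.11.x) is on the classpath, the "User" record with an "id" field is a hypothetical schema, and GenericDatumReader stands in for SpecificDatumReader<MyClass> because no generated class is available here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

class ReaderDefaultSketch {
    // Hypothetical writer schema: the producer never wrote "email".
    static final Schema WRITER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"}]}");

    // Reader schema with the workaround applied: "email" carries a default.
    static final Schema READER = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"long\"},"
        + "{\"name\":\"email\",\"type\":\"string\",\"default\":\"\"}]}");

    // Serialize one record the way an old producer would have.
    static byte[] encodeWithWriterSchema(long id) {
        try {
            GenericRecord rec = new GenericData.Record(WRITER);
            rec.put("id", id);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            BinaryEncoder enc = EncoderFactory.get().binaryEncoder(out, null);
            new GenericDatumWriter<GenericRecord>(WRITER).write(rec, enc);
            enc.flush();
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserialize with both schemas so Avro applies schema resolution.
    static GenericRecord decode(byte[] bytes) {
        try {
            GenericDatumReader<GenericRecord> reader =
                new GenericDatumReader<>(WRITER, READER);
            return reader.read(null, DecoderFactory.get().binaryDecoder(bytes, null));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        GenericRecord user = decode(encodeWithWriterSchema(42L));
        System.out.println(user.get("id") + " / '" + user.get("email") + "'");
    }
}
```

Removing the "default" entry from READER should reproduce the failure at read time, since Avro would then have no value to supply for "email".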

Dead Ends

Common approaches that don't work:

  1. Adding the missing field to the writer schema 70% fail

    This requires coordination with all data producers and may not be feasible for historical data.

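  2. Making the field optional with a union ["null", "string"] in the reader schema 60% fail

    This changes the field type and may break downstream consumers expecting a non-nullable string. A union by itself also supplies no default, so the error persists unless "default": null is added as well.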