org.apache.avro.AvroTypeException: Field email type:STRING pos:12 not set and has no default value

Tags: data, schema_error, ai_generated: true

Avro deserialization fails: field 'email' has no default value and is missing from the writer schema

ID: data/avro-schema-field-missing-default

Fix rate: 85%
Confidence: 90%
Evidence count: 1
First seen: 2023-05-22

Version compatibility

Version                           Status   Introduced   Deprecated   Notes
Apache Avro 1.11.0                active   -            -            -
Confluent Schema Registry 7.5.0   active   -            -            -

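Root cause analysis

The Avro reader schema declares a field with no default value that is not present in the writer schema. During schema resolution the reader has nothing to fill that field with, so decoding fails. In Schema Registry terms this breaks backward compatibility: a consumer on the newer reader schema cannot decode data written with the older writer schema.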
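
The rule can be illustrated with a small stand-alone model. This is a simplified sketch and not Avro's implementation: `ResolutionSketch`, `ReaderField`, and `resolve` are hypothetical names introduced here, records are plain maps, and fields are matched by name only (no aliases or type promotion).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the resolution rule (hypothetical, not part of Avro).
class ResolutionSketch {

    /** A reader-schema field: its name and, optionally, a default value. */
    static final class ReaderField {
        final String name;
        final boolean hasDefault;
        final Object defaultValue;

        private ReaderField(String name, boolean hasDefault, Object defaultValue) {
            this.name = name;
            this.hasDefault = hasDefault;
            this.defaultValue = defaultValue;
        }

        static ReaderField required(String name) {
            return new ReaderField(name, false, null);
        }

        static ReaderField withDefault(String name, Object value) {
            return new ReaderField(name, true, value);
        }
    }

    /**
     * Builds the record as the reader sees it. Writer fields the reader does not
     * declare are dropped; reader fields absent from the written data take their
     * default, or cause a failure when no default is declared.
     */
    static Map<String, Object> resolve(Map<String, Object> written, List<ReaderField> readerFields) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (ReaderField field : readerFields) {
            if (written.containsKey(field.name)) {
                result.put(field.name, written.get(field.name));
            } else if (field.hasDefault) {
                result.put(field.name, field.defaultValue);
            } else {
                throw new IllegalStateException(
                        "Field " + field.name + " not set and has no default value");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Data written before the 'email' field existed.
        Map<String, Object> written = Map.of("id", 42L);

        // Reader field with a default: resolution succeeds and fills in "".
        List<ReaderField> withDefault = List.of(
                ReaderField.required("id"), ReaderField.withDefault("email", ""));
        System.out.println(resolve(written, withDefault));

        // Reader field without a default: resolution fails.
        List<ReaderField> noDefault = List.of(
                ReaderField.required("id"), ReaderField.required("email"));
        try {
            resolve(written, noDefault);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

In this model the only difference between the failing and the working reader is whether 'email' carries a default, which is the same distinction Avro's schema resolution makes.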

generic

Official documentation

https://avro.apache.org/docs/current/spec.html#Schema+Resolution

Solutions

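  1. Add a default value to the field in the reader schema, e.g. {"name": "email", "type": "string", "default": ""}
  2. Pass the writer schema to the datum reader together with the reader schema, so that Avro resolves one against the other during deserialization: SpecificDatumReader<MyClass> reader = new SpecificDatumReader<>(writerSchema, readerSchema); Resolution skips writer fields that the reader does not declare, but a reader field that is absent from the writer schema still needs a default, so combine this with solution 1.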

Ineffective attempts

Common approaches that do not work:

  1. Adding the missing field to the writer schema (70% failure rate)

    This requires coordination with all data producers and may not be feasible for historical data.

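  2. Making the field 'optional' with a union ["null", "string"] in the reader schema (60% failure rate)

    A union on its own does not supply a default, so the error persists unless "default": null is also declared. It also changes the field type and may break downstream consumers that expect a non-nullable string.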