elasticsearch data_error ai_generated true

MapperParsingException: failed to parse field [created_at] of type [date] in document with id 'doc123'. Preview of field's value: '2024-13-01'

ID: elasticsearch/mapping-conflict-on-date-format

Fix Rate: 90%
Confidence: 87%
Evidence: 1
First Seen: 2024-09-20

Version Compatibility

Version              Status   Introduced   Deprecated   Notes
elasticsearch 7.17   active   -            -            -
elasticsearch 8.11   active   -            -            -
elasticsearch 8.12   active   -            -            -

Root Cause

The field value does not match the expected date format defined in the mapping, such as an invalid month or a mismatched timestamp pattern.
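
To illustrate why a value such as '2024-13-01' is rejected, the following minimal Python sketch mimics the check with the standard library's datetime.strptime. It is only an analogy: Elasticsearch uses its own date parser, and the helper name matches_any_format and the strptime patterns (rough stand-ins for yyyy-MM-dd and yyyy/MM/dd) are assumptions made for illustration.

```python
from datetime import datetime

# strptime approximations of the Elasticsearch patterns yyyy-MM-dd and yyyy/MM/dd.
# Elasticsearch's own parser is not used here; this only imitates the idea.
PATTERNS = ["%Y-%m-%d", "%Y/%m/%d"]


def matches_any_format(value: str, patterns=PATTERNS) -> bool:
    """Return True if value parses as a real calendar date under any pattern."""
    for pattern in patterns:
        try:
            datetime.strptime(value, pattern)
            return True
        except ValueError:
            # Wrong pattern or impossible date (e.g. month 13): try the next one.
            continue
    return False


if __name__ == "__main__":
    for sample in ["2024-12-01", "2024/12/01", "2024-13-01"]:
        print(sample, matches_any_format(sample))
```

A check like this could run on the client before indexing, so that malformed values are caught before they reach the cluster.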

generic

Official Documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/date.html

Workarounds

  1. 90% success: Define the date field so that it accepts multiple date formats. Elasticsearch generally rejects a change to the format of an existing date field, so this request applies when created_at is not yet mapped; otherwise, create a new index with this mapping and reindex. Extra formats only help with pattern mismatches: a value with an invalid month, such as '2024-13-01', is still rejected.

    PUT /my_index/_mapping
    {
      "properties": {
        "created_at": {
          "type": "date",
          "format": "yyyy-MM-dd||yyyy/MM/dd||epoch_millis"
        }
      }
    }

  2. 85% success: Use an ingest pipeline to preprocess the date field before indexing. The date processor still fails on values that are not valid dates unless failure handling is configured for it.

    PUT _ingest/pipeline/date_fix
    {
      "description": "Fix date format",
      "processors": [
        {
          "date": {
            "field": "created_at",
            "target_field": "created_at",
            "formats": [ "yyyy-MM-dd" ],
            "timezone": "UTC",
            "locale": "en"
          }
        }
      ]
    }

  3. 80% success: Reindex the data into a new index with a script that transforms the date. The script below corrects only the literal value '2024-13-01'; adapt the condition for other malformed values.

    POST _reindex
    {
      "source": { "index": "my_index" },
      "dest": { "index": "my_index_new" },
      "script": {
        "source": "if (ctx._source.created_at == '2024-13-01') { ctx._source.created_at = '2024-01-01' }"
      }
    }
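
Workarounds 1 and 3 can be combined by creating a new index with the multi-format mapping and reindexing into it. The Python sketch below only builds the two request bodies and prints them; it does not talk to a cluster. The function name build_requests is hypothetical, the index names my_index and my_index_new are the placeholders used above, and the script is the one from workaround 3, so it still handles just one known bad value.

```python
import json

# Format string taken from workaround 1.
DATE_FORMATS = "yyyy-MM-dd||yyyy/MM/dd||epoch_millis"


def build_requests(source_index: str, dest_index: str, field: str = "created_at"):
    """Return (method, path, body) tuples: create the new index, then reindex."""
    create_index = {
        "mappings": {
            "properties": {field: {"type": "date", "format": DATE_FORMATS}}
        }
    }
    reindex = {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # Script from workaround 3: patches one known bad value only.
        "script": {
            "source": (
                f"if (ctx._source.{field} == '2024-13-01') "
                f"{{ ctx._source.{field} = '2024-01-01' }}"
            )
        },
    }
    return [
        ("PUT", f"/{dest_index}", create_index),
        ("POST", "/_reindex", reindex),
    ]


if __name__ == "__main__":
    for method, path, body in build_requests("my_index", "my_index_new"):
        print(method, path)
        print(json.dumps(body, indent=2))
```

The printed output can be pasted into a console or sent with any HTTP client; creating the destination index first ensures the reindexed documents are parsed against the new mapping rather than a dynamically generated one.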

Dead Ends

Common approaches that don't work:

  1. 70% fail: Changing the field type from date to text.

    Converting the date to text loses date-related query capabilities (e.g., range queries, date histograms) and may break existing queries.

  2. 60% fail: Ignoring malformed values instead of fixing them.

    This only masks the issue; malformed dates end up as missing (null) values, leading to data loss and unexpected query results.

  3. 65% fail: Deleting and re-creating the index with the same mapping.

    If the incoming data uses a different format (e.g., 'yyyy/MM/dd'), indexing will still fail after re-creation.