data · validation_error · ai_generated: true

JSON Schema validation fails: format 'email' rejects valid emails with plus signs or international characters

ID: data/json-schema-format-email-validation-failure

Fix Rate: 85%
Confidence: 90%
Evidence: 1
First Seen: 2024-01-05

Version Compatibility

Version                    Status  Introduced  Deprecated  Notes
JSON Schema Draft-07       active
Ajv 8.12.0                 active
Python jsonschema 4.20.0   active
OpenAPI 3.1.0              active

Root Cause

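Many validators implement JSON Schema's built-in 'email' format with a simplified regex rather than the full address grammar. Depending on the implementation, that regex may not cover modern email features (e.g., plus addressing, internationalized domain names), so valid addresses fail validation.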
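To illustrate the failure mode, the sketch below uses a hypothetical strict pattern (`STRICT_EMAIL`, written for this example and not copied from any validator) whose local part omits `+` and whose character classes are ASCII-only. Real implementations differ, so this shows how such a pattern would behave, not what a particular library does.

```python
import re

# Hypothetical strict pattern, for illustration only (not taken from any validator).
# The local part omits '+', and every character class is ASCII-only.
STRICT_EMAIL = re.compile(r"^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def rejected_by_strict_pattern(addresses):
    """Return the addresses that the strict pattern fails to match."""
    return [a for a in addresses if STRICT_EMAIL.fullmatch(a) is None]


samples = [
    "alice@example.com",       # plain ASCII address
    "alice+news@example.com",  # plus addressing
    "用户@例子.中国",            # internationalized local part and domain
]
print(rejected_by_strict_pattern(samples))
# → ['alice+news@example.com', '用户@例子.中国']
```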

generic

Official Documentation

https://json-schema.org/understanding-json-schema/reference/string.html#format

Workarounds

  1. 90% success Use a custom format validator that accepts the addresses you need. In Python with jsonschema, register a check for 'email' on a `jsonschema.FormatChecker` (via its `checks` decorator) and pass it to the validator as `format_checker`. A simple pattern is `r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'`; it accepts plus addressing but is ASCII-only and does not implement the full RFC 5321/5322 grammar.
  2. 85% success In Ajv (JavaScript), use `ajv.addFormat('email', /^[^\s@]+@[^\s@]+\.[^\s@]+$/)` to override the default email format with a more permissive regex.
  3. 90% success Switch to a library that supports RFC-compliant email validation, such as 'email-validator' in Python, and call it from a custom format check (or custom keyword) in the schema validator.
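The Python workaround can be sketched roughly as follows, assuming the `FormatChecker` hook in jsonschema 4.x. The names `PERMISSIVE_EMAIL`, `is_permissive_email`, and `build_validator` are hypothetical, and the pattern is the permissive one quoted for Ajv, which is a pragmatic choice and not an RFC-complete check.

```python
import re

# Permissive pattern (assumption, not RFC-complete): exactly one '@',
# no whitespace, and at least one dot in the domain part.
PERMISSIVE_EMAIL = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")


def is_permissive_email(value):
    """Format check callable. Non-strings pass, since 'format' only constrains strings."""
    if not isinstance(value, str):
        return True
    return PERMISSIVE_EMAIL.fullmatch(value) is not None


def build_validator(schema):
    """Return a Draft-07 validator whose 'email' format uses the check above."""
    # Imported lazily so the regex helper stays usable without jsonschema installed.
    from jsonschema import Draft7Validator, FormatChecker

    checker = FormatChecker()  # keeps the other built-in format checks
    checker.checks("email")(is_permissive_email)  # replaces the 'email' check
    return Draft7Validator(schema, format_checker=checker)
```

With a schema such as `{"type": "string", "format": "email"}`, calling `build_validator(schema).is_valid("alice+news@example.com")` should accept the plus address. For the third workaround, the same `checks` hook could wrap a dedicated library instead of a regex, for example `validate_email` from the 'email-validator' package, treating its `EmailNotValidError` as a failed check.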

Dead Ends

Common approaches that don't work:

  1. 70% fail Writing a single custom regex meant to match every valid email address. Such a regex is extremely complex and often introduces new bugs (e.g., rejecting valid addresses or allowing invalid ones).
  2. 80% fail Removing or disabling the 'email' format check. This disables all email validation, allowing any string to pass, which defeats the purpose of schema validation and may lead to downstream errors.
  3. 60% fail Switching to the 'idn-email' format. Not all validators support 'idn-email', and it may still reject plus signs or other common email features.
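The regex trade-off in the first dead end can be made concrete with the two patterns quoted in the workarounds. In this small sketch (the names `ASCII_PATTERN`, `PERMISSIVE_PATTERN`, and `verdicts` are illustrative), the ASCII pattern rejects an internationalized address, while both patterns accept a domain with consecutive dots.

```python
import re

# The two patterns quoted in the workarounds.
ASCII_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
PERMISSIVE_PATTERN = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")


def verdicts(address):
    """Return (ascii_pattern_accepts, permissive_pattern_accepts) for one address."""
    return (
        ASCII_PATTERN.match(address) is not None,
        PERMISSIVE_PATTERN.match(address) is not None,
    )


for address in [
    "alice+news@example.com",  # valid plus address
    "用户@例子.中国",            # internationalized address
    "alice@example..com",      # consecutive dots in the domain: malformed
]:
    print(address, verdicts(address))
# alice+news@example.com (True, True)
# 用户@例子.中国 (False, True)
# alice@example..com (True, True)
```

Neither pattern is wrong for every use case, but each errs in a different direction, which is why a dedicated validation library tends to be the safer long-term choice.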