java encoding_error ai_generated true

error: unmappable character (0x80) for encoding UTF-8

ID: java/unmappable-character-encoding

Fix Rate: 90%
Confidence: 85%
Evidence: 1
First Seen: 2024-01-10

Version Compatibility

Version   Status   Introduced   Deprecated   Notes
Java 8    active
Java 11   active
Java 17   active
Java 21   active

Root Cause

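javac is decoding the source file as UTF-8 (set with -encoding or taken from the default), but the file contains a byte sequence that is not valid UTF-8. This usually means the file was saved in a single-byte encoding such as Windows-1252 or ISO-8859-1 and has a non-ASCII character in a string literal or comment. For example, the byte 0x80 in the error message is the euro sign in Windows-1252, but on its own it is not a valid UTF-8 sequence.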
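The mismatch can be reproduced outside the compiler. The following is a minimal sketch, not javac's actual decoding code: the class and method names (UnmappableByteDemo, isValidUtf8, decodeAs) are made up for illustration, and it assumes the windows-1252 charset is available in the running JDK.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
public class UnmappableByteDemo {

    // Strict check: report invalid input instead of silently substituting U+FFFD.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Lenient decode with a named charset (assumed to be installed).
    static String decodeAs(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0x80 };

        // The lone byte 0x80 is rejected by a strict UTF-8 decoder.
        System.out.println(isValidUtf8(raw)); // false

        // The same byte read as Windows-1252 is the euro sign, U+20AC.
        String euro = decodeAs(raw, "windows-1252");
        System.out.println(Integer.toHexString(euro.codePointAt(0))); // 20ac

        // The euro sign encoded as real UTF-8 (three bytes) passes the check.
        System.out.println(isValidUtf8("\u20AC".getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

Running the same check on the bytes of a failing source file is a quick way to confirm that the file, not the compiler setup, is what disagrees with UTF-8.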

generic

Official Documentation

https://docs.oracle.com/en/java/javase/17/docs/specs/man/javac.html#options

Workarounds

  1. 90% success: Pass the file's actual encoding to javac with the -encoding flag. If the file is really Windows-1252, use -encoding Cp1252.
  2. 85% success: Convert the source file to UTF-8 with a tool such as iconv or a text editor that supports encoding conversion.
  3. 75% success: Use native2ascii to replace the unmappable character with a Unicode escape sequence (\uXXXX). native2ascii was removed in JDK 9, so on Java 11, 17, and 21 this requires a JDK 8 installation or another escaping tool.
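Where iconv or native2ascii is not at hand, workarounds 2 and 3 can be approximated in plain Java. This is a rough sketch under stated assumptions: SourceReencoder, convertToUtf8, and escapeNonAscii are hypothetical names, the caller must already know the file's real encoding, and the escaping helper only mimics what native2ascii does for non-ASCII characters.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class; names are illustrative only.
public class SourceReencoder {

    // Workaround 2: decode with the file's real encoding, then write UTF-8.
    static void convertToUtf8(Path in, Path out, Charset actual) throws IOException {
        String text = new String(Files.readAllBytes(in), actual);
        Files.write(out, text.getBytes(StandardCharsets.UTF_8));
    }

    // Workaround 3: keep ASCII as-is and turn every other UTF-16 code unit
    // into a six-character Unicode escape, so the result is pure ASCII.
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: SourceReencoder <in> <out> <actual-charset>");
            return;
        }
        // Example arguments: Legacy.java Legacy.utf8.java windows-1252
        convertToUtf8(Paths.get(args[0]), Paths.get(args[1]), Charset.forName(args[2]));
    }
}
```

Writing to a separate output path keeps the original file intact, which matters if the guessed encoding turns out to be wrong and the conversion has to be repeated.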

Dead Ends

Common approaches that don't work:

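  1. 95% fail: Changing the system locale is not a dependable fix. On JDK 18 and later javac defaults to UTF-8 regardless of locale, and on earlier releases a locale-derived default makes the build depend on the machine. Set the compiler encoding explicitly with -encoding.

  2. 90% fail: Adding -Dfile.encoding=UTF-8 to the application's JVM arguments changes the runtime default, not how javac decodes source files. In this case javac is already decoding as UTF-8, so the setting would not help even if it applied.

  3. 60% fail: Removing the character without understanding its origin may break the intended functionality (for example, a special symbol inside a string literal).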
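Before deleting anything, it helps to see where the offending bytes are and what they probably meant. The sketch below is one possible way to do that, not part of any JDK tool: InvalidUtf8Finder and findInvalidUtf8 are hypothetical names, and the Windows-1252 interpretation it prints is only a guess at the original encoding.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical diagnostic class; names are illustrative only.
public class InvalidUtf8Finder {

    // Returns one description per invalid UTF-8 sequence found in data.
    static List<String> findInvalidUtf8(byte[] data) {
        Charset guess = Charset.forName("windows-1252"); // assumed original encoding
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(1024);
        List<String> findings = new ArrayList<>();
        int line = 1;     // 1-based line number of the current position
        int scanned = 0;  // bytes already checked for newlines

        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                int pos = in.position(); // start of the bad sequence
                for (; scanned < pos; scanned++) {
                    if (data[scanned] == '\n') line++;
                }
                int guessed = new String(data, pos, 1, guess).codePointAt(0);
                findings.add(String.format(Locale.ROOT,
                        "line %d, offset %d: byte 0x%02X (U+%04X if windows-1252)",
                        line, pos, data[pos] & 0xFF, guessed));
                in.position(pos + result.length()); // skip the bad bytes and go on
            } else if (result.isOverflow()) {
                out.clear(); // decoded text is not needed, only the errors
            } else {
                break; // underflow with no input left: finished
            }
        }
        return findings;
    }

    public static void main(String[] args) throws Exception {
        for (String finding : findInvalidUtf8(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(finding);
        }
    }
}
```

With the line number and the likely character in hand, it is easier to decide whether to re-encode the file, escape the character, or replace it with an ASCII equivalent.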