data
encoding_error
ai_generated
true
UTF-8 CSV file without BOM is silently corrupted when opened in Excel on Windows
ID: data/csv-encoding-utf8-with-bom-silent-corruption
Fix Rate: 92%
Confidence: 90%
Evidence: 1
First Seen: 2023-05-18
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| Excel 2019 | active | — | — | — |
| Excel 365 | active | — | — | — |
| Excel 2021 | active | — | — | — |
Root Cause
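When a CSV file is opened by double-clicking, Excel on Windows interprets BOM-less UTF-8 as ANSI (the system code page, Windows-1252 on Western locales), so every non-ASCII character is silently corrupted. Adding a BOM fixes Excel's encoding detection but may cause problems in other tools that do not expect one.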
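The effect can be reproduced outside Excel. The following is a minimal sketch, not a description of Excel's internals: it decodes UTF-8 bytes as Windows-1252 to show the kind of corruption described, then shows how a BOM leaks into the first field when a tool decodes as plain UTF-8. The sample string and variable names are illustrative assumptions.

```python
import codecs

# Illustrative sample row containing non-ASCII characters (an assumption).
text = "café,München,naïve"
utf8_bytes = text.encode("utf-8")

# Decoding UTF-8 bytes as Windows-1252 turns each multi-byte character
# into two unrelated characters (mojibake).
misread = utf8_bytes.decode("cp1252")
print(misread)

# The same row with a UTF-8 BOM (bytes EF BB BF) in front.
with_bom = codecs.BOM_UTF8 + utf8_bytes

# A BOM-unaware reader that decodes as plain UTF-8 keeps U+FEFF
# glued to the first field.
first_field_plain = with_bom.decode("utf-8").split(",")[0]

# The 'utf-8-sig' codec strips a leading BOM if one is present.
first_field_sig = with_bom.decode("utf-8-sig").split(",")[0]

print(repr(first_field_plain))
print(repr(first_field_sig))
```

Reading with `utf-8-sig` is one way for downstream Python tools to accept files both with and without a BOM.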
Official Documentation
https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba
Workarounds
- 95% success: Add a UTF-8 BOM to the CSV file before opening it in Excel. In Python: with open('output.csv', 'w', encoding='utf-8-sig', newline='') as f: writer = csv.writer(f); writer.writerows(data). The 'utf-8-sig' encoding writes the BOM automatically, and newline='' lets the csv module control line endings. On the command line (GNU sed): sed '1s/^/\xef\xbb\xbf/' input.csv > output.csv
- 90% success: Use Excel's 'Get Data from Text/CSV' feature instead of double-clicking the file: Data tab > Get Data > From File > From Text/CSV. Then choose UTF-8 encoding explicitly in the import wizard.
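The BOM workaround can be sketched as a pair of small helpers. This is a minimal sketch under stated assumptions: the function names `write_csv_for_excel` and `add_bom`, the sample rows, and the temporary output path are hypothetical and not part of the original entry. `add_bom` is a portable stand-in for the sed one-liner and only prepends a BOM when one is missing.

```python
import codecs
import csv
import os
import tempfile


def write_csv_for_excel(path, rows):
    """Write rows as UTF-8 with a leading BOM (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


def add_bom(src_path, dst_path):
    """Copy src to dst, prepending a UTF-8 BOM only if one is missing."""
    with open(src_path, "rb") as src:
        data = src.read()
    if not data.startswith(codecs.BOM_UTF8):
        data = codecs.BOM_UTF8 + data
    with open(dst_path, "wb") as dst:
        dst.write(data)


# Usage with illustrative sample rows, written to a temporary directory.
rows = [["name", "city"], ["Zoë", "Kraków"]]
tmp_dir = tempfile.mkdtemp()
out_path = os.path.join(tmp_dir, "output.csv")
write_csv_for_excel(out_path, rows)

# The file now starts with the three BOM bytes EF BB BF.
with open(out_path, "rb") as f:
    head = f.read(3)

# Reading back with 'utf-8-sig' strips the BOM again.
with open(out_path, "r", encoding="utf-8-sig", newline="") as f:
    read_back = list(csv.reader(f))
```

Checking for an existing BOM before prepending avoids stacking a second one if the conversion is run twice on the same file.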
Dead Ends
Common approaches that don't work:
- 55% fail: This option adds a BOM but also changes the file format slightly (e.g., quoting rules), so the file may not re-import correctly.
- 80% fail: UTF-16 is not widely supported by CSV parsers and causes problems in most data processing tools. It also roughly doubles the file size for mostly ASCII data.
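The size and compatibility cost of UTF-16 can be checked with a few lines of Python. This is a rough sketch using an illustrative ASCII-only sample (an assumption); real files with many non-ASCII characters will show a smaller ratio.

```python
# Illustrative ASCII-only CSV content (an assumption).
sample = "id,name,amount\n1,Alice,10.50\n2,Bob,7.25\n"

utf8_size = len(sample.encode("utf-8"))
# Python's "utf-16" codec writes a 2-byte BOM followed by 2 bytes per character here.
utf16_size = len(sample.encode("utf-16"))

print(utf8_size, utf16_size)

# A parser that assumes UTF-8 cannot decode the UTF-16 bytes at all,
# because the UTF-16 BOM bytes are invalid in UTF-8.
try:
    sample.encode("utf-16").decode("utf-8")
    utf8_reader_ok = True
except UnicodeDecodeError:
    utf8_reader_ok = False

print(utf8_reader_ok)
```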