# Parquet decimal precision overflow when reading into pandas

- **ID:** `data/parquet-decimal-overflow`
- **Domain:** data
- **Category:** type_error
- **Error Code:** `ArrowNotImplementedError`
- **Verification:** ai_generated
- **Fix Rate:** 85%

## Root Cause

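Parquet files store decimals with arbitrary precision (e.g., `decimal(38,10)`), but converting them to float64 when loading into pandas causes overflow or precision loss. A float64 holds only about 15 to 17 significant decimal digits, far fewer than the 38 digits a `decimal(38,10)` value can carry.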
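
As a rough illustration of the precision loss described above, the sketch below uses only the standard library. The two sample values are made up for the example. Both fit in `decimal(38,10)` and differ only in the last digit, yet they become indistinguishable once converted to a float.

```python
from decimal import Decimal

# Two distinct, hypothetical values that both fit in decimal(38,10).
a = Decimal("12345678901234567890.0000000001")
b = Decimal("12345678901234567890.0000000002")

# float() yields a float64 in CPython; both values round to the same binary number.
a_float = float(a)
b_float = float(b)

values_differ = a != b
floats_collide = a_float == b_float

# Converting the float back to Decimal exposes the rounding error.
error = abs(Decimal(a_float) - a)

print(values_differ, floats_collide, error > 0)
```

Any column whose values need more digits than float64 can hold is affected in the same way, regardless of which library performs the conversion.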

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| pyarrow 12.0.0 | active | — | — |
| pyarrow 14.0.1 | active | — | — |
| pandas 2.2.0 | active | — | — |

## Workarounds

1. **Read with pyarrow and map the decimal type to an Arrow-backed pandas dtype with `types_mapper`** (90% success)
   ```
   import pandas as pd
   import pyarrow as pa
   import pyarrow.parquet as pq

   dec = pa.decimal128(38, 10)
   table = pq.read_table('data.parquet')
   # types_mapper must be a callable (pyarrow type -> pandas dtype or None), so pass dict.get
   df = table.to_pandas(types_mapper={dec: pd.ArrowDtype(dec)}.get)
   ```
2. **Use pandas `read_parquet` with `dtype_backend='pyarrow'`** (82% success)
   ```
   import pandas as pd

   # Requires pandas 2.0 or later; columns keep their Arrow types, including decimal128
   df = pd.read_parquet('data.parquet', dtype_backend='pyarrow')
   ```

## Dead Ends

- **Preserving pandas metadata on read**: This only preserves pandas-specific metadata like index names; it does not change the decimal-to-float conversion behavior. (70% fail)
- **Converting the column to string after reading**: The overflow already occurred during reading; the string representation will show the truncated/rounded value. (85% fail)
- **Switching to the fastparquet engine**: Fastparquet has the same limitation; it also converts decimals to float64 by default. (75% fail)
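
The point about string representations can be illustrated without any Parquet file. In this sketch, `float(original)` stands in for a column that was already read as float64. The sample value is hypothetical.

```python
from decimal import Decimal

original = Decimal("1234567890123456789012345678.0123456789")  # 38 significant digits

lossy = float(original)       # stand-in for a value already read as float64
as_text = str(lossy)          # converting to string after the fact
recovered = Decimal(as_text)  # parse the text back into an exact decimal

# Count how many significant digits survived the detour through float64.
digits_kept = len(recovered.as_tuple().digits)

print(as_text)
print(digits_kept, recovered == original)
```

The string only reflects the digits the float still holds, so the conversion has to be avoided at read time rather than repaired afterwards.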
