# ValueError: The features of the dataset do not match the expected schema. Missing columns: ['text', 'label']. Extra columns: ['id', 'metadata']

- **ID:** `huggingface/dataset-features-mismatch`
- **Domain:** huggingface
- **Category:** data_error
- **Verification:** ai_generated
- **Fix Rate:** 88%

## Root Cause

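The dataset loaded with Hugging Face Datasets has columns that do not match the schema the model or training script expects. In this case the required `text` and `label` columns are absent, and the dataset contains `id` and `metadata` instead.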
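
The mismatch can be checked before training starts by comparing column names against the expected schema. The following is a minimal plain-Python sketch; `diff_columns` is a hypothetical helper name, not part of the `datasets` API, and the column lists are copied from the error message above.

```python
def diff_columns(actual, expected):
    """Return (missing, extra) column names, preserving input order."""
    actual_set, expected_set = set(actual), set(expected)
    missing = [name for name in expected if name not in actual_set]
    extra = [name for name in actual if name not in expected_set]
    return missing, extra


# Column names taken from the error message in this entry.
missing, extra = diff_columns(["id", "metadata"], ["text", "label"])
print("Missing columns:", missing)
print("Extra columns:", extra)
```

With a loaded `datasets.Dataset`, `dataset.column_names` can be passed as `actual`.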

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| datasets>=2.10.0 | active | — | — |
| transformers>=4.30.0 | active | — | — |
| python>=3.8 | active | — | — |

## Workarounds

1. **Add each missing column with default values via `add_column`, then call `select_columns(['text', 'label'])` to keep only the required columns. Add first: `select_columns` raises if a requested column does not exist.** (90% success)
   ```python
   # Placeholder labels; assumes 'text' is already present (add it the same way if not).
   dataset = dataset.add_column("label", [0] * len(dataset))
   dataset = dataset.select_columns(["text", "label"])  # drops 'id' and 'metadata'
   ```
2. **Derive the required columns from existing ones with `map`, and drop the originals with `remove_columns`.** (85% success)
   ```python
   # Assumes 'metadata' holds the text; label 0 is a placeholder value.
   dataset = dataset.map(lambda x: {"text": x["metadata"], "label": 0}, remove_columns=["id", "metadata"])
   ```
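
The mapping workaround can be easier to test when the lambda is pulled out into a named function. The sketch below assumes, as the workaround does, that `metadata` carries the text and that a constant `0` is an acceptable placeholder label; `to_expected_schema` is a hypothetical name used only for illustration.

```python
def to_expected_schema(example):
    """Build a row with only the expected columns from a raw row."""
    return {
        "text": example["metadata"],  # assumption: 'metadata' holds the text
        "label": 0,                   # placeholder; replace with real labels
    }


# Try the function on a single hand-written row before mapping a dataset.
row = {"id": 7, "metadata": "an example sentence"}
converted = to_expected_schema(row)
print(converted)
```

On a real dataset the same function would be passed as `dataset.map(to_expected_schema, remove_columns=["id", "metadata"])`. A constant label only satisfies the schema check; it carries no training signal.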

## Dead Ends

- **Dropping only the extra columns**: the missing columns still need to be added or mapped from existing columns. (50% fail)
- **Referencing a column by a misspelled name**: if the column name is misspelled, the error persists. (60% fail)
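
A misspelled column name can be caught before it reaches a rename, select, or map call. The sketch below uses the standard-library `difflib` module to suggest the closest existing name; `check_column` is a hypothetical helper, not something provided by `datasets`.

```python
import difflib


def check_column(name, available):
    """Return name if it is a known column, else raise with a suggestion."""
    if name in available:
        return name
    close = difflib.get_close_matches(name, available, n=1)
    hint = f" Did you mean {close[0]!r}?" if close else ""
    raise KeyError(f"Column {name!r} not found in {list(available)}.{hint}")


columns = ["id", "metadata"]
try:
    check_column("metadta", columns)  # deliberate typo
except KeyError as err:
    print(err)
```

With a loaded dataset, `dataset.column_names` would take the place of the hard-coded `columns` list.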
