# ValueError: optimizer state dict mismatch: loaded state dict contains parameters that are not in the current optimizer. Expected keys: ['param_groups', 'state']. Got: ['param_groups', 'state', 'extra_key']

- **ID:** `pytorch/optimizer-state-dict-mismatch`
- **Domain:** pytorch
- **Category:** data_error
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

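The saved optimizer state dict does not match what the current optimizer expects. In this message, the loaded dict carries an extra top-level key (`extra_key`) alongside the standard `param_groups` and `state`. The same class of mismatch arises when the model architecture or optimizer configuration changes between save and load, so the saved parameter groups no longer line up with the current ones.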
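
One way to locate the divergence is to compare the saved dict with the `state_dict()` of a freshly built optimizer. The sketch below is illustrative only: `diff_optimizer_state` is a hypothetical helper, not a PyTorch API, and it runs on plain stand-in dicts that mimic the `state` / `param_groups` layout so it has no dependencies.

```python
def diff_optimizer_state(saved, current):
    """Return human-readable differences between two optimizer state dicts."""
    problems = []
    extra = sorted(set(saved) - set(current))
    if extra:
        problems.append(f"unexpected top-level keys: {extra}")
    missing = sorted(set(current) - set(saved))
    if missing:
        problems.append(f"missing top-level keys: {missing}")

    saved_groups = saved.get("param_groups", [])
    current_groups = current.get("param_groups", [])
    if len(saved_groups) != len(current_groups):
        problems.append(
            f"param group count differs: saved {len(saved_groups)}, "
            f"current {len(current_groups)}"
        )
    # zip stops at the shorter list, so only groups present on both sides are compared.
    for i, (s, c) in enumerate(zip(saved_groups, current_groups)):
        if len(s["params"]) != len(c["params"]):
            problems.append(
                f"group {i}: saved has {len(s['params'])} params, "
                f"current has {len(c['params'])}"
            )
    return problems


# Stand-in dicts; in practice pass the loaded checkpoint and optimizer.state_dict().
saved = {"state": {}, "param_groups": [{"lr": 0.1, "params": [0, 1, 2]}], "extra_key": 1}
current = {"state": {}, "param_groups": [{"lr": 0.1, "params": [0, 1]}]}
report = diff_optimizer_state(saved, current)
for line in report:
    print(line)
```

An empty report suggests the top-level keys and group sizes agree; it does not check hyperparameters or tensor shapes inside `state`.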

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| pytorch>=1.9 | active | — | — |
| python>=3.7 | active | — | — |

## Workarounds

1. **Construct the model and optimizer exactly as they were when the state dict was saved: same architecture, same optimizer class, and the same parameter groups in the same order.** (90% success)
2. **Align the saved state with the current optimizer before loading: `optimizer.load_state_dict` accepts no `strict` argument (that flag belongs to `nn.Module.load_state_dict`), so edit the saved `param_groups` and `state` entries to match the current parameter groups, then call `optimizer.load_state_dict(state_dict)`.** (80% success)
3. **Filter out unexpected top-level keys before loading: `filtered_dict = {k: v for k, v in state_dict.items() if k in expected_keys}; optimizer.load_state_dict(filtered_dict)`, where `expected_keys` is `{'param_groups', 'state'}`.** (75% success)
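
The first and third workarounds can be combined in a short round trip. This is a minimal sketch under stated assumptions: `build` and `load_filtered` are hypothetical helpers, the stray `extra_key` is simulated, and the tiny `nn.Linear` model with SGD stands in for whatever the real checkpoint used.

```python
import torch
from torch import nn

EXPECTED_KEYS = ("state", "param_groups")  # top-level keys of an optimizer state dict


def build(lr=0.1, momentum=0.9):
    """Hypothetical factory: one place that defines the model and optimizer config."""
    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    return model, optimizer


def load_filtered(optimizer, saved):
    """Drop unexpected top-level keys, then load the rest into the optimizer."""
    filtered = {k: v for k, v in saved.items() if k in EXPECTED_KEYS}
    optimizer.load_state_dict(filtered)
    return filtered


# Save side: take one step so the optimizer holds per-parameter state.
model, optimizer = build()
model(torch.randn(3, 4)).sum().backward()
optimizer.step()
saved = dict(optimizer.state_dict(), extra_key="simulated stray entry")

# Load side: rebuild through the same factory, then load the filtered dict.
new_model, new_optimizer = build()
new_model.load_state_dict(model.state_dict())
filtered = load_filtered(new_optimizer, saved)
```

Routing both the save and load paths through one factory keeps the parameter groups in the same order, which is what the optimizer relies on when it maps saved state back onto parameters.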

## Dead Ends

- **Passing `strict=False` to `optimizer.load_state_dict`:** `Optimizer.load_state_dict` takes no `strict` argument, so the call raises `TypeError`. Using `strict=False` on `model.load_state_dict` only relaxes model-weight loading and leaves the optimizer mismatch in place. (60% fail)
- **Re-saving the optimizer state dict without changes:** Re-saving writes the same contents back, so the mismatch persists until the model or optimizer configuration is reconciled. (90% fail)
- **Manually editing the state dict file to remove extra keys:** Hand edits to a serialized checkpoint are error-prone and may corrupt the data. (80% fail)
