pytorch data_error ai_generated true

ValueError: optimizer state dict mismatch: loaded state dict contains parameters that are not in the current optimizer. Expected keys: ['param_groups', 'state']. Got: ['param_groups', 'state', 'extra_key']

ID: pytorch/optimizer-state-dict-mismatch

Fix Rate: 80%
Confidence: 84%
Evidence: 1
First Seen: 2023-04-22

Version Compatibility

Version         Status   Introduced   Deprecated   Notes
pytorch>=1.9    active   -            -            -
python>=3.7     active   -            -            -

Root Cause

The saved optimizer state dict contains keys that do not match the current optimizer's parameter groups, often due to a change in model architecture or optimizer configuration between save and load.
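
One way to see where a saved state dict diverges is to compare its structure with `optimizer.state_dict()` from a freshly built optimizer. The sketch below is a hypothetical helper (`describe_mismatch` is not part of PyTorch) that operates on the plain dict layout `{'state': ..., 'param_groups': [...]}`. It only compares top-level keys, the number of parameter groups, and the number of parameters per group, so treat it as a starting point rather than a complete check.

```python
def describe_mismatch(saved, current):
    """List structural differences between two optimizer state dicts.

    Both arguments are assumed to be plain dicts shaped like the output of
    optimizer.state_dict(): {'state': {...}, 'param_groups': [...]}.
    """
    problems = []

    # Compare top-level keys first (e.g. a stray 'extra_key').
    extra = sorted(set(saved) - set(current))
    missing = sorted(set(current) - set(saved))
    if extra:
        problems.append(f"unexpected top-level keys: {extra}")
    if missing:
        problems.append(f"missing top-level keys: {missing}")

    # Compare the number of parameter groups.
    saved_groups = saved.get("param_groups", [])
    current_groups = current.get("param_groups", [])
    if len(saved_groups) != len(current_groups):
        problems.append(
            f"param group count differs: saved={len(saved_groups)}, "
            f"current={len(current_groups)}"
        )

    # Compare the number of parameters inside each pair of groups.
    for i, (s, c) in enumerate(zip(saved_groups, current_groups)):
        if len(s["params"]) != len(c["params"]):
            problems.append(
                f"group {i}: saved has {len(s['params'])} params, "
                f"current has {len(c['params'])}"
            )
    return problems


# Illustrative data only: a saved dict with one extra key and one extra parameter.
saved = {
    "state": {},
    "param_groups": [{"lr": 0.01, "params": [0, 1, 2]}],
    "extra_key": None,
}
current = {
    "state": {},
    "param_groups": [{"lr": 0.01, "params": [0, 1]}],
}

for problem in describe_mismatch(saved, current):
    print(problem)
```

With real objects the call might look like `describe_mismatch(checkpoint["optimizer"], optimizer.state_dict())`, assuming the checkpoint stores the optimizer state under an `"optimizer"` key.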

generic

Official Documentation

https://pytorch.org/docs/stable/optim.html#torch.optim.Optimizer.load_state_dict

Workarounds

  1. 90% success Construct the model and optimizer exactly as they were when the state dict was saved (same architecture, same parameter groups, same optimizer class and configuration), then load the state dict.
  2. 80% success Align the saved state with the current optimizer manually before loading. Note that `optimizer.load_state_dict(state_dict)` takes no `strict` argument; `strict=False` belongs to `nn.Module.load_state_dict`. Edit the saved `param_groups` and `state` entries so they match the current optimizer's parameter groups, then load the result.
  3. 75% success Filter out unexpected top-level keys before loading: `expected_keys = {'param_groups', 'state'}; filtered_dict = {k: v for k, v in state_dict.items() if k in expected_keys}; optimizer.load_state_dict(filtered_dict)`
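
Workaround 3 can be sketched as a small helper. The name `filter_optimizer_state_dict` and the `EXPECTED_KEYS` constant are hypothetical, chosen for illustration; the only assumption about PyTorch is that an optimizer state dict has the two top-level keys `state` and `param_groups`, as shown in the error message above.

```python
# Top-level keys an optimizer state dict is expected to have
# (taken from the "Expected keys" part of the error message).
EXPECTED_KEYS = ("state", "param_groups")


def filter_optimizer_state_dict(state_dict, expected_keys=EXPECTED_KEYS):
    """Return a new dict holding only the expected top-level keys.

    Raises KeyError if a required key is absent, since dropping extras
    cannot repair a state dict that is missing 'state' or 'param_groups'.
    """
    missing = [k for k in expected_keys if k not in state_dict]
    if missing:
        raise KeyError(f"state dict is missing required keys: {missing}")
    # Shallow copy: the values themselves are shared with the input dict.
    return {k: state_dict[k] for k in expected_keys}


# Illustrative data only; a real dict would come from torch.load(...).
loaded = {
    "state": {},
    "param_groups": [{"lr": 0.01, "params": [0, 1]}],
    "extra_key": "left over from another checkpoint format",
}

filtered = filter_optimizer_state_dict(loaded)
print(sorted(filtered))  # ['param_groups', 'state']
```

After filtering, `optimizer.load_state_dict(filtered)` still requires the parameter groups to match the current optimizer. Dropping `extra_key` removes the stray key but does not undo an architecture or configuration change.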

Dead Ends

Common approaches that don't work:

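  1. Trying to suppress the error by passing strict=False to load_state_dict 60% fail

    `torch.optim.Optimizer.load_state_dict` has no `strict` parameter, so passing one raises a TypeError. On the model side, `model.load_state_dict(..., strict=False)` silently skips missing and unexpected keys, which does nothing to reconcile the optimizer state and can leave training in an incorrect state.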

  2. Re-saving the optimizer state dict without changes 90% fail

    The mismatch persists because the underlying architecture changed.

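  3. Manually editing the state dict file to remove extra keys 80% fail

    Hand-editing the serialized checkpoint file is error-prone and may corrupt the data. If keys must be dropped, do it in code on the loaded dict and re-save, rather than editing the file itself.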