huggingface config_error ai_generated true

RuntimeError: gradient_checkpointing requires use_cache=False

ID: huggingface/gradient-checkpointing-disable-error

Fix Rate: 93%
Confidence: 90%
Evidence: 1
First Seen: 2023-03-05

Version Compatibility

Version                 Status   Introduced   Deprecated   Notes
transformers>=4.25.0    active   -            -            -

Root Cause

Gradient checkpointing is incompatible with the key-value cache used during generation; use_cache must be disabled to enable gradient checkpointing.
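
The conflict described above can be pictured as a small pre-flight check. The sketch below is a hypothetical helper written for this entry, not the code that transformers itself runs; the function name and its two boolean parameters are assumptions, and the message text is copied from the error in the title.

```python
def find_cache_conflict(use_cache, gradient_checkpointing):
    """Return an error message if the two settings conflict, else None.

    Hypothetical diagnostic: both arguments are plain booleans that the
    caller reads from its own model or training configuration.
    """
    if use_cache and gradient_checkpointing:
        # Both features are on at once, which is the combination the
        # root cause describes as incompatible.
        return "gradient_checkpointing requires use_cache=False"
    return None
```

Calling it before training starts, for example with the value of `model.config.use_cache`, would surface the problem earlier than the runtime failure does.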

generic

Official Documentation

https://huggingface.co/docs/transformers/en/perf_train_gpu_one#gradient-checkpointing

Workarounds

  1. 95% success: Set use_cache to False on the model configuration before enabling gradient checkpointing.
  2. 90% success: Use the Trainer with gradient_checkpointing=True set in TrainingArguments, which handles this automatically.
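
Both workarounds can be sketched in a few lines. This is a minimal sketch, not a definitive recipe: the helper name `prepare_for_checkpointed_training` and the constant `TRAINING_KWARGS` are made up for illustration, the `output_dir` value is a placeholder, and `model` is assumed to behave like a transformers `PreTrainedModel` (it exposes `config.use_cache` and `gradient_checkpointing_enable()`).

```python
def prepare_for_checkpointed_training(model):
    """Workaround 1: switch the KV cache off, then enable checkpointing.

    Assumes `model` has a `config.use_cache` attribute and a
    `gradient_checkpointing_enable()` method, as transformers models do.
    """
    model.config.use_cache = False          # cache off first
    model.gradient_checkpointing_enable()   # then turn checkpointing on
    return model


# Workaround 2: keyword arguments meant for transformers.TrainingArguments.
# "./checkpoints" is a placeholder path chosen for this sketch.
TRAINING_KWARGS = {
    "output_dir": "./checkpoints",
    "gradient_checkpointing": True,
}
```

With transformers installed, the helper would be applied to a loaded model before the training loop, and the dictionary would be unpacked as `TrainingArguments(**TRAINING_KWARGS)` and handed to the Trainer. If the model is later used for generation, use_cache can be set back to True at that point.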

Dead Ends

Common approaches that don't work:

  1. 98% fail
    The two features are mutually exclusive during training; use_cache is only for inference.
  2. 95% fail
    The model configuration is checked at runtime; manual deletion does not bypass the check.