huggingface
config_error
ai_generated: true
RuntimeError: gradient_checkpointing requires use_cache=False
ID: huggingface/gradient-checkpointing-disable-error
Fix Rate: 93%
Confidence: 90%
Evidence: 1
First Seen: 2023-03-05
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| transformers>=4.25.0 | active | — | — | — |
Root Cause
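Gradient checkpointing is incompatible with the key-value (KV) cache used during generation. Checkpointing discards intermediate activations and recomputes them during the backward pass, while the KV cache exists only to speed up autoregressive decoding, so use_cache must be set to False before gradient checkpointing is enabled.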
Official Documentation
https://huggingface.co/docs/transformers/en/perf_train_gpu_one#gradient-checkpointing
Workarounds
- 95% success: Set use_cache=False on the model config before enabling gradient checkpointing.
- 90% success: Use the Trainer with gradient_checkpointing=True set in TrainingArguments, which handles this automatically.
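The two workarounds above can be sketched as follows. This is a minimal illustration, not the library's prescribed recipe. The helper names prepare_for_checkpointing and make_training_args are hypothetical; config.use_cache, gradient_checkpointing_enable(), and the TrainingArguments gradient_checkpointing parameter are existing transformers names. The model argument is assumed to be a transformers PreTrainedModel.

```python
def prepare_for_checkpointing(model):
    """Workaround 1: disable the KV cache, then enable gradient checkpointing.

    `model` is assumed to be a transformers PreTrainedModel, i.e. any object
    exposing `config.use_cache` and `gradient_checkpointing_enable()`.
    (Hypothetical helper name, used here for illustration only.)
    """
    # The KV cache only speeds up generation, so switch it off for training.
    model.config.use_cache = False
    # Activations are now recomputed in the backward pass instead of stored.
    model.gradient_checkpointing_enable()
    return model


def make_training_args(output_dir):
    """Workaround 2: let the Trainer enable checkpointing via its arguments.

    Imported lazily so this sketch loads even without transformers installed.
    (Hypothetical helper name, used here for illustration only.)
    """
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, gradient_checkpointing=True)


# Example usage (assumes transformers is installed and the model can be loaded):
# from transformers import AutoModelForCausalLM
# model = prepare_for_checkpointing(AutoModelForCausalLM.from_pretrained("gpt2"))
```

The order in the first helper follows the workaround text: the cache is disabled first so that the check never sees use_cache=True together with checkpointing.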
Dead Ends
Common approaches that don't work:
- 98% fail: The two features are mutually exclusive during training; use_cache is only for inference.
- 95% fail: The model configuration is checked at runtime; manual deletion does not bypass the check.