huggingface config_error ai_generated true

Runtime error: gradient checkpointing requires use_cache=False

RuntimeError: gradient_checkpointing requires use_cache=False

ID: huggingface/gradient-checkpointing-disable-error

Fix rate: 93%
Confidence: 90%
Evidence count: 1
First seen: 2023-03-05

Version compatibility

Version                 Status    Introduced    Deprecated    Notes
transformers>=4.25.0    active

Root cause analysis

Gradient checkpointing is incompatible with the key-value cache used during generation. The cache only speeds up autoregressive decoding at inference time and brings no benefit to a training forward pass, so use_cache must be set to False before gradient checkpointing is enabled.
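To make the role of the cache concrete, here is a minimal sketch of what use_cache controls at inference time. It assumes PyTorch and transformers are installed, and it uses a tiny randomly initialised GPT-2 (an illustrative choice, not part of the original report) so nothing has to be downloaded.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random-weight model; the sizes are arbitrary and chosen only for speed.
config = GPT2Config(n_layer=2, n_head=2, n_embd=32, vocab_size=100)
model = GPT2LMHeadModel(config).eval()

input_ids = torch.randint(0, config.vocab_size, (1, 8))

with torch.no_grad():
    # With the cache on, the output carries the per-layer keys and values
    # so that later decoding steps can reuse them.
    with_cache = model(input_ids=input_ids, use_cache=True)
    # With the cache off, no keys or values are returned.
    without_cache = model(input_ids=input_ids, use_cache=False)

has_cache = with_cache.past_key_values is not None
no_cache = without_cache.past_key_values is None
```

The cached keys and values are only useful when tokens are generated one step at a time, which is why training code can switch the cache off without losing anything.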

generic

Official documentation

https://huggingface.co/docs/transformers/en/perf_train_gpu_one#gradient-checkpointing

Solutions

  1. Set use_cache=False (for example model.config.use_cache = False) before enabling gradient checkpointing.
  2. Train with the Trainer and set gradient_checkpointing=True in its TrainingArguments, which handles this automatically.
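The first solution can be sketched as follows. This is a minimal example under stated assumptions: PyTorch and transformers are installed, and a tiny randomly initialised GPT-2 stands in for whatever model triggered the error. The model choice and its sizes are illustrative, not taken from the original report.

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random-weight model so the sketch runs without downloading weights.
config = GPT2Config(n_layer=2, n_head=2, n_embd=32, vocab_size=100)
model = GPT2LMHeadModel(config)

# Step 1: disable the key-value cache first.
model.config.use_cache = False

# Step 2: enable gradient checkpointing and switch to training mode.
model.gradient_checkpointing_enable()
model.train()

# One training-style forward and backward pass to confirm the setup works.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
outputs = model(input_ids=input_ids, labels=input_ids)
outputs.loss.backward()
```

If the model is used for generation after training, use_cache can be set back to True at that point. For the second solution, the equivalent switch is the gradient_checkpointing=True field of TrainingArguments.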

Ineffective attempts

Common but ineffective approaches:

  1. 98% failure rate

    The two features are mutually exclusive during training; use_cache is only for inference.

  2. 95% failure rate

    The model configuration is checked at runtime; manual deletion does not bypass the check.