huggingface config_error ai_generated true

ValueError: You passed `quantization_config` with `bnb_4bit_compute_dtype=torch.float16` but the model weights are loaded in `torch.bfloat16`. This is incompatible.

ID: huggingface/quantization-dtype-mismatch-bnb

Fix Rate: 90%
Confidence: 87%
Evidence: 1
First Seen: 2024-03-10

Version Compatibility

Version               Status   Introduced   Deprecated   Notes
bitsandbytes 0.43.0   active
transformers 4.44.0   active
torch 2.3.0           active

Root Cause

The compute dtype set in the bitsandbytes quantization config (`bnb_4bit_compute_dtype`) does not match the model's native weight dtype, which causes a type cast failure during the forward pass.
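
A quick way to confirm this cause before changing anything is to compare the two dtypes directly. The sketch below is illustrative only: `normalize_dtype_name`, `find_dtype_mismatch`, and `declared_weight_dtype` are hypothetical helper names, not part of transformers or bitsandbytes. It assumes the checkpoint records its dtype in `config.torch_dtype`, which can be `None` for some models.

```python
def normalize_dtype_name(dtype) -> str:
    """Turn 'torch.bfloat16', 'bfloat16', or a torch.dtype object into 'bfloat16'."""
    return str(dtype).removeprefix("torch.")


def find_dtype_mismatch(weight_dtype, compute_dtype):
    """Return a message describing the mismatch, or None when the dtypes agree."""
    weight = normalize_dtype_name(weight_dtype)
    compute = normalize_dtype_name(compute_dtype)
    if weight == compute:
        return None
    return f"compute dtype {compute} does not match weight dtype {weight}"


def declared_weight_dtype(model_id: str):
    """Read the dtype recorded in the checkpoint's config (may be None)."""
    # Imported lazily so the pure helpers above work without transformers installed.
    from transformers import AutoConfig

    return AutoConfig.from_pretrained(model_id).torch_dtype
```

Calling `find_dtype_mismatch(declared_weight_dtype(model_id), torch.float16)` on a bfloat16 checkpoint would return a message rather than `None`, which points to the configuration described above.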

Official Documentation

https://huggingface.co/docs/bitsandbytes/en/quantization#dtype-mismatch

Workarounds

  1. 95% success: Set `bnb_4bit_compute_dtype=torch.bfloat16` to match the model weight dtype: `quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)`.
  2. 80% success: Load the model with `torch_dtype=torch.float16` and set `bnb_4bit_compute_dtype=torch.float16`.
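
Both workarounds come down to using one dtype for the weights and for the 4-bit compute path. A minimal sketch of that idea follows, assuming a causal language model loaded through `AutoModelForCausalLM`; `matched_dtype_names` and `load_4bit_model` are hypothetical helper names introduced here for illustration, and the model id is left to the caller.

```python
def matched_dtype_names(weight_dtype: str = "bfloat16") -> dict:
    """Use one dtype name for both settings so they cannot drift apart."""
    return {"torch_dtype": weight_dtype, "bnb_4bit_compute_dtype": weight_dtype}


def load_4bit_model(model_id: str, weight_dtype: str = "bfloat16"):
    """Load a 4-bit model whose weight dtype and compute dtype are identical."""
    # Heavy imports stay inside the function so the helper above is usable without them.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    names = matched_dtype_names(weight_dtype)
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=getattr(torch, names["bnb_4bit_compute_dtype"]),
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quantization_config,
        torch_dtype=getattr(torch, names["torch_dtype"]),
    )
```

Passing `weight_dtype="bfloat16"` corresponds to workaround 1 and `weight_dtype="float16"` to workaround 2. Deriving both arguments from a single value is a design choice that makes the mismatch impossible to reintroduce by editing only one of them.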

Dead Ends

Common approaches that don't work:

  1. 50% fail: If the model was saved in bfloat16, casting to float16 may cause overflow or underflow; the root mismatch remains if model weights are bfloat16.
  2. 100% fail: Bitsandbytes checks for an exact dtype match; this combination will still raise the same error.
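
The overflow and underflow risk in the first dead end comes from float16 having a much narrower exponent range than bfloat16. The sketch below illustrates only the float16 side, using the standard library's half-precision `struct` format instead of torch; `fits_in_float16` and `to_float16` are hypothetical helper names, and real model weights would need to be inspected with tensor operations.

```python
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_float16(value: float) -> bool:
    """Return True if value can be stored as float16 without overflowing."""
    try:
        struct.pack("<e", value)  # "e" is the half-precision format code
        return True
    except OverflowError:
        return False


def to_float16(value: float) -> float:
    """Round-trip a value through float16 to expose precision or underflow loss."""
    return struct.unpack("<e", struct.pack("<e", value))[0]
```

A value such as `1e5` is representable in bfloat16 but does not fit in float16, and a very small value such as `1e-8` is flushed to zero by the round trip.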