huggingface
config_error
ai_generated
true
ValueError: You passed `quantization_config` with `bnb_4bit_compute_dtype=torch.float16` but the model weights are loaded in `torch.bfloat16`. This is incompatible.
ID: huggingface/quantization-dtype-mismatch-bnb
Fix Rate: 90%
Confidence: 87%
Evidence: 1
First Seen: 2024-03-10
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| bitsandbytes 0.43.0 | active | — | — | — |
| transformers 4.44.0 | active | — | — | — |
| torch 2.3.0 | active | — | — | — |
Root Cause
The `bnb_4bit_compute_dtype` set in the bitsandbytes quantization config does not match the dtype the model weights are loaded in, which causes a type-cast failure during the forward pass.
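One way to spot the mismatch before loading is to compare the dtype recorded in the checkpoint's `config.json` with the compute dtype you plan to pass. The sketch below assumes the checkpoint stores a `torch_dtype` field as a string such as `"bfloat16"`; the helper names `saved_weight_dtype` and `is_mismatch` and the sample JSON are hypothetical and exist only for illustration.

```python
import json
from typing import Optional


def saved_weight_dtype(config_json_text: str) -> Optional[str]:
    """Return the `torch_dtype` recorded in a checkpoint's config.json, if any."""
    return json.loads(config_json_text).get("torch_dtype")


def is_mismatch(config_json_text: str, compute_dtype: str) -> bool:
    """True when the saved weight dtype is known and differs from compute_dtype."""
    saved = saved_weight_dtype(config_json_text)
    return saved is not None and saved != compute_dtype


# Hypothetical config.json contents for a checkpoint saved in bfloat16.
example = '{"model_type": "llama", "torch_dtype": "bfloat16"}'

print(is_mismatch(example, "float16"))   # True: compute dtype differs from weights
print(is_mismatch(example, "bfloat16"))  # False: dtypes agree
```

If the field is absent, the helper reports no mismatch, so treat a `False` result as "not detected" rather than as a guarantee.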
Official Documentation
https://huggingface.co/docs/bitsandbytes/en/quantization#dtype-mismatch
Workarounds
- 95% success: Set `bnb_4bit_compute_dtype=torch.bfloat16` to match the model weight dtype: `quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)`.
- 80% success: Load the model with `torch_dtype=torch.float16` and set `bnb_4bit_compute_dtype=torch.float16`.
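Both workarounds come down to using one dtype for the weights and for the 4-bit compute dtype. A minimal sketch of that idea follows; `MODEL_ID`, `matched_dtype_names`, and `load_4bit` are hypothetical names introduced here, and the heavy imports are deferred so the small helper can be checked without `torch` installed. Actually loading a model additionally assumes `transformers`, `bitsandbytes`, and a supported GPU are available.

```python
from typing import Any, Dict

# Hypothetical placeholder; substitute the checkpoint you are loading.
MODEL_ID = "your-org/your-model"


def matched_dtype_names(weight_dtype: str = "bfloat16") -> Dict[str, str]:
    """Use one dtype name for both the weights and the 4-bit compute dtype."""
    if weight_dtype not in ("bfloat16", "float16"):
        raise ValueError(f"unsupported dtype for this sketch: {weight_dtype}")
    return {"torch_dtype": weight_dtype, "bnb_4bit_compute_dtype": weight_dtype}


def load_4bit(model_id: str = MODEL_ID, weight_dtype: str = "bfloat16") -> Any:
    """Load a 4-bit quantized model with matching weight and compute dtypes."""
    # Imported lazily so matched_dtype_names works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    names = matched_dtype_names(weight_dtype)
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=getattr(torch, names["bnb_4bit_compute_dtype"]),
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, names["torch_dtype"]),
        quantization_config=quantization_config,
    )
```

Calling `load_4bit(weight_dtype="bfloat16")` corresponds to the first workaround and `load_4bit(weight_dtype="float16")` to the second; the function is not called here because it downloads a checkpoint.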
Dead Ends
Common approaches that don't work:
- 50% fail: If the model was saved in bfloat16, casting to float16 may cause overflow or underflow; the root mismatch remains if model weights are bfloat16.
- 100% fail: Bitsandbytes checks for exact dtype match; this combination will still raise the same error.