huggingface config_error ai_generated true

ValueError: You passed `quantization_config` with `bnb_4bit_compute_dtype=torch.float16` but the model weights are loaded in `torch.bfloat16`. This is incompatible.

ID: huggingface/quantization-dtype-mismatch-bnb

Fix Rate: 90%

Confidence: 87%

Evidence: 1

First Seen: 2024-03-10

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
bitsandbytes 0.43.0	active	—	—	—
transformers 4.44.0	active	—	—	—
torch 2.3.0	active	—	—	—

Root Cause

The compute dtype set in the bitsandbytes quantization config (`bnb_4bit_compute_dtype`) does not match the dtype the model weights are loaded in, so a type cast fails during the forward pass.
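
The mismatch described above can be caught before loading with a small pre-flight comparison. The following is a minimal sketch, not part of bitsandbytes or transformers: the helper name `describe_dtype_mismatch` is hypothetical, and plain strings stand in for torch dtypes so the snippet runs without torch installed.

```python
def describe_dtype_mismatch(compute_dtype, weight_dtype):
    """Return a readable message when the two dtypes differ, else None.

    Works with any objects that support ==, so torch dtypes such as
    torch.float16 could be passed directly (assumption: torch is installed).
    """
    if compute_dtype == weight_dtype:
        return None
    return (
        f"bnb_4bit_compute_dtype={compute_dtype} does not match "
        f"weight dtype {weight_dtype}"
    )


# Plain strings stand in for torch dtypes in this illustration.
print(describe_dtype_mismatch("torch.float16", "torch.bfloat16"))
print(describe_dtype_mismatch("torch.bfloat16", "torch.bfloat16"))
```

In a real script the two arguments would likely come from the `bnb_4bit_compute_dtype` value you pass to `BitsAndBytesConfig` and the `torch_dtype` you pass when loading the model; that wiring is an assumption here, not something this entry specifies.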

Official Documentation

https://huggingface.co/docs/bitsandbytes/en/quantization#dtype-mismatch

Workarounds

95% success: Set `bnb_4bit_compute_dtype=torch.bfloat16` to match the model weight dtype: `quantization_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)`.
80% success: Load the model with `torch_dtype=torch.float16` and set `bnb_4bit_compute_dtype=torch.float16`.
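
Both workarounds come down to using one dtype for the compute setting and for the loaded weights. A minimal sketch of that idea follows, assuming `transformers`, `torch`, and `bitsandbytes` are installed and a supported GPU is available. The helper names `matched_dtype_kwargs` and `load_quantized` and the model ID in the usage comment are hypothetical placeholders; `AutoModelForCausalLM` is an assumed model class, so substitute the one that fits your checkpoint.

```python
def matched_dtype_kwargs(dtype):
    """Build both keyword sets from a single dtype so they cannot drift apart."""
    bnb_kwargs = {"load_in_4bit": True, "bnb_4bit_compute_dtype": dtype}
    load_kwargs = {"torch_dtype": dtype}
    return bnb_kwargs, load_kwargs


def load_quantized(model_id, dtype):
    """Load a 4-bit model whose compute dtype equals its weight dtype."""
    # Imported lazily so the helper above can be used without these packages.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_kwargs, load_kwargs = matched_dtype_kwargs(dtype)
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(**bnb_kwargs),
        **load_kwargs,
    )


# Usage (hypothetical model ID):
#   import torch
#   model = load_quantized("your-org/your-model", torch.bfloat16)  # workaround 1
#   model = load_quantized("your-org/your-model", torch.float16)   # workaround 2
```

Deriving both settings from one variable is a design choice for this sketch; setting the two arguments by hand works equally well as long as they agree.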

Dead Ends

Common approaches that don't work:

50% fail
If the model was saved in bfloat16, casting to float16 may cause overflow or underflow, and the root mismatch remains as long as the model weights are loaded as bfloat16.
100% fail
Bitsandbytes checks for exact dtype match; this combination will still raise the same error.
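
The overflow and underflow risk mentioned in the first dead end comes from float16 having a much narrower exponent range than bfloat16. The sketch below illustrates this with the standard library only. The helpers `to_float16` and `to_bfloat16` are hypothetical names, and the bfloat16 conversion is a simplified emulation (top 16 bits of the float32 encoding, round to nearest even), not the code path torch uses.

```python
import struct


def to_float16(x):
    """Round-trip x through IEEE 754 half precision (struct format 'e')."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bfloat16(x):
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Rounds to nearest even. NaN and infinity are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    rounded = (bits + 0x7FFF + ((bits >> 16) & 1)) >> 16
    return struct.unpack("<f", struct.pack("<I", rounded << 16))[0]


large, tiny = 70000.0, 1e-8

print(to_bfloat16(large))  # 70144.0: in range, but with coarse precision
try:
    to_float16(large)
except OverflowError as exc:
    # float16 tops out at 65504, so this value cannot be packed.
    print("float16 overflow:", exc)

print(to_bfloat16(tiny) > 0.0)  # True: still nonzero in bfloat16
print(to_float16(tiny))         # 0.0: underflows to zero in float16
```

Python's `struct` raises `OverflowError` here, whereas a tensor cast would typically produce `inf` instead; the range limit being illustrated is the same.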