huggingface runtime_error ai_generated partial

RuntimeError: Placeholder storage has not been allocated on MPS device

ID: huggingface/mps-fp16-cast-error

80% Fix Rate

85% Confidence

1 Evidence

2023-06-15 First Seen

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
transformers>=4.30.0	active	—	—	—
torch>=2.0.0	active	—	—	—
macOS>=13.0	active	—	—	—

Model weights or tensors are cast to float16 on the Apple MPS backend, which does not fully support FP16, so storage allocation fails.

85% success: Load the model with torch_dtype=torch.float32 and move it with model.to('mps'), without half precision.
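The float32 fix above can be sketched as follows. This is a minimal illustration, not the entry's verified procedure: `pick_dtype_name`, `load_on_mps`, `generate`, and the `"gpt2"` model ID are hypothetical names chosen for the example. The sketch also moves the tokenized inputs to the model's device, since a mismatch between the model's device and the inputs' device is another commonly reported trigger for this message.

```python
def pick_dtype_name(device: str) -> str:
    """Name of the torch dtype to load with for a given device."""
    # Assumption for this sketch: keep float16 only on CUDA;
    # MPS (and CPU) get full precision, as the fix above suggests.
    return "float16" if device == "cuda" else "float32"


def load_on_mps(model_id: str):
    """Load a causal LM in float32 and move it to MPS (CPU if MPS is absent)."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "mps" if torch.backends.mps.is_available() else "cpu"
    dtype = getattr(torch, pick_dtype_name(device))
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    return tokenizer, model.to(device), device


def generate(prompt: str, model_id: str = "gpt2") -> str:
    """Run a short generation; "gpt2" is only a placeholder model ID."""
    tokenizer, model, device = load_on_mps(model_id)
    # Inputs must live on the same device as the model weights.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=20)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` downloads the placeholder model on first use, so substitute the model that actually raised the error.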
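70% success: Set the environment variable PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable the MPS allocator's high-watermark limit on memory allocations. Without the limit, a system-wide out-of-memory condition becomes possible.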
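One way to apply the environment variable from inside a script is sketched below. The helper name `configure_mps_watermark` is hypothetical, and setting the variable before `torch` is imported is a precaution assumed here rather than a documented requirement.

```python
import os


def configure_mps_watermark(ratio: str = "0.0") -> None:
    """Set the MPS allocator high-watermark ratio for this process."""
    # "0.0" removes the upper limit on MPS allocations.
    os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = ratio


# Call this first, then import torch and load the model.
configure_mps_watermark()
```

The same effect can be had by exporting the variable in the shell that launches the script.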
95% success: Fall back to the CPU by setting device='cpu'; use float16 on the CPU if memory is a concern.
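A rough sketch of the CPU fallback follows. `cpu_dtype_name` and `load_on_cpu` are hypothetical helper names, and the `low_memory` switch is an assumption standing in for "if memory is a concern". Note that float16 on the CPU can be slow, and some operations may lack half-precision CPU kernels depending on the PyTorch version.

```python
def cpu_dtype_name(low_memory: bool) -> str:
    """Pick the dtype name for CPU loading."""
    # float16 halves the weight memory; float32 is the safer default.
    return "float16" if low_memory else "float32"


def load_on_cpu(model_id: str, low_memory: bool = False):
    """Load a causal LM on the CPU, bypassing the MPS backend entirely."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM

    dtype = getattr(torch, cpu_dtype_name(low_memory))
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    return model.to("cpu")
```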

Common approaches that don't work:

95% fail: The MPS backend has limited FP16 support; explicit FP16 casting triggers the error.
90% fail: Half-precision conversion on MPS is not fully implemented and causes memory allocation errors.