huggingface
runtime_error
ai_generated
partial
RuntimeError: Placeholder storage has not been allocated on MPS device
ID: huggingface/mps-fp16-cast-error
Fix Rate: 80%
Confidence: 85%
Evidence: 1
First Seen: 2023-06-15
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| transformers>=4.30.0 | active | — | — | — |
| torch>=2.0.0 | active | — | — | — |
| macOS>=13.0 | active | — | — | — |
Root Cause
Model weights or tensors are cast to float16 on the Apple MPS backend, which does not fully support FP16, so the allocation fails.
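To check whether this cause applies to a given model, one option is to list the parameters that are both float16 and on the MPS device. The sketch below is illustrative only: `find_half_params_on_mps` is a hypothetical helper, not part of torch or transformers. It compares `str(param.dtype)` against `"torch.float16"` so that the function itself runs without importing torch, and the demo uses stand-in objects rather than real tensors.

```python
from types import SimpleNamespace


def find_half_params_on_mps(named_params):
    """Return the names of parameters that are float16 and live on an MPS device.

    `named_params` is any iterable of (name, param) pairs where `param` has
    `.dtype` and `.device.type`, e.g. `model.named_parameters()` in PyTorch.
    """
    flagged = []
    for name, param in named_params:
        is_half = str(param.dtype) == "torch.float16"
        on_mps = param.device.type == "mps"
        if is_half and on_mps:
            flagged.append(name)
    return flagged


# Stand-in objects for illustration only (assumption: real code would pass
# model.named_parameters() instead).
demo_params = [
    ("encoder.weight", SimpleNamespace(dtype="torch.float16",
                                       device=SimpleNamespace(type="mps"))),
    ("encoder.bias", SimpleNamespace(dtype="torch.float32",
                                     device=SimpleNamespace(type="mps"))),
    ("head.weight", SimpleNamespace(dtype="torch.float16",
                                    device=SimpleNamespace(type="cpu"))),
]
flagged_demo = find_half_params_on_mps(demo_params)
print(flagged_demo)
```

With a real model, `find_half_params_on_mps(model.named_parameters())` returning a non-empty list would suggest the float16 cast described above is in play.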
Official Documentation
https://pytorch.org/docs/stable/notes/mps.html
Workarounds
- 85% success: Load the model with torch_dtype=torch.float32 and move it with model.to('mps'), without half precision.
- 70% success: Set the environment variable PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0, which removes the upper limit on MPS memory allocations. This can destabilize the system if memory runs out.
- 95% success: Fall back to the CPU by setting device='cpu'; use float16 on the CPU if memory is a concern.
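The workarounds above can be combined into a single loading routine. This is a minimal sketch under stated assumptions, not a confirmed fix: `choose_device_and_dtype`, `relax_mps_watermark`, and `load_model` are hypothetical helper names, `AutoModel` stands in for whichever model class you actually use, and both `torch` and `transformers` are assumed to be installed when `load_model` is called.

```python
import os


def relax_mps_watermark():
    """Workaround 2: remove the upper limit on MPS memory allocations.

    Assumption: the variable needs to be set before MPS is first used, so
    calling this before importing torch is the cautious choice.
    """
    os.environ["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = "0.0"


def choose_device_and_dtype(mps_available, force_cpu=False, low_memory=False):
    """Pick (device, dtype_name) following the workarounds above.

    - MPS available and not forced to CPU: float32 on 'mps' (workaround 1).
    - Otherwise: 'cpu', with float16 only if memory is a concern (workaround 3).
    """
    if mps_available and not force_cpu:
        return "mps", "float32"
    return "cpu", "float16" if low_memory else "float32"


def load_model(model_name, force_cpu=False, low_memory=False):
    """Load a model without ever requesting float16 on MPS."""
    # Imported lazily so the helpers above work without torch installed.
    import torch
    from transformers import AutoModel

    device, dtype_name = choose_device_and_dtype(
        torch.backends.mps.is_available(), force_cpu, low_memory
    )
    model = AutoModel.from_pretrained(
        model_name, torch_dtype=getattr(torch, dtype_name)
    )
    return model.to(device), device
```

A call such as `model, device = load_model(model_name)` returns the device alongside the model so that input tensors can be moved to the same device before the forward pass.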
Dead Ends
Common approaches that don't work:
- 95% fail: The MPS backend has limited FP16 support; explicit FP16 casting triggers the error.
- 90% fail: Half-precision conversion on MPS is not fully implemented and causes memory allocation errors.