RuntimeError: CUDA error: invalid device function
ID: pytorch/cuda-error-invalid-device-function
Version compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| torch>=1.9.0 | active | — | — | — |
| CUDA 11.x | active | — | — | — |
| CUDA 12.x | active | — | — | — |
Root cause analysis
The compiled CUDA code launches a kernel for which the binary contains no image that the current GPU can run. This typically happens when the compute capability targeted at compile time does not match the compute capability of the device used at runtime.
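One way to confirm the mismatch is to compare the architectures the installed PyTorch build was compiled for against the compute capability of the GPU. The sketch below is illustrative only: `parse_arch`, `can_run`, and `check_current_device` are hypothetical helper names, and the compatibility rule is simplified (a native `sm_XY` binary is assumed to run on devices with the same major version and an equal or higher minor version, and a `compute_XY` PTX entry is assumed to be JIT-compilable for any equal or newer device). It relies on `torch.cuda.get_arch_list()` and `torch.cuda.get_device_capability()`.

```python
import re


def parse_arch(arch):
    """Split 'sm_86' or 'compute_80' into (kind, major, minor), else None.

    Entries with a suffix such as 'sm_90a' are skipped in this sketch.
    """
    match = re.fullmatch(r"(sm|compute)_(\d+)(\d)", arch)
    if match is None:
        return None
    kind, major, minor = match.groups()
    return kind, int(major), int(minor)


def can_run(capability, arch_list):
    """Return True if some entry in arch_list should be usable on the device."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue
        kind, a_major, a_minor = parsed
        # Native binaries: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX: can be JIT-compiled for an equal or newer device.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def check_current_device(index=0):
    """Print and return the check result for one GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
        return None
    if not torch.cuda.is_available():
        print("No CUDA device is visible")
        return None
    capability = torch.cuda.get_device_capability(index)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("build arch list:", arch_list)
    return can_run(capability, arch_list)
```

Calling `check_current_device()` on the affected machine and getting `False` suggests the installed build has no kernels for that GPU. This check covers PyTorch's own kernels only; a separately compiled CUDA extension has its own architecture list.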
Official documentation
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html
Solutions
-
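Set TORCH_CUDA_ARCH_LIST to include your GPU's compute capability, then rebuild from source (PyTorch itself, or the CUDA extension that raises the error):
export TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6"  # Adjust to your GPU
pip install torch --no-binary torch  # Works only if a source distribution is available; otherwise build from a source checkout
Or use a precompiled wheel for your architecture.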
-
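Restrict the process to a single GPU. Set the variable before CUDA is initialized, ideally before importing torch:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # Ensure only one GPU is visible
# Then run your code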
-
Select the device explicitly, and make sure your model and data are on the same device:
import torch
torch.cuda.set_device(0)
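The value for TORCH_CUDA_ARCH_LIST in the first solution can be derived from the GPUs actually present rather than guessed. The following is a minimal sketch under that assumption; `arch_list_value` and `detect_arch_list_value` are hypothetical names, and the output uses the same semicolon-separated "major.minor" form as the example above.

```python
def arch_list_value(capabilities):
    """Format (major, minor) tuples as a TORCH_CUDA_ARCH_LIST string."""
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


def detect_arch_list_value():
    """Build the string from all visible GPUs, or return None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    capabilities = [
        torch.cuda.get_device_capability(i)
        for i in range(torch.cuda.device_count())
    ]
    return arch_list_value(capabilities)
```

For example, a machine with one compute capability 8.6 GPU and one 7.5 GPU would give `7.5;8.6`, which can then be exported before rebuilding.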
Ineffective attempts
Common but ineffective approaches:
-
Reinstalling PyTorch with the same CUDA version
60% failure rate
The issue is not the CUDA toolkit version but the PTX/JIT compilation target architecture. Reinstalling without specifying TORCH_CUDA_ARCH_LIST does not change the compiled kernels.
-
Setting torch.backends.cudnn.enabled = False
80% failure rate
This disables cuDNN but does not affect the CUDA kernel dispatch that triggers the invalid function error. The error originates from a different layer.
-
Upgrading to the latest PyTorch version
50% failure rate
The error is architecture-specific; a newer PyTorch may still compile kernels for the same set of architectures. The root cause is the GPU not supporting the compiled kernel.
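Whichever change is tried, a small smoke test shows quickly whether the error is still present. This is a sketch only: `hits_invalid_device_function` and `tiny_cuda_workload` are hypothetical names, and the check assumes the error message contains the text "invalid device function", as in the title of this entry. The workload ends with `.item()` so that the result is copied back to the host and an asynchronous CUDA error surfaces inside the call.

```python
def hits_invalid_device_function(run):
    """Run a callable and report whether it raised this specific CUDA error."""
    try:
        run()
    except RuntimeError as exc:
        return "invalid device function" in str(exc)
    return False


def tiny_cuda_workload():
    """Launch a few small kernels on the default CUDA device."""
    import torch

    x = torch.ones(8, device="cuda")
    # .item() forces a host sync so pending CUDA errors are raised here.
    return (x * 2).sum().item()
```

Running `hits_invalid_device_function(tiny_cuda_workload)` after a reinstall or upgrade returns `True` if the same error still occurs. Other `RuntimeError`s return `False` here and are not re-raised, so a `False` result alone does not prove that the workload succeeded.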