CUDA_ERROR_INVALID_DEVICE_FUNCTION pytorch runtime_error ai_generated true

RuntimeError: CUDA error: invalid device function

ID: pytorch/cuda-error-invalid-device-function

Fix Rate: 80%
Confidence: 85%
Evidence: 1
First Seen: 2023-09-15

Version Compatibility

Version         Status   Introduced   Deprecated   Notes
torch>=1.9.0    active   -            -            -
CUDA 11.x       active   -            -            -
CUDA 12.x       active   -            -            -

Root Cause

The compiled CUDA code launches a kernel for which no binary exists for the current GPU architecture. This typically happens when the compute capabilities targeted at compile time do not include the compute capability of the device used at runtime.
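
To check for the mismatch described above, a minimal sketch can compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. It relies on `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`; the helper names `parse_arch`, `is_supported`, and `report` are hypothetical, and the matching rule is a simplification (an `sm_XY` binary is treated as usable on devices with the same major version and an equal or higher minor version, a `compute_XY` PTX entry on any equal or newer device).

```python
import re


def parse_arch(tag):
    """Split a tag such as 'sm_86' or 'compute_90' into (kind, major, minor)."""
    # An optional trailing letter (e.g. 'sm_90a') is accepted and ignored,
    # which is a simplification made for this sketch.
    match = re.fullmatch(r"(sm|compute)_(\d{2,})[a-z]?", tag)
    if match is None:
        raise ValueError(f"unrecognized architecture tag: {tag!r}")
    kind, digits = match.groups()
    # The last digit is the minor version, the rest is the major version.
    return kind, int(digits[:-1]), int(digits[-1])


def is_supported(capability, arch_list):
    """Return True if any entry in arch_list can run on the given capability."""
    major, minor = capability
    for tag in arch_list:
        kind, a_major, a_minor = parse_arch(tag)
        # Native binaries: same major version, minor not newer than the device.
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        # PTX can be JIT-compiled for any device of equal or newer capability.
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report():
    """Print the comparison for device 0, if torch and a GPU are available."""
    try:
        import torch
    except ImportError:
        print("torch is not installed")
        return
    if not torch.cuda.is_available():
        print("no CUDA device is visible")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("build architectures:", arch_list)
    print("covered:", is_supported(capability, arch_list))


if __name__ == "__main__":
    report()
```

This only inspects the PyTorch build itself. A separately compiled CUDA extension carries its own architecture list, so it may still fail even when this check passes.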

generic

Official Documentation

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html

Workarounds

  1. (90% success) Rebuild with your GPU's compute capability included in TORCH_CUDA_ARCH_LIST, or use a precompiled wheel for your architecture.
    export TORCH_CUDA_ARCH_LIST="7.5;8.0;8.6"  # Adjust to your GPU
    # --no-binary forces a source build, so it needs a source distribution;
    # if pip cannot find one, build from a source checkout instead
    pip install torch --no-binary torch
  2. (70% success) Make only one GPU visible to the process.
    import os
    # Set this before CUDA is initialized (safest: before importing torch)
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # Ensure only one GPU is visible
    # Then run your code
  3. (65% success) Select the device explicitly.
    import torch
    torch.cuda.set_device(0)
    # Ensure your model and data are on the same device
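
For the first workaround, the value of TORCH_CUDA_ARCH_LIST can be derived from the GPUs that are actually present rather than typed by hand. The following is a sketch under stated assumptions: the helper names `arch_list_value` and `detect_capabilities` are hypothetical, and appending `+PTX` to the highest entry is an optional choice made here so that newer GPUs can fall back to JIT compilation.

```python
def arch_list_value(capabilities, include_ptx=True):
    """Format (major, minor) tuples as a TORCH_CUDA_ARCH_LIST string."""
    unique = sorted(set(capabilities))  # drop duplicates, lowest first
    entries = [f"{major}.{minor}" for major, minor in unique]
    if include_ptx and entries:
        # Also embed PTX for the highest architecture.
        entries[-1] += "+PTX"
    return ";".join(entries)


def detect_capabilities():
    """Return the compute capability of every visible GPU, or [] if none."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_capability(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    value = arch_list_value(detect_capabilities())
    if value:
        print(f'export TORCH_CUDA_ARCH_LIST="{value}"')
    else:
        print("no GPU detected; set TORCH_CUDA_ARCH_LIST manually")
```

Run it on the machine that will execute the model, then paste the printed export line into the shell used for the source build.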

Dead Ends

Common approaches that don't work:

  1. Reinstalling PyTorch with the same CUDA version (60% fail)

    The issue is not the CUDA toolkit version but the GPU architectures the kernels were compiled for. Reinstalling without specifying TORCH_CUDA_ARCH_LIST does not change the compiled kernels.

  2. Setting torch.backends.cudnn.enabled = False (80% fail)

    This disables cuDNN but does not affect the CUDA kernel dispatch that triggers the invalid function error. The error originates from a different layer.

  3. Upgrading to the latest PyTorch version (50% fail)

    The error is architecture-specific; a newer PyTorch may still compile kernels for the same set of architectures. The root cause is that none of the compiled kernels target the GPU's compute capability.