NVRTC_ERROR_COMPILATION (6)
cuda
build_error
ai_generated: true
RuntimeError: NVRTC compilation failed: error: Ptx assembly requires .target sm_52 or higher. Current target: sm_50
ID: cuda/nvrtc-ptx-arch-mismatch
Fix Rate: 92%
Confidence: 88%
Evidence: 1
First Seen: 2023-03-10
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| CUDA 11.7 | active | — | — | — |
| CUDA 12.0 | active | — | — | — |
| NVRTC 11.7 | active | — | — | — |
| NVRTC 12.0 | active | — | — | — |
Root Cause
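The kernel contains PTX (inline asm or PTX passed through NVRTC) that requires compute capability 5.2 or higher, for example through a '.target sm_52' directive or an instruction introduced at that level, while the compilation target is sm_50 (Maxwell 5.0). PTX itself is available on every architecture, and sm_50 already supports unified addressing and atomic operations, so the failure is a mismatch between the target the PTX requires and the target requested, not a missing PTX capability.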
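The mismatch described above can be checked before compiling. The following is a minimal sketch, assuming the PTX text carries an explicit `.target sm_XX` directive; the helper names `ptx_required_target` and `target_mismatch` are hypothetical and not part of any CUDA library.

```python
import re

# Matches a PTX target directive such as ".target sm_52" at the start of a line.
_TARGET_RE = re.compile(r"^\s*\.target\s+sm_(\d+)", re.MULTILINE)


def ptx_required_target(ptx_text):
    """Return the number from the first `.target sm_XX` directive, or None."""
    match = _TARGET_RE.search(ptx_text)
    return int(match.group(1)) if match else None


def target_mismatch(ptx_text, compile_target):
    """True when the PTX asks for a newer target than the one compiled for."""
    required = ptx_required_target(ptx_text)
    return required is not None and required > compile_target


# Illustrative PTX header only; a real module would contain kernel code too.
sample_ptx = """
.version 7.0
.target sm_52
.address_size 64
"""

if target_mismatch(sample_ptx, 50):
    print("PTX requires sm_%d but the compile target is sm_50"
          % ptx_required_target(sample_ptx))
```

The check only reads the directive; it does not detect individual instructions that need a newer architecture without a matching `.target` line.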
Official Documentation
https://docs.nvidia.com/cuda/nvrtc/index.html#nvrtc-compilation
Workarounds
- 95% success: Set the target architecture to sm_52 or higher in the NVRTC compilation options. Example: pass '-arch=sm_52' (long form '--gpu-architecture=sm_52', or 'compute_52' to target the virtual architecture). Note that the CUDAARCHS environment variable is read by CMake for offline builds; it does not change the options passed to NVRTC at run time.
- 85% success: If the GPU actually supports sm_52 (e.g., Tesla M40 or newer), confirm that the installed CUDA toolkit accepts that target; sm_52 support predates CUDA 8.0, so every version listed under Version Compatibility qualifies. If the GPU is sm_50 only (e.g., GeForce GTX 750 Ti), replace the PTX assembly with equivalent CUDA C code that does not need sm_52 features.
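The first workaround amounts to passing a different architecture string in the NVRTC option list. Below is a minimal sketch of building that string from a (major, minor) compute capability; the helper name `nvrtc_arch_option` is hypothetical, and how the resulting list reaches `nvrtcCompileProgram` depends on the binding or framework in use.

```python
def nvrtc_arch_option(capability, virtual=True):
    """Format a (major, minor) compute capability as an NVRTC option string.

    virtual=True targets a virtual architecture (compute_XX, PTX output);
    virtual=False targets a real one (sm_XX).
    """
    major, minor = capability
    prefix = "compute" if virtual else "sm"
    return f"--gpu-architecture={prefix}_{major}{minor}"


# Option list that would accompany the kernel source in the compile call.
options = [nvrtc_arch_option((5, 2))]
print(options)
```

Targeting the virtual architecture keeps the output as PTX, which the driver compiles for the device at load time; whether that or a real sm_XX target is preferable depends on the application.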
Dead Ends
Common approaches that don't work:
- 60% fail: Removing the PTX instructions entirely. This may break kernel functionality unless they are replaced with equivalent code; the error only occurs if PTX is actually used.
- 95% fail: Setting a higher architecture such as sm_86 when the GPU is older (e.g., sm_50). This moves the failure from compile time to load time, with a different error: 'no kernel image is available for execution on the device'.
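One way to avoid the second dead end is to derive the compile target from the device's own compute capability instead of hard-coding a high value. The sketch below assumes the capability is already known as a (major, minor) tuple, for instance from a device query such as `torch.cuda.get_device_capability()`; the function name `choose_compile_target` and the default requirement of (5, 2) are illustrative assumptions.

```python
def choose_compile_target(device_cc, required_cc=(5, 2)):
    """Pick a compile target that the device can actually run.

    Returns "compute_XY" for the device's own capability when it meets the
    requirement of the PTX, or None when the device is too old and the PTX
    path needs a plain CUDA C fallback instead.
    """
    if device_cc < required_cc:
        return None
    return "compute_{}{}".format(*device_cc)


for cc in [(5, 0), (5, 2), (8, 6)]:
    target = choose_compile_target(cc)
    if target is None:
        print(cc, "-> below requirement, use a non-PTX fallback")
    else:
        print(cc, "->", target)
```

Returning None rather than raising leaves the fallback decision to the caller, which matches the two branches of the second workaround.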