NVRTC_ERROR_COMPILATION (6) cuda build_error ai_generated true

RuntimeError: NVRTC compilation failed: error: Ptx assembly requires .target sm_52 or higher. Current target: sm_50

ID: cuda/nvrtc-ptx-arch-mismatch

Fix rate: 92%
Confidence: 88%
Evidence count: 1
First seen: 2023-03-10

Version compatibility

Version       Status    Introduced    Deprecated    Notes
CUDA 11.7     active
CUDA 12.0     active
NVRTC 11.7    active
NVRTC 12.0    active

Root cause analysis

PTX assembly (inline or compiled through NVRTC) that uses instructions with a minimum target of sm_52 cannot be assembled when the compilation target is sm_50. Both targets are Maxwell-generation (compute capability 5.0 and 5.2), but the PTX ISA gates individual instructions by target architecture, so the target passed to NVRTC must be at least as high as the highest requirement in the PTX.

generic

Official documentation

https://docs.nvidia.com/cuda/nvrtc/index.html#nvrtc-compilation

Solutions

  1. If the GPU has compute capability 5.2 or higher, set the NVRTC target to sm_52 or higher in the compilation options, for example '--gpu-architecture=sm_52' (short form '-arch=sm_52'). The CUDAARCHS environment variable is read by CMake for ahead-of-time builds and does not change the target NVRTC uses at runtime.
  2. If the GPU supports sm_52 (e.g., Tesla M40 or newer), no toolkit upgrade is needed: the CUDA 11.7 and 12.0 toolkits listed above both support sm_52. If the GPU is limited to sm_50 (e.g., GeForce GTX 750 Ti), replace the inline PTX with equivalent CUDA C++ code that avoids the instructions requiring sm_52.
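
The two solutions can be combined into one decision made before calling NVRTC. The following is a minimal sketch in Python, not part of any library: the names REQUIRED_CC, nvrtc_arch_option and plan_compilation are hypothetical, and the device compute capability is assumed to be available as a (major, minor) tuple (for example from torch.cuda.get_device_capability() when PyTorch is in use).

```python
from typing import Dict, Tuple, Union

# Minimum target named in the error message: sm_52 means compute capability 5.2.
REQUIRED_CC: Tuple[int, int] = (5, 2)


def nvrtc_arch_option(cc: Tuple[int, int]) -> str:
    """Format a compute capability such as (5, 2) as an NVRTC option string."""
    major, minor = cc
    return f"--gpu-architecture=sm_{major}{minor}"


def plan_compilation(
    device_cc: Tuple[int, int],
    required_cc: Tuple[int, int] = REQUIRED_CC,
) -> Dict[str, Union[str, bool]]:
    """Pick the NVRTC target and the kernel variant for a given device.

    The target always matches the device. The inline-PTX variant is only
    selected when the device meets the PTX requirement; otherwise the caller
    should compile a plain CUDA C++ fallback (hypothetical, supplied by you).
    """
    # Tuples compare element by element, so (5, 0) >= (5, 2) is False.
    use_inline_ptx = device_cc >= required_cc
    return {
        "option": nvrtc_arch_option(device_cc),
        "use_inline_ptx": use_inline_ptx,
    }
```

For a 5.0 device, plan_compilation((5, 0)) keeps the target at sm_50 and signals that the fallback kernel source should be compiled instead of the PTX variant; for a 5.2 or newer device the PTX variant is allowed and the target is raised accordingly.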

Ineffective attempts

Common but ineffective approaches:

  1. 60% failure rate

    Removing PTX instructions entirely may break the kernel functionality; the error only occurs if PTX is actually used.

  2. 95% failure rate

    Setting a higher architecture like sm_86 on an older GPU (e.g., sm_50) causes a different error: 'no kernel image is available for execution on the device'.
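
A simple guard can catch the "target too new for the device" case before compilation. This is a hedged sketch: parse_arch and target_fits_device are hypothetical helper names, and the check only covers a requested target that exceeds the device's compute capability, which is the situation described above.

```python
import re
from typing import Tuple


def parse_arch(arch: str) -> Tuple[int, int]:
    """Parse a target such as 'sm_86' or 'compute_52' into (major, minor)."""
    match = re.fullmatch(r"(?:sm|compute)_(\d+)(\d)[a-z]?", arch)
    if match is None:
        raise ValueError(f"unrecognised architecture string: {arch!r}")
    # The last digit is the minor version; everything before it is the major.
    return int(match.group(1)), int(match.group(2))


def target_fits_device(requested: str, device_cc: Tuple[int, int]) -> bool:
    """Return False when the requested target is newer than the device."""
    return parse_arch(requested) <= device_cc
```

With these helpers, target_fits_device('sm_86', (5, 0)) is False, so the caller can reject the option up front instead of hitting the runtime error when the kernel is loaded.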