CUDNN_STATUS_ARCH_MISMATCH
cuda
type_error
ai_generated
true
RuntimeError: Tensor Cores are not supported on the current device architecture (compute capability < 7.0)
ID: cuda/tensor-core-unsupported-arch
Fix Rate: 90%
Confidence: 86%
Evidence: 1
First Seen: 2024-01-20
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| CUDA 11.0 | active | — | — | — |
| CUDA 12.1 | active | — | — | — |
| CUDA 12.4 | active | — | — | — |
Root Cause
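The GPU's compute capability is below 7.0 (Volta), the first architecture with Tensor Cores. Tensor Core operations such as float16 mixed-precision training require compute capability 7.0 or higher; the bfloat16 and TF32 Tensor Core paths require 8.0 (Ampere) or higher.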
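To confirm the root cause on a given machine, the compute capability can be read at runtime and compared against the thresholds above. The following is a minimal sketch, not a definitive check: the helper names `describe_capability` and `current_device_report` are hypothetical, and the 7.0 / 8.0 cutoffs are applied per architecture generation, so individual cards within a generation may differ.

```python
def describe_capability(major, minor):
    """Map a (major, minor) compute capability to expected Tensor Core support.

    Assumption: support is decided purely by architecture generation
    (7.0 = Volta for float16, 8.0 = Ampere for bfloat16 and TF32).
    """
    capability = (major, minor)
    return {
        "compute_capability": f"{major}.{minor}",
        "fp16_tensor_cores": capability >= (7, 0),
        "bf16_tf32_tensor_cores": capability >= (8, 0),
    }


def current_device_report(device_index=0):
    """Report on the active CUDA device, or None if CUDA is unavailable."""
    import torch  # imported lazily so the pure helper above works without PyTorch

    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device_index)
    return describe_capability(major, minor)


if __name__ == "__main__":
    print(current_device_report())
```

If `fp16_tensor_cores` is reported as false, the error in this entry is expected whenever a code path insists on Tensor Core kernels.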
Official Documentation
https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnSetTensorNdDescriptor
Workarounds
-
90% success
Use float32 precision instead of float16: replace model.half() with model.float(), and in training disable autocast with torch.amp.autocast(device_type='cuda', enabled=False). You can also set torch.backends.cuda.matmul.allow_tf32 = False and torch.backends.cudnn.allow_tf32 = False, but TF32 is only used on GPUs with compute capability 8.0 or higher, so these flags should not change behavior on the affected devices.
-
95% success
If Tensor Cores are essential, migrate to a GPU with compute capability >= 7.0 (e.g., Tesla V100, RTX 20 series, or newer). Check your GPU's compute capability at https://developer.nvidia.com/cuda-gpus.
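The float32 fallback described above can be made conditional on the detected compute capability, so the same script runs on both older and newer GPUs. This is a sketch under stated assumptions: `choose_precision` and `make_autocast` are hypothetical helper names, and the dtype chosen per generation (bfloat16 on 8.0+, float16 on 7.x, float32 below 7.0) is one reasonable policy rather than a PyTorch default.

```python
def choose_precision(capability):
    """Return (use_amp, dtype_name) for a (major, minor) compute capability.

    Assumed policy: bfloat16 on 8.0+, float16 on 7.x, plain float32 below 7.0.
    """
    major, _minor = capability
    if major >= 8:
        return True, "bfloat16"
    if major >= 7:
        return True, "float16"
    return False, "float32"


def make_autocast(capability):
    """Build an autocast context that is disabled on GPUs without Tensor Cores."""
    import torch  # imported lazily so choose_precision works without PyTorch

    use_amp, dtype_name = choose_precision(capability)
    if not use_amp:
        # Equivalent to the workaround above: autocast stays off, math runs in float32.
        return torch.amp.autocast(device_type="cuda", enabled=False)
    return torch.amp.autocast(device_type="cuda", dtype=getattr(torch, dtype_name))


if __name__ == "__main__":
    import torch

    if torch.cuda.is_available():
        capability = torch.cuda.get_device_capability()
        model = torch.nn.Linear(16, 4).cuda().float()  # keep weights in float32
        with make_autocast(capability):
            out = model(torch.randn(8, 16, device="cuda"))
        print(capability, out.dtype)
```

When the float16 branch is used for training, a gradient scaler is normally paired with autocast; that part is omitted here to keep the sketch short.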
Dead Ends
Common approaches that don't work:
-
90% fail
Upgrading the CUDA toolkit does not add Tensor Core support to older GPU architectures.
-
80% fail
Setting environment variable CUDA_LAUNCH_BLOCKING=1 does not enable Tensor Cores; it only serializes kernel launches.