CUDNN_STATUS_BAD_PARAM cuda runtime_error ai_generated true

运行时错误:cuDNN 错误:调用 cudnnBatchNormalizationForwardTraining 时出现 CUDNN_STATUS_BAD_PARAM

RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnBatchNormalizationForwardTraining

ID: cuda/cudnn-batch-norm-bad-param

其他格式: JSON · Markdown 中文 · English
78%修复率
83%置信度
1证据数
2024-01-18首次发现

版本兼容性

版本状态引入弃用备注
cuDNN 8.9.5 active
CUDA 12.1 active
PyTorch 2.2.0 active
TensorFlow 2.15 active

根因分析

向 cuDNN 批归一化传递了无效参数,例如 epsilon 为零、动量为负值,或输入、缩放和偏置张量的形状不匹配。

English

An invalid parameter was passed to cuDNN batch normalization, such as a zero epsilon value, negative momentum, or mismatched tensor shapes between input, scale, and bias.

generic

官方文档

https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnBatchNormalizationForwardTraining

解决方案

  1. import torch; bn = torch.nn.BatchNorm2d(64, eps=1e-5, momentum=0.1)
  2. torch.backends.cudnn.enabled = False

无效尝试

常见但无效的做法:

  1. 95% 失败

    Batch size does not affect parameter validation; the bad parameter error occurs regardless of batch size.

  2. 90% 失败

    Deterministic mode only affects algorithm selection, not parameter checking; the bad parameter is still detected.