CUDNN_STATUS_BAD_PARAM cuda runtime_error ai_generated true

RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnBatchNormalizationForwardTraining

ID: cuda/cudnn-batch-norm-bad-param

Fix Rate: 78%
Confidence: 83%
Evidence: 1
First Seen: 2024-01-18

Version Compatibility

Version         | Status | Introduced | Deprecated | Notes
cuDNN 8.9.5     | active |            |            |
CUDA 12.1       | active |            |            |
PyTorch 2.2.0   | active |            |            |
TensorFlow 2.15 | active |            |            |

Root Cause

An invalid parameter was passed to cuDNN batch normalization, such as a zero epsilon value, negative momentum, or mismatched tensor shapes between input, scale, and bias.
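The three conditions named above can be checked before the layer ever reaches cuDNN. The following is a minimal sketch in plain Python, not part of PyTorch or cuDNN: the function name `find_batchnorm_problems` and its argument layout are hypothetical, and it assumes a channels-second input layout (N, C, ...) with one scale and one bias value per channel.

```python
from typing import List, Optional, Sequence


def find_batchnorm_problems(
    input_shape: Sequence[int],
    scale_shape: Sequence[int],
    bias_shape: Sequence[int],
    eps: float,
    momentum: Optional[float],
) -> List[str]:
    """Return readable descriptions of suspicious batch-norm parameters.

    Hypothetical helper: an empty list means none of the checked
    conditions were violated, not that cuDNN is guaranteed to accept the call.
    """
    problems = []

    # A zero (or negative) epsilon is one of the causes listed above.
    if eps <= 0:
        problems.append(f"eps must be positive, got {eps}")

    # Negative momentum is another listed cause; None is left unchecked.
    if momentum is not None and momentum < 0:
        problems.append(f"momentum must be non-negative, got {momentum}")

    # Scale and bias are assumed to hold one value per channel (dimension 1).
    if len(input_shape) < 2:
        problems.append(f"input needs at least 2 dimensions, got {tuple(input_shape)}")
    else:
        channels = input_shape[1]
        for name, shape in (("scale", scale_shape), ("bias", bias_shape)):
            if tuple(shape) != (channels,):
                problems.append(
                    f"{name} shape {tuple(shape)} does not match ({channels},)"
                )

    return problems
```

With PyTorch, the arguments would typically come from `tuple(x.shape)`, `tuple(bn.weight.shape)`, `tuple(bn.bias.shape)`, `bn.eps`, and `bn.momentum` for an input `x` and a `BatchNorm2d` layer `bn`.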



Official Documentation

https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnBatchNormalizationForwardTraining

Workarounds

  1. 85% success: Construct the layer with explicit, valid hyperparameters (a positive eps and a non-negative momentum).
    import torch; bn = torch.nn.BatchNorm2d(64, eps=1e-5, momentum=0.1)
  2. 75% success: Disable cuDNN so that PyTorch falls back to its native batch normalization kernels.
    import torch; torch.backends.cudnn.enabled = False


Dead Ends

Common approaches that don't work:

  1. 95% fail: Changing the batch size.

    Batch size does not affect parameter validation; the bad parameter error occurs regardless of batch size.

  2. 90% fail: Enabling deterministic mode.

    Deterministic mode only affects algorithm selection, not parameter checking; the bad parameter is still detected.