CUDNN_STATUS_BAD_PARAM cuda runtime_error ai_generated true

RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnRNNForwardTraining with input size mismatch

ID: cuda/cudnn-rnn-bad-dims

Fix Rate: 80%
Confidence: 84%
Evidence: 1
First Seen: 2023-11-05

Version Compatibility

Version            Status   Introduced   Deprecated   Notes
cuDNN 8.9.2        active
CUDA 12.0          active
PyTorch 2.1.0      active
TensorFlow 2.14    active

Root Cause

The input tensor to an RNN layer has a feature dimension that does not match the expected input size defined in the cuDNN RNN descriptor, or the batch size is inconsistent between layers.
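
One way to catch this before the call reaches cuDNN is to compare shapes up front. The sketch below is a hypothetical helper (check_rnn_input is not part of PyTorch or cuDNN) that works on plain shape tuples. It assumes the 3-D layouts PyTorch documents for torch.nn.LSTM: (seq_len, batch, input_size) by default, (batch, seq_len, input_size) with batch_first=True, and an initial hidden state shaped (num_layers * num_directions, batch, hidden_size).

```python
def check_rnn_input(input_shape, input_size, batch_first=False, h0_shape=None):
    """Validate a 3-D RNN input shape and return (seq_len, batch, features).

    Hypothetical helper for illustration; not a PyTorch or cuDNN API.
    """
    if len(input_shape) != 3:
        raise ValueError(f"expected a 3-D input, got shape {tuple(input_shape)}")

    # Interpret the first two axes according to the layout flag.
    if batch_first:
        batch, seq_len, features = input_shape
    else:
        seq_len, batch, features = input_shape

    # The last dimension must equal the input_size the layer was built with.
    if features != input_size:
        raise ValueError(
            f"feature dimension {features} does not match input_size {input_size}"
        )

    # The initial hidden state keeps batch in position 1 in both layouts.
    if h0_shape is not None and h0_shape[1] != batch:
        raise ValueError(
            f"hidden state batch {h0_shape[1]} does not match input batch {batch}"
        )

    return seq_len, batch, features


print(check_rnn_input((10, 32, 128), input_size=128))
print(check_rnn_input((32, 10, 128), input_size=128, batch_first=True))
```

With PyTorch, the same check could be called as check_rnn_input(tuple(x.shape), rnn.input_size, rnn.batch_first), so a mismatch surfaces as a readable ValueError rather than a cuDNN status code.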

generic


Official Documentation

https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnRNNForwardTraining

Workarounds

  1. 90% success: Build the LSTM with input_size equal to the last dimension of the input tensor. In the default layout the input is (seq_len, batch, input_size).
    import torch
    rnn = torch.nn.LSTM(input_size=128, hidden_size=256, num_layers=2)
    input_tensor = torch.randn(10, 32, 128)  # seq_len=10, batch=32, input_size=128
    output, _ = rnn(input_tensor)
  2. 85% success: For batch-major data, set batch_first=True so the input is read as (batch, seq_len, input_size).
    import torch
    rnn = torch.nn.LSTM(input_size=128, hidden_size=256, batch_first=True)
    input_tensor = torch.randn(32, 10, 128)  # batch=32, seq_len=10, input_size=128
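
To see what each workaround implies for the surrounding tensors, the shapes can be tabulated without running a model. This is a minimal sketch using a hypothetical helper, lstm_shapes, that follows the shapes documented for torch.nn.LSTM and assumes proj_size=0 (no projection); it does no computation and is not a PyTorch API.

```python
def lstm_shapes(seq_len, batch, input_size, hidden_size,
                num_layers=1, bidirectional=False, batch_first=False):
    """Return the input, output, and state shapes an LSTM is expected to use.

    Hypothetical helper; assumes proj_size=0.
    """
    d = 2 if bidirectional else 1
    if batch_first:
        inp = (batch, seq_len, input_size)
        out = (batch, seq_len, d * hidden_size)
    else:
        inp = (seq_len, batch, input_size)
        out = (seq_len, batch, d * hidden_size)
    # h_n and c_n are not affected by batch_first.
    state = (d * num_layers, batch, hidden_size)
    return {"input": inp, "output": out, "h_n": state, "c_n": state}


# Workaround 1: default (time-major) layout, two layers.
print(lstm_shapes(10, 32, 128, 256, num_layers=2))
# Workaround 2: batch-major layout, one layer.
print(lstm_shapes(10, 32, 128, 256, batch_first=True))
```

Note that only the input and output swap their first two axes under batch_first=True; the hidden and cell states keep the batch in position 1, which is a common source of the batch size inconsistency named in the root cause.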


Dead Ends

Common approaches that don't work:

  1. 90% fail: Changing the number of layers. Layer count does not affect the input dimension mismatch; the error comes from the first layer's input size.

  2. 85% fail: Switching the RNN cell type. The input dimension requirement is independent of cell type, so the mismatch persists.
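
Both dead ends follow from how the input-to-hidden weights are shaped. The sketch below uses a hypothetical helper, weight_ih_shape, that mirrors the weight_ih_l[k] shapes documented for torch.nn.RNN, GRU, and LSTM (assuming proj_size=0): the cell type only scales the number of rows, and layers after the first read the previous layer's hidden output, so the first layer's column count is always input_size.

```python
# Number of gate blocks stacked in the input-to-hidden weight matrix.
GATES = {"RNN": 1, "GRU": 3, "LSTM": 4}


def weight_ih_shape(cell, layer, input_size, hidden_size, bidirectional=False):
    """Return (rows, cols) of the input-to-hidden weight for a given layer.

    Hypothetical helper for illustration; assumes proj_size=0.
    """
    d = 2 if bidirectional else 1
    # Only layer 0 consumes the raw input; deeper layers consume hidden outputs.
    in_features = input_size if layer == 0 else d * hidden_size
    return (GATES[cell] * hidden_size, in_features)


for cell in GATES:
    for layer in range(3):
        print(cell, layer, weight_ih_shape(cell, layer, 128, 256))
```

For every cell type, layer 0 reports 128 columns, so an input whose last dimension is not 128 fails no matter how many layers are stacked or which cell is used.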