CUDNN_STATUS_BAD_PARAM
cuda
runtime_error
ai_generated: true
RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnRNNForwardTraining with input size mismatch
ID: cuda/cudnn-rnn-bad-dims
Fix Rate: 80%
Confidence: 84%
Evidence: 1
First Seen: 2023-11-05
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| cuDNN 8.9.2 | active | — | — | — |
| CUDA 12.0 | active | — | — | — |
| PyTorch 2.1.0 | active | — | — | — |
| TensorFlow 2.14 | active | — | — | — |
Root Cause
The input tensor to an RNN layer has a feature dimension that does not match the expected input size defined in the cuDNN RNN descriptor, or the batch size is inconsistent between layers.
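The mismatch described above can be caught before the tensor ever reaches cuDNN. The following is a minimal sketch, not part of PyTorch or cuDNN: `check_rnn_input` is a hypothetical helper that works on a plain shape tuple and compares its last axis with the layer's configured input size.

```python
def check_rnn_input(shape, input_size):
    """Hypothetical pre-flight check: the last axis of `shape` must equal `input_size`."""
    if len(shape) not in (2, 3):
        # Unbatched inputs are 2-D, batched inputs are 3-D.
        raise ValueError(
            f"expected a 2-D or 3-D input, got {len(shape)}-D shape {tuple(shape)}"
        )
    features = shape[-1]
    if features != input_size:
        raise ValueError(
            f"feature dimension is {features}, "
            f"but the layer expects input_size={input_size}"
        )
    return True


# A (seq_len, batch, features) shape that matches input_size=128 passes.
ok = check_rnn_input((10, 32, 128), 128)
```

With PyTorch, one would presumably call it as `check_rnn_input(tuple(x.shape), rnn.input_size)` just before `rnn(x)`, so that a bad shape surfaces as a readable Python error rather than `CUDNN_STATUS_BAD_PARAM`.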
Official Documentation
https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnRNNForwardTraining
Workarounds
- 90% success
import torch
rnn = torch.nn.LSTM(input_size=128, hidden_size=256, num_layers=2)
input_tensor = torch.randn(10, 32, 128)  # (seq_len, batch, input_size); last dim equals input_size
output, _ = rnn(input_tensor)
- 85% success
import torch
rnn = torch.nn.LSTM(input_size=128, hidden_size=256, batch_first=True)
input_tensor = torch.randn(32, 10, 128)  # (batch, seq_len, input_size) because batch_first=True
output, _ = rnn(input_tensor)
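The two workarounds differ only in where the batch axis sits. As a rough illustration, the sketch below names the axes of a 3-D input shape under either layout and derives the initial hidden-state shape that PyTorch documents for a batched input. Both function names are hypothetical, and the sketch assumes `proj_size=0`.

```python
def describe_layout(shape, batch_first=False):
    """Name the axes of a 3-D RNN input shape according to batch_first."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-D shape, got {tuple(shape)}")
    if batch_first:
        batch, seq_len, features = shape
    else:
        seq_len, batch, features = shape
    return {"seq_len": seq_len, "batch": batch, "features": features}


def expected_h0_shape(shape, hidden_size, num_layers=1,
                      batch_first=False, bidirectional=False):
    """Shape of the initial hidden state h_0 for a batched input (proj_size=0 assumed).

    h_0 is (num_layers * num_directions, batch, hidden_size) in both layouts;
    batch_first only changes the input and output tensors.
    """
    batch = describe_layout(shape, batch_first)["batch"]
    num_directions = 2 if bidirectional else 1
    return (num_layers * num_directions, batch, hidden_size)


# The shapes from the two workarounds describe the same data in different layouts.
time_major = describe_layout((10, 32, 128))
batch_major = describe_layout((32, 10, 128), batch_first=True)
h0_shape = expected_h0_shape((32, 10, 128), hidden_size=256,
                             num_layers=2, batch_first=True)
```

If a hand-built `h_0` is passed to the layer, comparing its shape against `expected_h0_shape(...)` is one way to rule out the batch-size inconsistency mentioned under Root Cause.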
Dead Ends
Common approaches that don't work:
- 90% fail: Changing the number of layers. Layer count does not affect the input dimension mismatch; the error is about the first layer's input size.
- 85% fail: Switching the RNN cell type. The input dimension requirement is independent of cell type; the mismatch persists.
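Why neither dead end helps can be seen from the input-to-hidden weight shapes that PyTorch documents for `weight_ih_l[k]`. The sketch below recomputes those shapes in plain Python; `weight_ih_shapes` and `GATES` are hypothetical names, and `proj_size=0` is assumed.

```python
# Number of gate blocks stacked in weight_ih for each cell type.
GATES = {"RNN": 1, "GRU": 3, "LSTM": 4}


def weight_ih_shapes(cell, input_size, hidden_size,
                     num_layers=1, bidirectional=False):
    """Return the (rows, columns) of the input-to-hidden weights for each layer."""
    num_directions = 2 if bidirectional else 1
    shapes = []
    for layer in range(num_layers):
        # Only layer 0 consumes the user's input; deeper layers consume
        # the previous layer's output.
        in_features = input_size if layer == 0 else hidden_size * num_directions
        shapes.append((GATES[cell] * hidden_size, in_features))
    return shapes


lstm_two_layers = weight_ih_shapes("LSTM", 128, 256, num_layers=2)
gru_one_layer = weight_ih_shapes("GRU", 128, 256)
```

In every configuration the first layer's column count stays equal to `input_size`: adding layers only appends entries, and changing the cell type only changes the row count. The input's feature dimension therefore still has to match `input_size`.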