CUDNN_STATUS_BAD_PARAM cuda runtime_error ai_generated true

RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnRNNForwardTraining with input size mismatch

ID: cuda/cudnn-rnn-bad-dims


Fix Rate: 80%

Confidence: 84%

Evidence: 1

First Seen: 2023-11-05

Version Compatibility

Version	Status	Introduced	Deprecated	Notes
cuDNN 8.9.2	active	—	—	—
CUDA 12.0	active	—	—	—
PyTorch 2.1.0	active	—	—	—
TensorFlow 2.14	active	—	—	—

Root Cause

The input tensor to an RNN layer has a feature dimension that does not match the expected input size defined in the cuDNN RNN descriptor, or the batch size is inconsistent between layers.
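
The mismatch described above can be caught before the call reaches cuDNN. The following is a minimal sketch, not part of the original entry: `check_rnn_input` is a hypothetical helper that assumes a batched 3-D input and works on a plain shape tuple (for example `tuple(x.shape)`), so it does not depend on any particular framework.

```python
def check_rnn_input(shape, input_size, batch_first=False):
    """Validate a batched 3-D RNN input shape against the layer's input_size.

    Hypothetical helper: `shape` is a plain tuple such as tuple(x.shape).
    Returns (seq_len, batch, features) or raises ValueError.
    """
    if len(shape) != 3:
        raise ValueError(
            f"expected a 3-D input, got {len(shape)}-D shape {tuple(shape)}"
        )
    # The position of the batch axis depends on the batch_first setting.
    if batch_first:
        batch, seq_len, features = shape
    else:
        seq_len, batch, features = shape
    # The last axis must match the input_size the layer was built with.
    if features != input_size:
        raise ValueError(
            f"feature dimension is {features}, "
            f"but the layer expects input_size={input_size}"
        )
    return seq_len, batch, features
```

Calling it as `check_rnn_input(tuple(x.shape), rnn.input_size, rnn.batch_first)` just before the forward pass would turn the opaque cuDNN status into a readable Python error.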

generic

Official Documentation

https://docs.nvidia.com/deeplearning/cudnn/api/index.html#cudnnRNNForwardTraining

Workarounds

90% success: build the input so that its last dimension equals input_size (128 here). With the default batch_first=False the layout is (seq_len, batch, input_size).
```
import torch

# input_size must equal the last dimension of the input tensor
rnn = torch.nn.LSTM(input_size=128, hidden_size=256, num_layers=2)
input_tensor = torch.randn(10, 32, 128)  # (seq_len, batch, input_size)
output, _ = rnn(input_tensor)
```
85% success: if the data is laid out batch-first, construct the layer with batch_first=True and pass the input as (batch, seq_len, input_size).
```
import torch

rnn = torch.nn.LSTM(input_size=128, hidden_size=256, batch_first=True)
input_tensor = torch.randn(32, 10, 128)  # (batch, seq_len, input_size)
```
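
The two workarounds differ only in where the batch axis sits. As a rough sketch of the shapes involved, the hypothetical helper `expected_lstm_shapes` below lists what an LSTM expects for the input and for the optional initial states. It assumes batched input and no projection (`proj_size=0`), and relies on the documented PyTorch behavior that `batch_first` does not apply to the hidden and cell states.

```python
def expected_lstm_shapes(seq_len, batch, input_size, hidden_size,
                         num_layers=1, bidirectional=False, batch_first=False):
    """Return the shapes an LSTM expects (hypothetical helper, proj_size=0)."""
    num_directions = 2 if bidirectional else 1
    if batch_first:
        input_shape = (batch, seq_len, input_size)
    else:
        input_shape = (seq_len, batch, input_size)
    # batch_first does not change the layout of the initial states.
    state_shape = (num_layers * num_directions, batch, hidden_size)
    return {"input": input_shape, "h0": state_shape, "c0": state_shape}
```

Comparing these tuples against the actual tensors also surfaces a batch size that differs between the input and a manually supplied initial state.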

Dead Ends

Common approaches that don't work:

90% fail: changing the number of layers
Layer count does not change the input size the first layer expects, so the mismatch persists.
85% fail: switching the cell type (for example, LSTM to GRU)
The input dimension requirement is independent of cell type, so the mismatch persists.
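
To illustrate why adding or removing layers cannot resolve the error, the sketch below computes the input width each stacked layer consumes. `per_layer_input_sizes` is a hypothetical helper; it assumes the PyTorch convention (without projections) that only layer 0 reads `input_size` features, while every later layer reads the previous layer's output of width `hidden_size * num_directions`.

```python
def per_layer_input_sizes(input_size, hidden_size, num_layers, bidirectional=False):
    """Input width seen by each stacked RNN layer (hypothetical helper)."""
    num_directions = 2 if bidirectional else 1
    # Layer 0 consumes the user-supplied features.
    sizes = [input_size]
    # Deeper layers consume the previous layer's (possibly bidirectional) output.
    sizes += [hidden_size * num_directions] * (num_layers - 1)
    return sizes
```

Whatever `num_layers` is, the first entry stays equal to `input_size`, which is the value the input tensor's feature dimension has to match.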