# RuntimeError: cuDNN error: CUDNN_STATUS_BAD_PARAM when calling cudnnSetRNNDescriptor_v8

- **ID:** `cuda/cudnn-rnn-hidden-size-mismatch`
- **Domain:** cuda
- **Category:** runtime_error
- **Error Code:** `CUDNN_STATUS_BAD_PARAM`
- **Verification:** ai_generated
- **Fix Rate:** 85%

## Root Cause

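`CUDNN_STATUS_BAD_PARAM` means cuDNN rejected an argument passed to `cudnnSetRNNDescriptor_v8`. This entry attributes it to one of two causes: the hidden size given to an RNN/LSTM/GRU layer is not a multiple of 32 or 64 (which one depends on the cuDNN version and RNN mode), violating the alignment that cuDNN's performance kernels expect, or the number of layers is zero.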
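Both conditions can be checked before the layer is built, so the failure surfaces as a readable message instead of a cuDNN status code. The following is a minimal sketch: `check_rnn_params` is a hypothetical helper, not part of PyTorch or cuDNN, and the default multiple of 64 is this entry's assumption rather than a documented cuDNN constant.

```python
from __future__ import annotations


def check_rnn_params(hidden_size: int, num_layers: int, multiple: int = 64) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged.

    `multiple` is the alignment assumed by this entry (64, or 32 on some
    cuDNN versions). Adjust it to match your setup.
    """
    problems = []
    if num_layers < 1:
        problems.append(f"num_layers={num_layers} must be at least 1")
    if hidden_size < 1:
        problems.append(f"hidden_size={hidden_size} must be at least 1")
    elif hidden_size % multiple != 0:
        problems.append(
            f"hidden_size={hidden_size} is not a multiple of {multiple}"
        )
    return problems
```

Calling `check_rnn_params(100, 2)` flags the hidden size, while `check_rnn_params(128, 2)` returns an empty list.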

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| cuDNN 8.9.0 | active | — | — |
| cuDNN 8.9.5 | active | — | — |
| PyTorch 2.1.0 | active | — | — |
| TensorFlow 2.14 | active | — | — |

## Workarounds

1. **Set the hidden size to a multiple of 64 (32 may be enough on some cuDNN versions). For example, raise `hidden_size=100` to `128`. In PyTorch: `nn.LSTM(input_size, hidden_size=128, num_layers=2)`. Verify with `hidden_size % 64 == 0`.** (90% success)
2. **If you must keep an arbitrary hidden size, set `torch.backends.cudnn.allow_tf32 = False` and `torch.backends.cudnn.deterministic = True` to steer cuDNN toward a fallback implementation that may not enforce the alignment, at a performance cost. The TF32 flag lives on `torch.backends.cudnn`; there is no `torch.backends.cudnn.rnn.allow_tf32` setting.** (70% success)
3. **Build the RNN with the hidden size rounded up to the next multiple of 64, pad the initial hidden state (and the cell state for an LSTM) along its last dimension with `torch.nn.functional.pad` so it matches, then slice the outputs back to the original size.** (80% success)
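The rounding and padding workarounds can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: `round_up_to_multiple`, `build_aligned_lstm`, and `forward_sliced` are hypothetical helper names, the multiple of 64 is taken from this entry, and `batch_first=True` is an arbitrary choice for the example.

```python
def round_up_to_multiple(n: int, multiple: int = 64) -> int:
    """Round n up to the nearest multiple (64 is this entry's assumed alignment)."""
    if n < 1 or multiple < 1:
        raise ValueError("n and multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


def build_aligned_lstm(input_size, hidden_size, num_layers=2, multiple=64):
    """Build an nn.LSTM whose hidden size is rounded up to the assumed alignment."""
    # Imported lazily so round_up_to_multiple stays usable without PyTorch.
    from torch import nn

    padded = round_up_to_multiple(hidden_size, multiple)
    return nn.LSTM(input_size, padded, num_layers, batch_first=True)


def forward_sliced(lstm, x, h0, c0, hidden_size):
    """Pad the initial states to the layer's hidden size, run it, slice back."""
    import torch.nn.functional as F

    # (0, k) pads only the last (feature) dimension, on the right, with zeros.
    pad = (0, lstm.hidden_size - hidden_size)
    out, (hn, cn) = lstm(x, (F.pad(h0, pad), F.pad(c0, pad)))
    return out[..., :hidden_size], (hn[..., :hidden_size], cn[..., :hidden_size])
```

With `hidden_size=100`, `build_aligned_lstm` creates a layer of width 128, and `forward_sliced` returns tensors whose last dimension is 100 again. The layer, `x`, `h0`, and `c0` must already be on the same device. Slicing discards the extra 28 units, so the result is not numerically identical to a true 100-unit LSTM.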

## Dead Ends

- **Setting `torch.backends.cudnn.enabled = False` to disable cuDNN** — Disabling cuDNN may fall back to a non-cuDNN RNN implementation that still validates hidden size; also significantly degrades performance. (70% fail)
- **Reducing the number of RNN layers arbitrarily** — The error is about hidden size alignment, not layer count; reducing layers only helps if num_layers was zero, which is rare. (90% fail)
- **Switching to a different RNN cell type (e.g., LSTM to GRU) without changing hidden size** — The alignment requirement applies to all cuDNN RNN cells; the error persists if hidden size is not a multiple of the alignment. (85% fail)
