# RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED when calling cudnnRNNBackwardData_v8 with training mode enabled and double backward

- **ID:** `cuda/cudnn-rnn-double-backward`
- **Domain:** cuda
- **Category:** runtime_error
- **Error Code:** `CUDNN_STATUS_NOT_SUPPORTED (5)`
- **Verification:** ai_generated
- **Fix Rate:** 78%

## Root Cause

cuDNN's RNN backward routines, in particular backward data, do not support double backward (differentiating through the backward pass itself) for certain RNN modes such as LSTM with projection, or when the input tensor requires grad and the graph is retained. cuDNN v8 restricts double backward support to specific configurations.
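
The pattern that typically exercises this path is a backward pass through a gradient, for example a gradient penalty. The sketch below is illustrative only: the helper `try_double_backward` and the tensor sizes are assumptions, and whether the second backward fails, and with which message, depends on the device, backend, and library versions in use.

```python
import torch
import torch.nn as nn


def try_double_backward(rnn: nn.Module, x: torch.Tensor) -> str:
    """Run a backward pass through a gradient and report what happened."""
    try:
        out, _ = rnn(x)
        # First-order gradient, kept in the graph so it can be differentiated.
        (grad_x,) = torch.autograd.grad(out.sum(), x, create_graph=True)
        # Second backward: differentiates through the RNN's backward pass.
        grad_x.pow(2).sum().backward()
        return "double backward succeeded"
    except RuntimeError as err:
        return f"double backward failed: {err}"


device = "cuda" if torch.cuda.is_available() else "cpu"
rnn = nn.LSTM(input_size=8, hidden_size=16).to(device)
x = torch.randn(5, 3, 8, device=device, requires_grad=True)
status = try_double_backward(rnn, x)
print(status)
```

Running the same helper against each candidate module is a quick way to check whether a given configuration is affected before changing the model.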

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| cuDNN 8.9.0 | active | — | — |
| cuDNN 8.9.5 | active | — | — |
| PyTorch 2.1.0 | active | — | — |
| PyTorch 2.2.0 | active | — | — |

## Workarounds

1. **Switch to a non-projected LSTM (drop `proj_size`) or use a GRU instead, which has broader double backward support.** (85% success)
   ```
   import torch.nn as nn

   input_size, hidden_size = 8, 16  # illustrative sizes
   # Before (projected; PyTorch requires proj_size < hidden_size):
   #   nn.LSTM(input_size, hidden_size, proj_size=4)
   lstm = nn.LSTM(input_size, hidden_size)  # after: no projection
   # Alternative: gru = nn.GRU(input_size, hidden_size)
   ```
2. **Compute first-order gradients with `torch.autograd.grad(..., create_graph=False)`, and where second-order gradients are needed, implement them manually in a `torch.autograd.Function` whose backward does not rely on the cuDNN RNN backward-data routine.** (75% success)
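
One way to avoid the cuDNN RNN backward-data routine is to write the recurrence with plain tensor operations, so that ordinary autograd handles both backward passes. The sketch below takes that simpler route rather than a hand-written `torch.autograd.Function`; `UnrolledLSTM` and `gradient_penalty` are hypothetical names introduced for illustration, and the layer is single-layer and unidirectional by assumption. Expect it to be slower than the fused kernels.

```python
import torch
import torch.nn as nn


class UnrolledLSTM(nn.Module):
    """Single-layer LSTM built from plain tensor ops (hypothetical helper).

    It never calls the fused RNN kernels, so every op in the graph has a
    differentiable backward and second-order gradients go through autograd.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.hidden_size = hidden_size
        # One projection each for the input and the recurrent state,
        # producing the four gates (input, forget, cell, output) at once.
        self.ih = nn.Linear(input_size, 4 * hidden_size)
        self.hh = nn.Linear(hidden_size, 4 * hidden_size)

    def forward(self, x):
        # x has shape (seq_len, batch, input_size).
        batch = x.size(1)
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        outputs = []
        for x_t in x.unbind(0):
            gates = self.ih(x_t) + self.hh(h)
            i, f, g, o = gates.chunk(4, dim=1)
            c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
            h = torch.sigmoid(o) * torch.tanh(c)
            outputs.append(h)
        return torch.stack(outputs), (h, c)


def gradient_penalty(model, x):
    """Squared norm of d(output)/d(input), kept differentiable."""
    out, _ = model(x)
    (grad_x,) = torch.autograd.grad(out.sum(), x, create_graph=True)
    return grad_x.pow(2).sum()


torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = UnrolledLSTM(input_size=8, hidden_size=16).to(device)
x = torch.randn(5, 3, 8, device=device, requires_grad=True)

penalty = gradient_penalty(model, x)
penalty.backward()  # second backward pass, through the first-order gradient
```

After `penalty.backward()`, the parameters of `model` carry gradients of the penalty term, which can be added to the main loss gradients in a training step.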

## Dead Ends

- **Upgrading cuDNN**: A newer cuDNN version does not add double backward support for all RNN modes; the limitation is architectural in cuDNN v8. (80% fail)
- **Disabling cuDNN globally**: Setting `torch.backends.cudnn.enabled = False` forces a fallback to the non-cuDNN RNN path, which may cause a performance regression or different numerical behavior; double backward still fails if a custom RNN implementation does not support it. (70% fail)
- **Using `retain_graph=True` alone**: Passing `retain_graph=True` without detaching intermediate activations does not prevent the error; the double backward path still triggers the unsupported cuDNN routine. (90% fail)
