# Internal error: cuDNN execution failed: CUDNN_STATUS_EXECUTION_FAILED

- **ID:** `tensorflow/cudnn-status-execution-failed`
- **Domain:** tensorflow
- **Category:** gpu_error
- **Error code:** `ECF`
- **Verification level:** ai_generated
- **Fix rate:** 75%

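## Root cause

cuDNN reported an execution failure while running a GPU operation. This is usually caused by incompatible tensor shapes or by corrupted GPU state.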
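
Because a shape mismatch is one of the likely causes, it can help to compare the shape a model expects with the shape of the batch it actually receives before training starts. The following is a minimal sketch in plain Python; `shape_mismatches` is a hypothetical helper, not a TensorFlow API, and it assumes that `None` in the expected shape stands for a dimension of any size (as the batch dimension usually does in Keras).

```python
def shape_mismatches(expected, actual):
    """Return the axis indices where `actual` disagrees with `expected`.

    `None` in `expected` is treated as a wildcard (assumption: it marks a
    dimension of any size, such as the batch dimension).
    """
    if len(expected) != len(actual):
        # Different rank: report every axis, since no axis can be aligned safely.
        return list(range(max(len(expected), len(actual))))
    return [
        axis
        for axis, (want, got) in enumerate(zip(expected, actual))
        if want is not None and want != got
    ]


# Hypothetical shapes: the model expects RGB images, the batch is grayscale.
print(shape_mismatches((None, 224, 224, 3), (32, 224, 224, 1)))
```

With a single-input Keras model, `expected` could plausibly be taken from `model.input_shape` and `actual` from the shape of one input batch.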

## Version compatibility

| Version | Status | Introduced | Deprecated |
|------|------|------|------|
| tensorflow 2.10.0 | active | — | — |
| cudnn 8.4.1 | active | — | — |
| cuda 11.7 | active | — | — |
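
To compare a local environment against the versions in the table, the installed TensorFlow build can be queried. This is a sketch, assuming `tf.sysconfig.get_build_info()` is available (recent TensorFlow 2.x releases) and that it exposes `cuda_version` and `cudnn_version` keys on GPU builds. These are the versions TensorFlow was built against, which may differ from the libraries installed on the machine. `installed_versions` is a hypothetical helper name.

```python
def installed_versions():
    """Return TensorFlow's version and the CUDA/cuDNN versions it was built against.

    Returns an empty dict if TensorFlow is not installed. The CUDA and cuDNN
    entries are None on builds that do not report them (for example CPU-only).
    """
    try:
        import tensorflow as tf
    except ImportError:
        return {}
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "cuda": build.get("cuda_version"),
        "cudnn": build.get("cudnn_version"),
    }


print(installed_versions())
```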

## Solutions

1. Reduce the batch size to lower memory pressure:
   ```
   model.fit(..., batch_size=16)
   ```
2. Set `TF_GPU_ALLOCATOR=cuda_malloc_async` to use the asynchronous allocator:
   ```
   export TF_GPU_ALLOCATOR=cuda_malloc_async
   ```
3. Clear GPU memory and reset the Keras session:
   ```
   tf.keras.backend.clear_session()
   ```
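
The first two fixes can be combined in a training script. The sketch below sets the allocator variable from Python (on the assumption that this happens before TensorFlow is imported) and retries training with a halved batch size whenever the error message contains `CUDNN_STATUS_EXECUTION_FAILED`. `fit_with_backoff` and `train_fn` are hypothetical names, not TensorFlow APIs, and the sketch assumes the process is still usable after the failure. If the GPU state is corrupted, a restart may be needed instead.

```python
import os

# Assumption: this runs before `import tensorflow`, so the allocator setting
# is already in the environment when TensorFlow initializes the GPU.
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"


def fit_with_backoff(train_fn, batch_size, min_batch_size=1):
    """Call train_fn(batch_size), halving batch_size after each cuDNN failure.

    Returns (result, batch_size_used). Any other error, or a cuDNN failure at
    min_batch_size, is re-raised unchanged.
    """
    while True:
        try:
            return train_fn(batch_size), batch_size
        except Exception as err:
            is_cudnn_failure = "CUDNN_STATUS_EXECUTION_FAILED" in str(err)
            if not is_cudnn_failure or batch_size <= min_batch_size:
                raise
            batch_size = max(min_batch_size, batch_size // 2)


# Stand-in for real training: pretend batches larger than 16 trigger the error.
def fake_train(batch_size):
    if batch_size > 16:
        raise RuntimeError("cuDNN launch failure: CUDNN_STATUS_EXECUTION_FAILED")
    return "trained"


print(fit_with_backoff(fake_train, batch_size=64))
```

In a real script, `train_fn` would wrap something like `model.fit(x, y, batch_size=batch_size)`, where `model`, `x`, and `y` are whatever the project already defines.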

## Failed attempts

- **Increasing the batch size**: tried on the assumption that more data per step helps, but it often makes the shape mismatch worse. (60% failure rate)
- **Restarting the kernel**: may fix transient GPU state but does not address the underlying shape issue. (30% failure rate)
