# CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate_v2

- **ID:** `cuda/cublas-api-error-on-shutdown`
- **Domain:** cuda
- **Category:** runtime_error
- **Error code:** `CUBLAS_STATUS_ALLOC_FAILED`
- **Verification level:** ai_generated
- **Fix rate:** 75%

## Root cause

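cuBLAS handle allocation failed, usually because GPU memory is exhausted or the driver state is corrupted. It is typically triggered by rapidly creating and destroying contexts, or by an earlier CUDA error that left the device in an inconsistent state.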
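To tell the memory-exhaustion cause apart from the corrupted-state cause, it can help to look at free device memory before the first cuBLAS call. The following is a minimal sketch, assuming PyTorch is the framework in use; `MIN_FREE_MB`, `has_headroom`, and `check_device` are hypothetical names introduced for illustration, and the 512 MB threshold is an arbitrary assumption rather than a documented cuBLAS requirement.

```python
try:
    import torch  # optional: only needed for the live device query
except ImportError:
    torch = None

MIN_FREE_MB = 512  # hypothetical threshold; tune it for your workload


def has_headroom(free_bytes: int, min_free_mb: int = MIN_FREE_MB) -> bool:
    """Return True if at least min_free_mb of device memory is free."""
    return free_bytes >= min_free_mb * 1024 * 1024


def check_device(device: int = 0):
    """Return (free_mb, total_mb, ok) for the device, or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None
    # mem_get_info reports (free, total) in bytes as seen by the driver,
    # so memory held by other processes is already accounted for.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device)
    mb = 1024 * 1024
    return free_bytes // mb, total_bytes // mb, has_headroom(free_bytes)
```

If `check_device()` reports ample free memory and handle creation still fails, the corrupted-state explanation becomes more likely than plain memory exhaustion. The query itself initializes a CUDA context, so it may also fail on a device that is already in a bad state.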

## Version compatibility

| Version | Status | Introduced | Deprecated |
|------|------|------|------|
| CUDA 11.8 | active | — | — |
| CUDA 12.1 | active | — | — |
| cuBLAS 11.11 | active | — | — |
| cuBLAS 12.0 | active | — | — |

## Solutions

1. Release PyTorch's cached GPU memory with `torch.cuda.empty_cache()` before creating new cuBLAS handles, then reinitialize the model in a fresh context. Note that `torch.cuda.reset_peak_memory_stats()` only resets the allocator's peak counters; it neither frees memory nor resets the device.
2. Use `nvidia-smi` to identify every process that is using the GPU, terminate those processes, and restart the application. For persistent issues, reboot the machine to fully reset the GPU driver state.
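The two steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive recovery procedure: `release_cached_memory`, `parse_compute_apps`, and `list_gpu_processes` are hypothetical helper names, PyTorch is assumed to be the framework, and `nvidia-smi` is assumed to be on `PATH`. The sketch only lists GPU processes; deciding which ones to terminate is left to the operator.

```python
import gc
import subprocess
from typing import List, Tuple

try:
    import torch  # optional: only needed for the cache release
except ImportError:
    torch = None


def release_cached_memory() -> bool:
    """Collect unreachable objects, then return cached blocks to the driver."""
    gc.collect()
    if torch is None or not torch.cuda.is_available():
        return False
    # empty_cache frees only unused cached blocks; live tensors stay allocated.
    torch.cuda.empty_cache()
    return True


def parse_compute_apps(csv_text: str) -> List[Tuple[int, str]]:
    """Parse 'pid, process_name' CSV rows into (pid, name) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if "," not in line:
            continue  # skip blank or malformed rows
        pid, name = line.split(",", 1)
        if not pid.strip().isdigit():
            continue  # skip rows without a numeric PID
        rows.append((int(pid.strip()), name.strip()))
    return rows


def list_gpu_processes() -> List[Tuple[int, str]]:
    """Ask nvidia-smi which compute processes currently hold the GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return parse_compute_apps(result.stdout)
```

A typical use would be to call `release_cached_memory()` first and retry. If the error persists, inspect `list_gpu_processes()` for stale workers before terminating them and restarting the application.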

## Ineffective attempts

- **** — The previous CUDA context may still be alive, and its residual allocations prevent new handle creation; a full GPU reset or process kill is needed. (80% failure rate)
- **** — The error concerns handle allocation, not insufficient memory for tensors; larger batch sizes only add memory pressure. (90% failure rate)
- **** — The cause is often runtime state corruption, not a missing library; a driver version mismatch can cause other errors, but this specific error persists. (70% failure rate)