CUBLAS_STATUS_ALLOC_FAILED cuda runtime_error ai_generated partial

CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate_v2

ID: cuda/cublas-api-error-on-shutdown

75% fix rate

85% confidence

1 evidence item

First seen 2023-03-15

Version compatibility

Version	Status	Introduced	Deprecated	Notes
CUDA 11.8	active	—	—	—
CUDA 12.1	active	—	—	—
cuBLAS 11.11	active	—	—	—
cuBLAS 12.0	active	—	—	—

Root cause analysis

cuBLAS handle allocation fails, typically because of insufficient GPU memory or corrupted driver state. It is often triggered by rapid context creation and destruction, or by a previous CUDA error that left the device in an inconsistent state.

generic
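
Since the root cause points at GPU memory pressure, it can help to check how much device memory is free before the first cuBLAS call. The following is a minimal sketch, assuming PyTorch is the framework in use; `gpu_memory_headroom` and `device_index` are hypothetical names introduced for illustration. If the device is already in a corrupted state, the query itself may raise a CUDA error.

```python
def gpu_memory_headroom(device_index=0):
    """Return (free_mib, total_mib) for a CUDA device, or None if it cannot be queried.

    Hypothetical helper for illustration; it is not part of PyTorch.
    """
    try:
        import torch  # imported lazily so the sketch also loads without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    # mem_get_info reports free and total device memory in bytes.
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    mib = 1024 * 1024
    return free_bytes // mib, total_bytes // mib


if __name__ == "__main__":
    print(gpu_memory_headroom())
```

A very small free value right before the failure suggests memory exhaustion rather than driver state corruption.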

Official documentation

https://docs.nvidia.com/cuda/cublas/index.html#cublascreate

Solutions

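Release cached GPU memory before creating new cuBLAS handles: delete unused tensors, then call `torch.cuda.empty_cache()` to return PyTorch's cached blocks to the driver. Note that `torch.cuda.reset_peak_memory_stats()` only resets the peak-memory counters; it neither frees memory nor resets the device. If the error persists, reinitialize the model in a fresh process so that it gets a new CUDA context.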
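
One way to apply this is to release cached memory and retry the failing call once. The sketch below assumes the error surfaces as a Python `RuntimeError` whose message contains `CUBLAS_STATUS_ALLOC_FAILED`, as PyTorch typically reports it; `release_cached_gpu_memory`, `run_with_cache_release`, and `max_attempts` are hypothetical names, not part of any library.

```python
import gc


def release_cached_gpu_memory():
    """Collect unreachable Python objects, then return PyTorch's cached blocks to the driver.

    Returns True if the CUDA cache was emptied, False if PyTorch or CUDA is unavailable.
    """
    gc.collect()
    try:
        import torch  # imported lazily so the sketch also loads without PyTorch
    except ImportError:
        return False
    if not torch.cuda.is_available():
        return False
    torch.cuda.empty_cache()
    return True


def run_with_cache_release(fn, max_attempts=2):
    """Call fn(); on CUBLAS_STATUS_ALLOC_FAILED, release cached memory and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RuntimeError as err:
            # Handle only the cuBLAS allocation failure; re-raise anything else,
            # and re-raise the failure itself once the attempts are used up.
            if "CUBLAS_STATUS_ALLOC_FAILED" not in str(err) or attempt == max_attempts:
                raise
            release_cached_gpu_memory()
```

A call such as `run_with_cache_release(lambda: model(batch))` would wrap a forward pass, where `model` and `batch` stand in for the caller's own objects. If the retry still fails, the device state is probably corrupted and a fresh process is the safer option.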

Use `nvidia-smi` to identify every process that is holding the GPU, terminate those processes (for example with `kill <PID>`), and restart the application. For persistent issues, reboot the machine to fully reset the GPU driver state.
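
Finding the processes that hold the GPU can be scripted. This is a rough sketch, assuming `nvidia-smi` is on the `PATH` and supports the `--query-compute-apps` CSV query; `parse_compute_apps` and `list_gpu_processes` are hypothetical helper names. It only lists processes and leaves the decision to terminate them to the operator.

```python
import shutil
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, name, used_memory' rows into (pid, name, used_mib or None) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if line.count(",") < 2:
            continue  # skip blank or malformed lines
        # Split from both ends so a comma inside the process name is tolerated.
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        pid, name, mem = pid.strip(), name.strip(), mem.strip()
        if not pid.isdigit():
            continue
        # Memory may be reported as a non-numeric placeholder on some platforms.
        rows.append((int(pid), name, int(mem) if mem.isdigit() else None))
    return rows


def list_gpu_processes():
    """Return compute processes on the GPU, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    return parse_compute_apps(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    for pid, name, used_mib in list_gpu_processes():
        print(pid, name, used_mib)
```

Printing the list before killing anything makes it easier to avoid terminating unrelated jobs on a shared machine.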

Ineffective attempts

Common but ineffective approaches:

80% failure rate
The previous CUDA context may still be alive, and residual allocations prevent new handle creation; a full GPU reset or process kill is needed.
90% failure rate
The error is not about insufficient memory for tensors but about handle allocation; larger batch sizes exacerbate memory pressure.
70% failure rate
The issue is often runtime state corruption, not a missing library; driver version mismatch can cause other errors, but this specific error persists.