cudaErrorStreamCaptureInvalidated
cuda
runtime_error
ai_generated
true
RuntimeError: CUDA error: operation not permitted when stream is capturing (streamCaptureInvalidated)
ID: cuda/stream-capture-invalid-scope
Fix rate: 81%
Confidence: 87%
Evidence count: 1
First seen: 2024-09-05
Version compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| CUDA 12.0 | active | — | — | — |
| PyTorch 2.1.0 | active | — | — | — |
| NVIDIA Driver 535.129.03 | active | — | — | — |
Root cause analysis
A CUDA graph capture is in progress on a stream, and an operation that is not permitted during capture was attempted (for example, a memory allocation or a host-side synchronization such as cudaStreamSynchronize or cudaDeviceSynchronize). The illegal call invalidates the capture.
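To make the failure mode concrete, the sketch below guards a host-side synchronization so that it is skipped while the current stream is capturing. It relies on PyTorch's `torch.cuda.is_current_stream_capturing()`. The helper names `capture_in_progress` and `sync_if_safe` are hypothetical, and the optional import is an assumption made only so the snippet loads on machines without PyTorch.

```python
try:
    import torch
except ImportError:  # assumption: let the sketch load without PyTorch installed
    torch = None


def capture_in_progress() -> bool:
    """Return True if the current CUDA stream is recording a CUDA graph."""
    if torch is None or not torch.cuda.is_available():
        return False  # no usable CUDA runtime, so nothing can be capturing
    return torch.cuda.is_current_stream_capturing()


def sync_if_safe() -> bool:
    """Synchronize with the device unless a capture is in progress.

    Returns True if a synchronization was issued, False if it was skipped.
    """
    if capture_in_progress():
        # A host-side sync here would invalidate the capture, so skip it.
        return False
    if torch is not None and torch.cuda.is_available():
        torch.cuda.synchronize()
        return True
    return False
```

A guard like this is mainly useful in utility code (logging, timing, debugging hooks) that may be called both inside and outside a capture region.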
Official documentation
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html
Solutions
- Move all memory allocations and host-device synchronization outside the capture scope. Example: pre-allocate tensors before calling torch.cuda.CUDAGraph.capture_begin() (or entering the torch.cuda.graph context manager), and call torch.cuda.synchronize() only after capture ends.
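- If the offending call cannot be moved, consider a less restrictive capture mode. cudaStreamCaptureModeGlobal is the strictest mode passed to cudaStreamBeginCapture; cudaStreamCaptureModeThreadLocal and cudaStreamCaptureModeRelaxed permit progressively more potentially unsafe API calls. No mode allows synchronizing the capturing stream itself, so ensure no host-side blocking calls occur during capture. In PyTorch, the mode is exposed through the capture_error_mode argument of torch.cuda.graph in releases that support it; defer anything that forces a device-to-host sync (such as printing a CUDA tensor or calling .item()) until after capture ends.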
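The first solution can be sketched in PyTorch as follows. This is a minimal illustration, not a definitive implementation: `make_graphed` and its parameters are hypothetical names, the optional import is an assumption so the snippet loads without PyTorch, and the function simply returns the original callable when CUDA is unavailable. The side-stream warm-up follows the pattern recommended in the PyTorch CUDA graphs notes.

```python
try:
    import torch
except ImportError:  # assumption: let the sketch load without PyTorch installed
    torch = None


def make_graphed(fn, example_input, warmup_iters=3):
    """Capture fn(example_input) into a CUDA graph and return a replay callable.

    Falls back to returning fn unchanged when CUDA is unavailable.
    """
    if torch is None or not torch.cuda.is_available():
        return fn

    # 1. Allocate the input buffer the graph will read from BEFORE capture.
    static_input = example_input.clone()

    # 2. Warm up on a side stream so lazy initialization (for example,
    #    library handle creation) happens outside the capture.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(warmup_iters):
            fn(static_input)
    torch.cuda.current_stream().wait_stream(side)

    # 3. Capture: only device work belongs here. No .item(), .cpu(),
    #    torch.cuda.synchronize(), or printing of CUDA tensors.
    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        static_output = fn(static_input)

    # 4. Host-side synchronization is safe again once capture has ended.
    torch.cuda.synchronize()

    def replay(new_input):
        static_input.copy_(new_input)  # refill the captured buffer in place
        graph.replay()
        return static_output  # overwritten by every replay

    return replay
```

On a CUDA machine, `step = make_graphed(model, batch)` followed by `out = step(next_batch)` replays the captured work. Because `out` aliases a static buffer, clone it if the values must survive the next replay.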
Ineffective attempts
Common but ineffective approaches:
- 92% failure rate: This disables cuDNN heuristics but does not fix the capture violation; the error will recur if capture is attempted again.
- 98% failure rate: Thread configuration is unrelated to capture validity; the error concerns which operations are allowed during capture.