cudaErrorStreamCaptureInvalidated cuda runtime_error ai_generated true

RuntimeError: CUDA error: operation not permitted when stream is capturing (streamCaptureInvalidated)

ID: cuda/stream-capture-invalid-scope

Fix rate: 81%
Confidence: 87%
Evidence count: 1
First seen: 2024-09-05

Version compatibility

Version                     Status  Introduced  Deprecated  Notes
CUDA 12.0                   active
PyTorch 2.1.0               active
NVIDIA Driver 535.129.03    active

Root cause analysis

A CUDA graph capture is in progress on a stream, but an operation that is invalid during capture (for example, a memory allocation or a host-side synchronization) was attempted, which invalidates the capture.
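As an illustration of the kind of host-side operation involved, the sketch below skips a device-to-host read while the current stream is capturing. It assumes PyTorch; torch.cuda.is_current_stream_capturing() is an existing PyTorch call, while item_if_not_capturing is a hypothetical helper name used only for this example.

```python
import torch


def item_if_not_capturing(value: torch.Tensor):
    """Return value.item(), or None if the current CUDA stream is capturing.

    Hypothetical helper: .item() copies to the host and synchronizes,
    which is the kind of operation that is not permitted during capture.
    """
    if value.is_cuda and torch.cuda.is_current_stream_capturing():
        return None  # skip the host-side read while a capture is in progress
    return value.item()
```

Logging or metric code that may run inside a captured region can call such a helper instead of reading the tensor directly, then read the value after capture has ended.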

generic

Official documentation

https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__STREAM.html

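Solutions

  1. Move all memory allocations and host-device synchronization outside the capture scope. For example, pre-allocate tensors before calling torch.cuda.CUDAGraph.capture_begin() (or before entering the torch.cuda.graph() context manager), and call torch.cuda.synchronize() only after capture_end() has returned.
  2. If an allocation or other potentially unsafe API call cannot be moved, start the capture with cudaStreamBeginCapture in cudaStreamCaptureModeRelaxed rather than cudaStreamCaptureModeGlobal. Global is the most restrictive mode; relaxed mode does not prohibit such calls from the capturing thread. Synchronizing a capturing stream remains invalid in every mode, so make sure no host-side blocking call targets it during capture. In PyTorch, torch.cuda.graph() exposes the mode through its capture_error_mode argument; keep anything that forces a device-to-host sync, such as printing a CUDA tensor, outside the capture.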
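The first solution can be sketched in PyTorch roughly as follows. This is a minimal example under assumptions, not a definitive implementation: build_graphed_step is a hypothetical helper name, the model is assumed to run with fixed input shapes, and the side-stream warm-up follows the pattern commonly recommended for CUDA graphs in PyTorch.

```python
import torch


def build_graphed_step(model, example_input, warmup_iters=3):
    """Capture model(static_input) in a CUDA graph and return a replay function.

    Hypothetical helper for illustration; assumes fixed input shapes.
    """
    # Allocate the static input buffer BEFORE capture begins.
    static_input = example_input.clone()

    # Warm up on a side stream so lazy initialization (library handles,
    # allocator growth) happens outside the capture scope.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side), torch.no_grad():
        for _ in range(warmup_iters):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side)

    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph), torch.no_grad():
        # GPU work only: no .item(), .cpu(), print(tensor), or synchronize().
        static_output = model(static_input)

    def replay(new_input):
        static_input.copy_(new_input)  # refill the captured buffer in place
        graph.replay()
        return static_output

    return replay


if torch.cuda.is_available():
    model = torch.nn.Linear(8, 4).cuda().eval()
    step = build_graphed_step(model, torch.randn(2, 8, device="cuda"))
    out = step(torch.randn(2, 8, device="cuda"))
    torch.cuda.synchronize()  # host-side sync only after capture has ended
```

The returned tensor is the graph's static output buffer, so it is overwritten by the next replay; clone it if the result must outlive the next call.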

Ineffective attempts

Common but ineffective approaches:

  1. 92% failure rate

    This disables cuDNN heuristics but does not fix the capture violation; the error will reoccur if capture is attempted again.

  2. 98% failure rate

    Thread configuration is unrelated to capture validity; the error is about operations allowed during capture.