cudaErrorMpsServerNotReady cuda system_error ai_generated true

CUDA error: MPS server is not running (cudaErrorMpsServerNotReady)

ID: cuda/cuda-mps-server-unavailable

Fix rate: 88%
Confidence: 82%
Evidence count: 1
First seen: 2024-06-10

Version compatibility

Version                    Status    Introduced    Deprecated    Notes
CUDA 12.4                  active    -             -             -
NVIDIA Driver 550.54.14    active    -             -             -
Ubuntu 22.04               active    -             -             -

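Root cause analysis

The CUDA Multi-Process Service (MPS) control daemon is not running, but the application tried to connect to it through the pipe directory named by CUDA_MPS_PIPE_DIRECTORY or a similar MPS environment variable. NVIDIA's CUDA Runtime API reference defines cudaErrorMpsServerNotReady more narrowly, as the MPS server not being ready to accept new client requests (for example while it recovers from a fatal failure), so the same code may also appear when the daemon is up but the server is still recovering.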

generic
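A quick way to see which pipe directory a client would use, and whether anything is listening there, might look like the sketch below. The function names are hypothetical, and the check assumes the control daemon exposes its endpoint as a file named `control` inside the pipe directory; `/tmp/nvidia-mps` is used as the fallback when the variable is unset.

```shell
#!/bin/sh
# Hypothetical diagnostic helpers; names are illustrative, not part of any NVIDIA tool.

# Print the pipe directory a client would use:
# CUDA_MPS_PIPE_DIRECTORY if set, otherwise /tmp/nvidia-mps.
mps_pipe_dir() {
    printf '%s\n' "${CUDA_MPS_PIPE_DIRECTORY:-/tmp/nvidia-mps}"
}

# Succeed if a file named "control" exists in the pipe directory.
# Assumption: the daemon creates its endpoint under that name.
mps_control_present() {
    [ -e "$(mps_pipe_dir)/control" ]
}

if mps_control_present; then
    echo "MPS control endpoint found in $(mps_pipe_dir)"
else
    echo "No MPS control endpoint in $(mps_pipe_dir); the daemon is probably not running"
fi
```

A leftover endpoint from a crashed daemon would also pass this check, so a positive result is a hint rather than proof that the daemon is alive.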

Official documentation

https://docs.nvidia.com/deploy/mps/index.html

Solutions

  1. Start the MPS control daemon: export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps; nvidia-cuda-mps-control -d
  2. Or run without MPS: unset CUDA_MPS_PIPE_DIRECTORY; unset CUDA_MPS_LOG_DIRECTORY; python your_script.py
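The second solution clears the variables for the whole shell session. If only one command should bypass MPS, a scoped variant could be sketched as follows; `run_without_mps` is a hypothetical helper name, and it relies on `env -u`, which removes a variable from the child's environment only.

```shell
#!/bin/sh
# Hypothetical helper: run one command with the MPS variables removed,
# leaving the calling shell's environment untouched.
run_without_mps() {
    env -u CUDA_MPS_PIPE_DIRECTORY -u CUDA_MPS_LOG_DIRECTORY "$@"
}

# Example usage (your_script.py is the placeholder name from the solution above):
#   run_without_mps python your_script.py
```

This only avoids MPS when no daemon is serving the default pipe directory, because a client with the variable unset still looks there.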

Ineffective attempts

Common but ineffective approaches:

  1. 100% failure rate

    The daemon must be started explicitly; restarting the app alone does not launch it.

  2. 95% failure rate

    The environment variable only points to the daemon's socket; if the daemon is not running, changing the path does not help.