Tags: cudaErrorMpsServerNotReady, cuda, system_error (ai_generated: true)

CUDA error: MPS server is not running (cudaErrorMpsServerNotReady)

ID: cuda/cuda-mps-server-unavailable

Fix Rate: 88%
Confidence: 82%
Evidence: 1
First Seen: 2024-06-10

Version Compatibility

Version                    Status   Introduced   Deprecated   Notes
CUDA 12.4                  active
NVIDIA Driver 550.54.14    active
Ubuntu 22.04               active

Root Cause

The CUDA Multi-Process Service (MPS) control daemon is not running, but the application tried to connect to it through the pipe directory named by CUDA_MPS_PIPE_DIRECTORY or a similar environment variable.
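
One way to confirm this diagnosis is to check whether the pipe directory the client would use holds a daemon endpoint at all. The sketch below is a heuristic rather than an official check. It assumes the default pipe directory /tmp/nvidia-mps when CUDA_MPS_PIPE_DIRECTORY is unset, and it assumes that a running control daemon leaves an entry named `control` in that directory. The function names are hypothetical.

```python
import os

DEFAULT_PIPE_DIR = "/tmp/nvidia-mps"  # assumed default when the variable is unset


def mps_pipe_dir(env=None):
    """Return the pipe directory a CUDA client would look in."""
    env = os.environ if env is None else env
    return env.get("CUDA_MPS_PIPE_DIRECTORY", DEFAULT_PIPE_DIR)


def mps_daemon_looks_running(env=None):
    """Heuristic: True if the pipe directory contains a 'control' entry.

    Assumption: the control daemon creates an entry named 'control' in its
    pipe directory. A stale entry left behind by a crashed daemon would give
    a false positive, so treat the result as a hint only.
    """
    return os.path.exists(os.path.join(mps_pipe_dir(env), "control"))
```

Calling `mps_daemon_looks_running()` with no arguments inspects the current process environment. A `False` result suggests the daemon was never started for that directory.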

Official Documentation

https://docs.nvidia.com/deploy/mps/index.html

Workarounds

  1. 90% success

    Start the MPS control daemon explicitly:
    export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps; nvidia-cuda-mps-control -d

  2. 95% success

    If MPS is not needed, clear the MPS environment variables and run the application without it:
    unset CUDA_MPS_PIPE_DIRECTORY; unset CUDA_MPS_LOG_DIRECTORY; python your_script.py
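
When the application is started from a Python launcher instead of a shell, the effect of the `unset` commands can be approximated by giving the child process a filtered environment. This is a minimal sketch under that assumption. `env_without_mps` and `run_without_mps` are hypothetical helper names, and `your_script.py` stands in for the real entry point.

```python
import os
import subprocess
import sys

# The two variables that the unset-based workaround clears.
MPS_VARS = ("CUDA_MPS_PIPE_DIRECTORY", "CUDA_MPS_LOG_DIRECTORY")


def env_without_mps(env=None):
    """Return a copy of the environment with the MPS variables removed."""
    env = os.environ if env is None else env
    return {k: v for k, v in env.items() if k not in MPS_VARS}


def run_without_mps(script, env=None):
    """Run a Python script in a child process that has no MPS variables set."""
    return subprocess.run(
        [sys.executable, script],
        env=env_without_mps(env),
        check=False,  # the caller inspects returncode instead of catching an exception
    )
```

For example, `run_without_mps("your_script.py").returncode` gives the script's exit status. The parent process keeps its own environment unchanged, so other jobs that do rely on MPS are not affected.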

Dead Ends

Common approaches that don't work:

  1. 100% fail

    The daemon must be started explicitly; restarting the application alone does not launch it.

  2. 95% fail

    CUDA_MPS_PIPE_DIRECTORY only points to the daemon's socket; if the daemon is not running, changing the path does not help.
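
Because the daemon has to be started explicitly, a launcher can do so before it starts the application. The sketch below wraps the same `nvidia-cuda-mps-control -d` command listed under Workarounds. `start_mps_daemon` is a hypothetical helper, and the `runner` parameter is an illustrative addition so the call can be inspected on a machine without a GPU.

```python
import os
import subprocess


def start_mps_daemon(pipe_dir="/tmp/nvidia-mps", runner=subprocess.run):
    """Start the MPS control daemon with the given pipe directory.

    `runner` defaults to subprocess.run and is injectable for dry runs.
    """
    # Export the pipe directory for the daemon, as the shell workaround does.
    env = dict(os.environ, CUDA_MPS_PIPE_DIRECTORY=pipe_dir)
    # -d starts the control daemon in the background.
    return runner(["nvidia-cuda-mps-control", "-d"], env=env, check=True)
```

With the default runner, a non-zero exit status raises `subprocess.CalledProcessError`, and a missing binary raises `FileNotFoundError`, so a failed start is not silently ignored. Clients must then use the same CUDA_MPS_PIPE_DIRECTORY value to reach the daemon.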