cicd system_error ai_generated true

Error: The self-hosted runner encountered a disk space issue: No space left on device

ID: cicd/github-actions-self-hosted-runner-disk-full

Fix Rate: 88%
Confidence: 85%
Evidence: 1
First Seen: 2024-01-15

Version Compatibility

Version                          Status   Introduced   Deprecated   Notes
GitHub Actions Runner v2.311.0   active   -            -            -
Ubuntu 22.04 LTS                 active   -            -            -

Root Cause

The self-hosted GitHub Actions runner has exhausted its disk space because of accumulated build artifacts, cached Docker images, or log files.

generic
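
To confirm which of these causes applies, it helps to measure before deleting anything. The following is a minimal sketch, not an official runner tool: the function names and the `WORK_DIR` default (`/home/runner/_work`, taken from the workaround commands in this entry) are assumptions and should be adjusted to the actual runner layout.

```shell
#!/bin/sh
# Sketch: report where disk space is going on a self-hosted runner.
# WORK_DIR is an assumption; point it at the runner's real _work directory.
WORK_DIR="${WORK_DIR:-/home/runner/_work}"

# Show usage of the filesystem that holds the given path (falls back to /).
report_filesystem() {
    df -h "$1" 2>/dev/null || df -h /
}

# List the largest entries directly under a directory, biggest first.
# Sizes are in kilobytes (du -k); $2 limits the number of rows (default 10).
largest_entries() {
    dir="$1"
    count="${2:-10}"
    [ -d "$dir" ] || { echo "skip: $dir not found"; return 0; }
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Print Docker's own breakdown (images, containers, volumes, build cache)
# when the docker CLI is available.
report_docker() {
    if command -v docker >/dev/null 2>&1; then
        docker system df
    else
        echo "skip: docker not installed"
    fi
}

# Run the full report only when invoked with "report" as the first argument.
if [ "${1:-}" = "report" ]; then
    report_filesystem "$WORK_DIR"
    largest_entries "$WORK_DIR"
    report_docker
fi
```

Saved under a hypothetical name such as `disk-report.sh`, it can be run as `sh disk-report.sh report`. If the work directory dominates, workspace cleanup is the likely fix; if `docker system df` shows large reclaimable space, pruning Docker data is.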

Official Documentation

https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners

Workarounds

  1. 90% success: Add a nightly cleanup cron job on the runner to remove unused Docker images and old workspace data: '0 0 * * * docker system prune -af --volumes && rm -rf /home/runner/_work/* && df -h'. Schedule it for a time when no jobs are running, because it deletes the workspace of any job still in progress.
  2. 85% success: Add a step at the start of each workflow job that checks disk usage and fails early with a message: 'df -h / | tail -1 | awk "{print \$5}" | sed "s/%//" | xargs -I {} sh -c "if [ {} -gt 85 ]; then echo \"Disk usage {}% exceeds 85%\"; exit 1; fi"'
  3. 95% success: Use ephemeral, auto-scaled runners that are destroyed after each job, so nothing accumulates on disk.
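
The disk check in workaround 2 can also be written as a small script, which is easier to read and reuse than the one-line pipeline. This is a sketch under stated assumptions: the function names are hypothetical, and the 85% default for `THRESHOLD` simply mirrors the value used above.

```shell
#!/bin/sh
# Sketch: fail early when disk usage is above a threshold.
# THRESHOLD (percent) is an assumption; tune it per runner.
THRESHOLD="${THRESHOLD:-85}"

# Print the usage percentage (digits only) of the filesystem holding path $1.
# df -P keeps each filesystem on one line; column 5 is the capacity, e.g. "42%".
disk_usage_pct() {
    df -P "$1" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }'
}

# Return 0 when usage ($1) is at or below the limit ($2), 1 otherwise.
within_limit() {
    [ "$1" -le "$2" ]
}

# Check the filesystem holding path $1 (default /) against THRESHOLD.
# Prints a message either way and returns 1 when the limit is exceeded.
check_disk() {
    path="${1:-/}"
    pct=$(disk_usage_pct "$path")
    if within_limit "$pct" "$THRESHOLD"; then
        echo "Disk usage ${pct}% on ${path} is within ${THRESHOLD}%"
    else
        echo "Disk usage ${pct}% on ${path} exceeds ${THRESHOLD}%" >&2
        return 1
    fi
}
```

With these definitions in place, the first step of a job can call `check_disk / || exit 1` so the job stops before a build fills the disk. Passing a different path, for example the runner's work directory, checks the filesystem that actually holds the workspaces.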

Dead Ends

Common approaches that don't work:

  1. 70% fail: Increasing the runner's disk size through the cloud provider without cleaning up existing files only delays the issue and increases cost.
  2. 60% fail: Manually deleting random files in /tmp may break running workflows or cause permission errors.
  3. 75% fail: Adding more runners without addressing disk cleanup multiplies the problem across the fleet.