# Startup probe failed: HTTP probe failed with statuscode: 503

- **ID:** `cloud/gcp-cloud-run-container-startup-probe-failure`
- **Domain:** cloud
- **Category:** runtime_error
- **Error Code:** `503`
- **Verification:** ai_generated
- **Fix Rate:** 90%

## Root Cause

Cloud Run's startup probe received a non-2xx response (503) from the container's health endpoint and ran out of retries before the application became ready. This usually means the application takes longer to initialize than the window allowed by the probe's `initialDelaySeconds`, `periodSeconds`, and `failureThreshold` settings.
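
To make the timing argument concrete, the sketch below estimates how long a container has to pass its startup probe. The formula is a common approximation (it ignores the per-request timeout), and the helper names `startup_budget_seconds` and `min_failure_threshold` are hypothetical, not part of any Cloud Run API. The sample values match the settings used in workaround 1.

```python
import math


def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate seconds before the probe gives up (assumed model, timeout ignored)."""
    return initial_delay + period * failure_threshold


def min_failure_threshold(startup_seconds: float, initial_delay: int, period: int) -> int:
    """Smallest failure threshold whose budget covers a measured startup time."""
    remaining = max(0.0, startup_seconds - initial_delay)
    return max(1, math.ceil(remaining / period))


# Settings from workaround 1: 60 s delay, 10 s period, 6 allowed failures.
print(startup_budget_seconds(initial_delay=60, period=10, failure_threshold=6))  # 120

# A service measured at 95 s of startup, probed every 10 s with no initial delay.
print(min_failure_threshold(startup_seconds=95, initial_delay=0, period=10))  # 10
```

If the measured startup time exceeds the first number, the probe will likely fail no matter how healthy the application is once it finishes loading.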

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Cloud Run (managed) 2024 | active | — | — |
| Knative Serving 1.12 | active | — | — |
| gcloud CLI 462 | active | — | — |

## Workarounds

1. **Increase the startup probe's `initialDelaySeconds` so the probe window covers the application's startup time, and make sure the `/health` endpoint returns 200 only after full initialization. Flag names can differ between gcloud releases, so confirm them with `gcloud run services update --help` before running the command.** (90% success)
   ```
   gcloud run services update my-service --startup-probe-initial-delay=60 --startup-probe-period=10 --startup-probe-failure-threshold=6
   ```
2. **Implement a health check endpoint that returns 503 until the application is ready, then 200. Set `app_ready = True` after initialization completes. Example in Python Flask:** (95% success)
   ```
   @app.route('/health')
   def health():
       # 200 once initialization has finished, 503 until then
       return ('OK', 200) if app_ready else ('Service Unavailable', 503)
   ```
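
The readiness pattern from workaround 2 only helps if the server is already listening while initialization runs. The following is a minimal, dependency-free sketch of that idea using Python's standard library instead of Flask. The names `app_ready`, `health_status`, `initialize`, and `main` are illustrative assumptions, and the body of `initialize` is a placeholder for the real startup work.

```python
import os
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical readiness flag, set once initialization has finished.
app_ready = threading.Event()


def health_status() -> tuple[int, bytes]:
    """Return 200 once the app is ready, 503 while it is still initializing."""
    if app_ready.is_set():
        return 200, b"OK"
    return 503, b"Service Unavailable"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        status, body = health_status()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe requests out of the logs


def initialize() -> None:
    # Placeholder for slow startup work (loading models, warming caches).
    app_ready.set()


def main() -> None:
    # Cloud Run passes the listening port in the PORT environment variable.
    port = int(os.environ.get("PORT", "8080"))
    server = ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
    # Initialize in the background so probes get a 503 instead of a refused connection.
    threading.Thread(target=initialize, daemon=True).start()
    server.serve_forever()
```

Calling `main()` from the container entrypoint starts the server first and the initialization second, so the startup probe sees 503 responses during loading and a 200 as soon as `initialize` completes.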

## Dead Ends

- **Increasing CPU or memory**: More resources may not reduce startup time if the application has a fixed initialization delay (e.g., loading ML models). (60% fail)
- **Removing the startup probe**: Without a custom probe, Cloud Run falls back to its default startup check, so the container may still be killed before it finishes initializing. (80% fail)
- **Shortening the probe period**: Too short a period causes rapid retries that may overwhelm the application during startup, worsening the issue. (75% fail)
