# HTTP 503 Service Unavailable: The request failed because the service is scaling up. Try again later.

- **ID:** `cloud/gcp-cloud-run-cold-start-http-503`
- **Domain:** cloud
- **Category:** runtime_error
- **Error Code:** `HTTP 503`
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

A request arrives when no warm instance is available, and Cloud Run's cold start latency (container image pull plus application startup) exceeds the request timeout, so the load balancer returns HTTP 503 before the new container is ready to serve traffic.

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Cloud Run (fully managed) gen2 | active | — | — |
| Cloud Run for Anthos 1.28 | active | — | — |

## Workarounds

1. **Set min instances to at least 1 so a warm instance is always ready. For production, use 2-3 to absorb traffic spikes.** (85% success)
   ```
   gcloud run deploy SERVICE --min-instances 1
   ```
2. **Optimize container startup: use distroless base images, reduce image size, and move initialization to a background thread. For Java, pair a distroless base image with Spring Boot's lazy initialization.** (80% success)
   ```
   FROM gcr.io/distroless/java17-debian11
   # Spring Boot lazy initialization: set spring.main.lazy-initialization=true
   ```
3. **Enable startup CPU boost to allocate additional CPU during container startup.** (75% success)
   ```
   gcloud run deploy SERVICE --cpu-boost
   ```
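
The "move initialization to a background thread" advice in workaround 2 can be sketched as follows. This is a minimal illustration using only the Python standard library, not a production server: the names `warm_up`, `ready`, and `make_server` are hypothetical, and the 0.2 second sleep stands in for whatever slow startup work a real service does. The idea is that the listening socket is bound immediately while the slow work runs in parallel, and early requests wait briefly for it to finish.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Set once the (hypothetical) slow initialization has finished.
ready = threading.Event()


def warm_up() -> None:
    """Stand-in for slow startup work such as loading models or opening pools."""
    time.sleep(0.2)  # placeholder delay; replace with real initialization
    ready.set()


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Wait a bounded time for initialization instead of failing at once.
        if ready.wait(timeout=5):
            status, body = 200, b"ok\n"
        else:
            status, body = 503, b"still warming up\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:
        # Silence the default per-request logging to stderr.
        pass


def make_server(port: int) -> HTTPServer:
    """Start initialization in the background, then bind the socket right away."""
    threading.Thread(target=warm_up, daemon=True).start()
    return HTTPServer(("0.0.0.0", port), Handler)
```

In a container entrypoint this would typically be started with something like `make_server(int(os.environ.get("PORT", "8080"))).serve_forever()`, since Cloud Run passes the listening port in the `PORT` environment variable.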

## Dead Ends

- **Retry only:** Retrying the request without addressing cold start may eventually succeed, but it adds latency and cost. (70% fail)
- **Raise max instances:** Increasing max instances doesn't reduce cold start frequency; it only raises the ceiling on how many instances can run. (95% fail)
- **Rely on min instances of 1:** Setting min instances to 1 removes the cold start for the first instance, but requests still hit cold starts when every warm instance is busy and new ones must be started. (60% fail)
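
Retrying is a dead end when it is the only measure, but a client may still keep it as a safety net alongside the workarounds above. The sketch below shows one way to bound that cost with capped, jittered exponential backoff that retries only on HTTP 503. The function names, the default attempt count, and the delay values are illustrative assumptions, not Cloud Run recommendations.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Upper bound of the wait before each retry: base * 2**n, capped at cap."""
    return [min(cap, base * (2 ** n)) for n in range(attempts)]


def get_with_retry(url: str, attempts: int = 4, timeout: float = 10.0) -> bytes:
    """GET url, retrying only on HTTP 503 with jittered exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts - 1)
    for n in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            # Give up on non-503 errors or once the last attempt has failed.
            if err.code != 503 or n == attempts - 1:
                raise
            # Full jitter: sleep a random time up to the current bound.
            time.sleep(random.uniform(0, delays[n]))
    raise AssertionError("unreachable: the loop returns or raises")
```

With the defaults, the worst-case added wait is the sum of the first three bounds (0.5 + 1 + 2 seconds) on top of the request timeouts, which is the latency cost the dead-end entry warns about.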
