policy resource_error ai_generated true

Cloud Run service deployment fails with "Quota exceeded for resource: 'projects/my-project/regions/us-central1/quotas/max-instances'. Limit: 100, Usage: 100"

ID: policy/cloud-run-max-instances-quota-exceeded

Fix Rate: 85%
Confidence: 85%
Evidence: 1
First Seen: 2024-03-15

Version Compatibility

Version | Status | Introduced | Deprecated | Notes
Cloud Run API v1 | active | - | - | -
gcloud CLI 400.0.0+ | active | - | - | -
Google Cloud Console (as of 2024) | active | - | - | -

Root Cause

The Cloud Run service's max-instances setting (or the sum of all services' max-instances in the region) exceeds the regional quota for max instances, which is typically 100 by default.
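
The arithmetic behind this error can be sketched in a few lines of Python. This is a minimal illustration, assuming the quota is counted as the sum of max-instances across all services in the region, as described above. The service names and values are hypothetical, and `headroom` and `deployment_fits` are illustrative helpers, not part of any Google Cloud library.

```python
def headroom(max_instances_by_service, quota_limit=100):
    """Instances still available under the regional quota (negative if over)."""
    return quota_limit - sum(max_instances_by_service.values())


def deployment_fits(max_instances_by_service, new_service_max, quota_limit=100):
    """True if a new service with new_service_max fits in the remaining quota."""
    return new_service_max <= headroom(max_instances_by_service, quota_limit)


# Hypothetical region that matches the error message: Limit 100, Usage 100.
existing = {"api": 60, "worker": 40}

print(headroom(existing))             # 0
print(deployment_fits(existing, 10))  # False
```

With usage already equal to the limit, any new service with a positive max-instances value fails the check, which is the situation the error message reports.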



Official Documentation

https://cloud.google.com/run/quotas

Workarounds

  1. 90% success: List all Cloud Run services in the region with `gcloud run services list --region=us-central1`, sum their max-instances values, and lower max-instances on one or more services so that the total, including the service being deployed, fits within the quota (100 in this example). Then redeploy.
  2. 70% success: Request a quota increase in the Google Cloud Console: go to IAM & Admin > Quotas, find 'Max instances per region' for Cloud Run, and request a higher limit (e.g., 200). Approval may take 1-2 business days.
  3. 30% success: Set max-instances to 0 (treated here as meaning no limit), and only for a service that is CPU-throttled and can tolerate cold starts. This bypasses the quota check only while total usage is below the limit; if the total is already at the limit, as in the error above, the deployment still fails.
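
The summing step in workaround 1 could be scripted roughly as follows. This sketch assumes the service list has been exported with `gcloud run services list --region=us-central1 --format=json`, and that each service records its limit in the `autoscaling.knative.dev/maxScale` annotation on the revision template, as in the Cloud Run Admin API v1 resource format. Treating a missing annotation as 100 is an assumption, and the helper name and sample data are hypothetical.

```python
import json

MAX_SCALE_KEY = "autoscaling.knative.dev/maxScale"


def max_instances_by_service(services, default_max=100):
    """Map each service name to its configured max-instances value."""
    result = {}
    for svc in services:
        template_meta = svc.get("spec", {}).get("template", {}).get("metadata", {})
        annotations = template_meta.get("annotations") or {}
        # Assumption: a service without the annotation counts as default_max.
        result[svc["metadata"]["name"]] = int(annotations.get(MAX_SCALE_KEY, default_max))
    return result


# Hypothetical sample in the shape of `gcloud run services list --format=json`.
SAMPLE = json.loads("""
[
  {"metadata": {"name": "api"},
   "spec": {"template": {"metadata": {"annotations":
     {"autoscaling.knative.dev/maxScale": "60"}}}}},
  {"metadata": {"name": "worker"},
   "spec": {"template": {"metadata": {"annotations":
     {"autoscaling.knative.dev/maxScale": "40"}}}}}
]
""")

limits = max_instances_by_service(SAMPLE)
total = sum(limits.values())
print(limits)                           # {'api': 60, 'worker': 40}
print(f"total max-instances: {total}")  # total max-instances: 100
```

To run this against real data, replace `SAMPLE` with the parsed contents of the exported JSON file. The per-service breakdown shows which services hold most of the quota and are therefore the best candidates for a lower max-instances value.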


Dead Ends

Common approaches that don't work:

  1. 60% fail: Reducing max-instances on only one service. The quota error concerns the sum of max-instances across all services in the region, so lowering a single service may not bring the total under the limit if other services have high values.

  2. 80% fail: Deleting and recreating the service. The quota is persistent and region-level; usage does not drop until the service is fully removed, and the new deployment still fails if the quota is exceeded.

  3. 90% fail: This will immediately push usage even further over the quota, causing the same error. The quota limit is enforced at deployment time.