aws resource_error ai_generated true

Stopped reason: Task stopped due to OOM (Out of Memory) kill

ID: aws/ecs-task-stopped-oom-kill

Fix Rate: 78%
Confidence: 85%
Evidence: 1
First Seen: 2023-06-15

Version Compatibility

Version                   Status   Introduced   Deprecated   Notes
AWS CLI 2.x               active   -            -            -
ECS API 2014-11-13        active   -            -            -
Amazon ECS Agent 1.66.0   active   -            -            -

Root Cause

The ECS task's memory allocation was exceeded, causing the Linux kernel OOM killer to terminate the container.
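
To confirm that a stopped task matches this root cause, the container details returned for the task can be inspected. The following is a minimal sketch, assuming the input dict has the shape of one task entry from an ECS DescribeTasks response (a 'containers' list whose items carry 'name', 'exitCode', and 'reason'). The helper name and the sample data are hypothetical, and the reason text shown is only an example of what ECS may report.

```python
def find_oom_killed_containers(task: dict) -> list[str]:
    """Return names of containers in a task description that look OOM-killed."""
    suspects = []
    for container in task.get("containers", []):
        reason = container.get("reason", "")
        exit_code = container.get("exitCode")
        # Exit code 137 is 128 + 9 (SIGKILL), which an OOM kill produces.
        # Any other SIGKILL looks the same, so the reason text is the
        # stronger signal when it is present.
        if "OutOfMemory" in reason or exit_code == 137:
            suspects.append(container.get("name", "<unnamed>"))
    return suspects


# Hypothetical sample data, not output captured from a real task.
sample_task = {
    "containers": [
        {
            "name": "web",
            "exitCode": 137,
            "reason": "OutOfMemoryError: Container killed due to memory usage",
        },
        {"name": "sidecar", "exitCode": 0},
    ]
}

print(find_oom_killed_containers(sample_task))
```

A container flagged only by exit code 137, with no reason text, may have been killed for another cause, so treat that case as a hint and not as proof.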

Official Documentation

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-stopped-reason.html

Workarounds

  1. (80% success) Increase the memory hard limit in the ECS task definition, for example by setting 'memory' to 2048 MiB, and make sure the soft limit 'memoryReservation' is at least 512 MiB. Then redeploy the service.
  2. (85% success) Monitor memory usage with CloudWatch Container Insights: enable it in the ECS cluster settings, then create a CloudWatch alarm on the 'MemoryUtilized' metric. If usage spikes, consider optimizing the application code to reduce its memory footprint.
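
The memory settings from the first workaround could be sketched as follows. This is an illustration only: the helper, the container name 'web', the image, and the family 'web-task' are hypothetical, and the values 2048 and 512 are the examples from the step above. It assumes that ECS expects the hard limit to be greater than the soft limit when both are set, so the helper rejects other combinations before anything is registered.

```python
import json


def container_memory_settings(name, image, hard_limit_mib, soft_limit_mib):
    """Build a container definition fragment with hard and soft memory limits."""
    # Assumption: when both limits are given, 'memory' must exceed
    # 'memoryReservation'; fail early if that does not hold.
    if soft_limit_mib >= hard_limit_mib:
        raise ValueError("memory (hard limit) must be greater than memoryReservation")
    return {
        "name": name,
        "image": image,
        "memory": hard_limit_mib,             # hard limit in MiB
        "memoryReservation": soft_limit_mib,  # soft limit in MiB
        "essential": True,
    }


# Hypothetical names; only the two memory values come from the workaround.
container = container_memory_settings("web", "example/web:latest", 2048, 512)
task_definition = {"family": "web-task", "containerDefinitions": [container]}

print(json.dumps(task_definition, indent=2))
```

The printed JSON is only a fragment. A real task definition usually needs more fields (network mode, roles, port mappings), and the service still has to be redeployed against the new revision.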

Dead Ends

Common approaches that don't work:

  1. (65% fail) Raising only the memory hard limit, without also adjusting the soft limit or the task memory reservation, can still end in an OOM kill once the container exceeds the new hard limit.
  2. (50% fail) Rebuilding the container with a different base image, without monitoring memory usage, does not address the root cause, whether that is a memory leak or high consumption.
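
The memory monitoring that these dead ends skip could be sketched as alarm parameters on the 'MemoryUtilized' metric. This is a minimal sketch under stated assumptions: that Container Insights publishes the metric in the 'ECS/ContainerInsights' namespace with 'ClusterName' and 'ServiceName' dimensions and reports megabytes per task, which should be verified against the metrics visible in your account. The helper name, alarm name, cluster and service names, the 80% fraction, and the period settings are all hypothetical choices.

```python
def memory_alarm_params(cluster, service, hard_limit_mib, fraction=0.8):
    """Build keyword arguments for a CloudWatch alarm on ECS memory usage."""
    return {
        "AlarmName": f"{service}-memory-high",
        # Assumed namespace and dimensions for Container Insights metrics.
        "Namespace": "ECS/ContainerInsights",
        "MetricName": "MemoryUtilized",
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "ServiceName", "Value": service},
        ],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per datapoint
        "EvaluationPeriods": 3,  # consecutive breaching periods before alarming
        # Alarm before the hard limit is reached, at a fraction of it.
        "Threshold": hard_limit_mib * fraction,
        "ComparisonOperator": "GreaterThanThreshold",
    }


params = memory_alarm_params("demo-cluster", "web-service", hard_limit_mib=2048)
print(params["AlarmName"], params["Threshold"])

# With boto3 installed and credentials configured, these parameters could be
# passed to CloudWatch like this (not executed here):
#   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Alarming below the hard limit leaves time to look for a leak before the kernel kills the container, which a rebuild or a larger limit alone would not reveal.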