# AWS Lambda SQS trigger: partial batch failures are not reported, so every message becomes visible again after a processing failure

- **ID:** `cloud/aws-lambda-sqs-batch-partial-failure`
- **Domain:** cloud
- **Category:** config_error
- **Error code:** `Lambda.SQS.PartialBatchFailure`
- **Verification level:** ai_generated
- **Fix rate:** 90%

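## Root cause

When an SQS queue triggers a Lambda function in batches and the function fails on some messages without reporting the specific failures through `ReportBatchItemFailures`, Lambda treats the whole batch as failed. Every message in the batch becomes visible in the queue again and is retried, so messages that were already processed successfully are processed a second time.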
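The behaviour described above can be sketched as a small model. This is an illustration only: `messagesMadeVisibleAgain` and the `outcome` shape are hypothetical names made up for this sketch, not part of any AWS API, and the rules are a simplified reading of how the event source mapping treats a batch.

```javascript
// Simplified model of which messages return to the queue after one
// invocation. Hypothetical helper for illustration, not an AWS API.
function messagesMadeVisibleAgain(batchIds, outcome, reportBatchItemFailuresEnabled) {
  // A thrown error fails the whole batch, whatever the settings are.
  if (outcome.threw) return [...batchIds];

  // Without ReportBatchItemFailures the return value is ignored:
  // a normal return means every message in the batch is deleted.
  if (!reportBatchItemFailuresEnabled) return [];

  const failures = (outcome.response && outcome.response.batchItemFailures) || [];
  const reported = failures.map((f) => f.itemIdentifier);

  // An empty identifier, or one that is not in the batch, makes the
  // whole batch count as failed.
  const invalid = reported.some((id) => !id || !batchIds.includes(id));
  if (invalid) return [...batchIds];

  // Otherwise only the reported messages are retried.
  return batchIds.filter((id) => reported.includes(id));
}
```

With a batch of three messages where only the second one fails, a thrown error sends all three back to the queue, while a partial batch response that names the second message sends back only that one.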

## Version compatibility

| Version | Status | Introduced | Deprecated |
|------|------|------|------|
| AWS Lambda: runtime >= Node.js 18.x | active | — | — |
| AWS SDK: >= 3.300.0 | active | — | — |
| SQS: standard queue | active | — | — |

## Solution

1. Return a partial batch response from the Lambda function and enable `ReportBatchItemFailures` on the event source mapping. In Node.js, return `{ batchItemFailures: [ { itemIdentifier: failedMessage.messageId } ] }` with one entry per failed message, and configure the event source mapping with `FunctionResponseTypes: ["ReportBatchItemFailures"]`. Both parts are needed: without the mapping setting, the returned value is ignored.
2. Configure a dead-letter queue (DLQ) on the source SQS queue to capture messages that still fail after the maximum number of retries, and process them separately.
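The first step can be sketched as a Node.js handler along the following lines. It is a minimal sketch, assuming a standard queue and that the event source mapping already has `FunctionResponseTypes: ["ReportBatchItemFailures"]` set; `processMessage` and its `shouldFail` flag are hypothetical stand-ins for real business logic.

```javascript
// Hypothetical business logic: replace with the real processing step.
async function processMessage(body) {
  const payload = JSON.parse(body); // throws on malformed JSON
  if (payload.shouldFail) {
    throw new Error('simulated processing failure');
  }
}

// SQS-triggered handler that reports only the records that failed.
async function handler(event) {
  const batchItemFailures = [];

  for (const record of event.Records) {
    try {
      await processMessage(record.body);
    } catch (err) {
      // Log the failure so it stays visible, then report this one
      // message instead of failing the whole batch.
      console.error(`Failed to process message ${record.messageId}:`, err);
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }

  // An empty list tells Lambda that the whole batch succeeded.
  return { batchItemFailures };
}

// Export for the Lambda runtime when loaded as a CommonJS module.
if (typeof module !== 'undefined') {
  module.exports = { handler };
}
```

Catching errors per record is what keeps one bad message from forcing a retry of its neighbours. The caught error is logged and reported rather than dropped, so the failed message is still retried and can eventually reach the DLQ from step 2.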

## Ineffective attempts

- ****: Reduces throughput significantly and does not solve the root cause, which is the missing failure reporting. A message that fails is still retried repeatedly on its own. (70% failure rate)
- ****: Silently swallows errors, leading to data loss and no visibility into processing failures. (90% failure rate)
- ****: Does not address partial failure reporting. Successfully processed messages may still be reprocessed after the timeout expires. (50% failure rate)
