# TaskCancellationException: task [id:12345] cancelled with reason [timeout] while waiting for completion

- **ID:** `elasticsearch/task-cancellation-exception`
- **Domain:** elasticsearch
- **Category:** runtime_error
- **Error Code:** `TASK_CANCELLED`
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

A long-running task (e.g., reindex, snapshot) exceeded the configured timeout and was forcibly cancelled by the cluster.

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Elasticsearch 7.17 | active | — | — |
| Elasticsearch 8.5 | active | — | — |
| Elasticsearch 8.10 | active | — | — |

## Workarounds

1. **Run the task asynchronously and wait on it with a longer timeout: start it with `wait_for_completion=false`, then call `GET _tasks/<task_id>?wait_for_completion=true&timeout=2h`. Note that `POST _tasks/_cancel` cancels tasks and does not extend any timeout** (80% success)
   ```
   POST _reindex?wait_for_completion=false
   GET _tasks/<task_id>?wait_for_completion=true&timeout=2h
   ```
2. **Reduce the batch size of the task: for reindex, set `source.size` (default 1000) to a smaller value such as 500; the index names here are placeholders** (85% success)
   ```
   POST _reindex
   {"source": {"index": "old_index", "size": 500}, "dest": {"index": "new_index"}}
   ```
3. **If the task involves heavy aggregation, raise the `search.max_buckets` cluster setting** (75% success)
   ```
   PUT _cluster/settings
   {"persistent": {"search.max_buckets": <new_limit>}}
   ```
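
As an illustration of running a reindex asynchronously with a smaller batch size, the following Python sketch builds the two requests involved. It only constructs the HTTP method, path, and body; nothing is sent to a cluster. The function names and the `old_index` and `node1:12345` values are hypothetical, and the sketch assumes the reindex is started with `wait_for_completion=false` and then awaited through the task management API.

```python
import json
from urllib.parse import urlencode


def build_reindex_request(source_index, dest_index, batch_size=500):
    """Return (method, path, body) for an asynchronous reindex.

    batch_size maps to source.size, the number of documents per scroll batch.
    """
    query = urlencode({"wait_for_completion": "false"})
    body = {
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index},
    }
    return "POST", f"/_reindex?{query}", json.dumps(body)


def build_task_wait_request(task_id, timeout="2h"):
    """Return (method, path) that blocks until the task finishes or timeout elapses."""
    query = urlencode({"wait_for_completion": "true", "timeout": timeout})
    return "GET", f"/_tasks/{task_id}?{query}"


if __name__ == "__main__":
    # Hypothetical index names and task ID, for illustration only.
    print(build_reindex_request("old_index", "new_index"))
    print(build_task_wait_request("node1:12345"))
```

The returned tuples can be passed to any HTTP client. When a reindex is started with `wait_for_completion=false`, the response contains a `task` field whose value is the ID to pass to `build_task_wait_request`.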

## Dead Ends

- **Increasing a task timeout in `elasticsearch.yml` (e.g., `task.timeout: 60m`) without analyzing actual task duration:** `task.timeout` is not among the documented Elasticsearch settings, and even where a longer timeout can be applied, it may mask underlying performance issues (e.g., slow disk, insufficient memory) and cause cascading failures. (70% fail)
- **Restarting the cluster to clear all tasks:** Restarting drops all running tasks, but the error will recur if the root cause (e.g., slow shard recovery) is not addressed. (80% fail)
- **Setting `task.timeout: 0` to disable the timeout:** This can leave tasks hanging indefinitely and exhaust resources, because the cluster never cancels stuck tasks. (90% fail)
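
Before changing any timeout, it helps to measure how long tasks actually run. The following is a minimal sketch, assuming a response shaped like the output of `GET _tasks?detailed=true` (a `nodes` map whose entries each hold a `tasks` map with `action` and `running_time_in_nanos`). The function name and the sample data are made up for illustration.

```python
def longest_running_tasks(tasks_response, top=3):
    """Return up to `top` (task_id, action, seconds) tuples, longest-running first."""
    rows = []
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            seconds = task["running_time_in_nanos"] / 1e9
            rows.append((task_id, task["action"], seconds))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # Hypothetical response fragment; real responses carry more fields per task.
    sample = {
        "nodes": {
            "node1": {
                "tasks": {
                    "node1:12345": {
                        "action": "indices:data/write/reindex",
                        "running_time_in_nanos": 7_200_000_000_000,
                    },
                    "node1:12346": {
                        "action": "cluster:monitor/tasks/lists",
                        "running_time_in_nanos": 1_500_000_000,
                    },
                }
            }
        }
    }
    for task_id, action, seconds in longest_running_tasks(sample):
        print(f"{task_id}  {action}  {seconds:.1f}s")
```

Comparing these durations against the timeout in effect shows whether the task is genuinely slow or the timeout is simply too short for the workload.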
