# InternalError: TF_DATA cache file '/tmp/tf_data_cache_abc123' is corrupted: expected header size 1024 but got 512

- **ID:** `tensorflow/tf-data-cache-corruption`
- **Domain:** tensorflow
- **Category:** data_error
- **Error Code:** `TDC`
- **Verification:** ai_generated
- **Fix Rate:** 95%

## Root Cause

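The on-disk cache file written by `tf.data.Dataset.cache(filename)` was only partially written, typically because the process was terminated abruptly or the disk filled up mid-write. On the next run, TensorFlow reads a truncated header (512 bytes instead of the expected 1024) and rejects the file as corrupted.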
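
Since a full disk is one of the two triggers named above, a pre-flight check before building the cached pipeline can surface the condition early. The following is a minimal sketch only: the helper name `has_room_for_cache` is hypothetical (it is not a TensorFlow API), and the number of bytes to require is an assumption you would need to estimate from your own dataset.

```python
import os
import shutil


def has_room_for_cache(cache_path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding cache_path has enough free space.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    # Inspect the directory that will contain the cache file. abspath resolves
    # a bare filename against the current working directory.
    cache_dir = os.path.dirname(os.path.abspath(cache_path))
    return shutil.disk_usage(cache_dir).free >= required_bytes
```

A caller could skip the file-based cache, or fail fast with a clear message, when this returns `False`. The check is advisory: free space can still run out while the cache is being written.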

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| tensorflow>=2.15.0 | active | — | — |
| python>=3.10 | active | — | — |

## Workarounds

1. **Delete the corrupted cache file, then re-run the pipeline so the cache is regenerated.** Use the path reported in the error message. (95% success)
   ```
   rm /tmp/tf_data_cache_abc123
   ```
2. **Stop using the file-based cache: either remove the `.cache()` call, or call `.cache()` with no `filename` argument so elements are cached in memory instead of on disk.** In-memory caching only suits datasets that fit in RAM. (90% success)
   ```
   dataset = dataset.cache()  # no filename: cache in memory rather than on disk
   ```
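
The first workaround can also be scripted so that stale cache files are cleared before the pipeline is built. The sketch below assumes that the file-based cache may leave companion files sharing the same path prefix (for example index, data, or lock files); that layout is an assumption here, so check what actually exists on disk. The helper name `remove_cache_files` is hypothetical and not a TensorFlow API.

```python
import glob
import os


def remove_cache_files(cache_prefix: str) -> list[str]:
    """Delete every regular file whose path starts with cache_prefix.

    Returns the sorted list of removed paths.
    Hypothetical helper for illustration; not part of TensorFlow.
    """
    removed = []
    # glob.escape keeps special characters in the prefix from being
    # interpreted as wildcard patterns.
    for path in glob.glob(glob.escape(cache_prefix) + "*"):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return sorted(removed)
```

For the path in this error, that would be `remove_cache_files("/tmp/tf_data_cache_abc123")`, called before the dataset is constructed. Because the match is by prefix, an unrelated file such as `/tmp/tf_data_cache_abc1234` would also be removed; giving each cache its own directory avoids that.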

## Dead Ends

- **Increasing the cache size, for example via a setting such as `tf.data.experimental.service.CACHE_MAX_SIZE`.** The error is about file corruption, not capacity, so a larger cache does not repair a corrupted file header. This setting also does not appear to be part of the documented `tf.data` API. (80% fail)
- **Reinstalling TensorFlow to fix the cache mechanism.** The corruption lives in the cache file on disk, not in the TensorFlow installation, and reinstalling does not remove the corrupted file. (95% fail)
