# ValueError: Adding special tokens to a tokenizer that already has them; use `add_special_tokens=True` only if you intend to add new tokens. Got extra_ids=0 but tokenizer already has 2 special tokens.

- **ID:** `huggingface/tokenizer-extra-special-tokens-invalid`
- **Domain:** huggingface
- **Category:** config_error
- **Verification:** ai_generated
- **Fix Rate:** 90%

## Root Cause

The code calls `tokenizer.add_special_tokens()` with an empty or redundant special-tokens dictionary even though the tokenizer already defines those tokens, which triggers a validation error.

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| tokenizers 0.19.1 | active | — | — |
| transformers 4.44.0 | active | — | — |

## Workarounds

1. **Check existing special tokens before adding: if `tokenizer.special_tokens_map` already contains the tokens, skip the `add_special_tokens` call entirely.** (90% success)
2. **Use `tokenizer.add_special_tokens({'additional_special_tokens': ['<new_token>']})` only for truly new tokens, not duplicates.** (85% success)
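
Both workarounds come down to filtering the requested tokens against what the tokenizer already holds. The following is a minimal sketch, not a definitive implementation. The helper name `add_missing_special_tokens` is hypothetical, and the tokenizer is assumed to expose `all_special_tokens` and `add_special_tokens` as Hugging Face tokenizers do.

```python
from typing import Iterable, List


def add_missing_special_tokens(tokenizer, candidates: Iterable[str]) -> List[str]:
    """Register only the candidates the tokenizer does not already treat as special.

    `tokenizer` is assumed to expose `all_special_tokens` and
    `add_special_tokens`, as Hugging Face tokenizers do.
    Returns the list of tokens that were actually added.
    """
    existing = set(tokenizer.all_special_tokens)
    new_tokens: List[str] = []
    for token in candidates:
        # Skip tokens that are already registered, and duplicates within `candidates`.
        if token not in existing and token not in new_tokens:
            new_tokens.append(token)

    # Only touch the tokenizer when there is something new to add.
    if new_tokens:
        tokenizer.add_special_tokens({"additional_special_tokens": new_tokens})
    return new_tokens


# Hypothetical usage (requires `transformers` and a downloadable checkpoint):
# from transformers import AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# added = add_missing_special_tokens(tokenizer, ["[CLS]", "<new_token>"])
# if added:
#     model.resize_token_embeddings(len(tokenizer))  # `model` is assumed to exist
```

Returning the list of added tokens lets the caller decide whether the model's embedding matrix needs resizing. An empty list means the call was skipped and nothing changed.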

## Dead Ends

- **Toggling the `add_special_tokens` argument**: This argument of the encode call controls whether special tokens such as `[CLS]` and `[SEP]` are inserted into the encoded sequence. It does not check for duplicates, so the error persists. (100% fail)
- **Deleting built-in special tokens and re-adding them**: Removing built-in special tokens (such as `[CLS]` and `[SEP]`) can break tokenizer functionality, and re-adding them may still fail if they are already present in the base tokenizer. (40% fail)
