
How do I replace spare tokens? #31475

Open
@onaka-ga-pkpk

Description


System Info

I want to SFT Mistral-v0.3 with my own chat template.
So I followed this comment and replaced some of the spare [control_n] tokens with special tokens for the chat template.
However, the new tokens were appended to the vocabulary instead, and the vocabulary size increased.
Is there a way to replace existing tokens in the vocabulary in place?
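For context, this is a minimal sketch of the kind of in-place edit being attempted. The directory name, the exact [control_n] strings, and the assumption that the same strings also appear under model.vocab in tokenizer.json are all illustrative, not taken from the linked comment:

import json
from pathlib import Path

model_dir = Path("Mistral-7B-v0.3")  # assumed local checkpoint directory
replacements = {
    "[control_8]": "<|system|>",     # ids 10-13 in the files shown below
    "[control_9]": "<|user|>",
    "[control_10]": "<|assistant|>",
    "[control_11]": "<|eot|>",
}

tok_path = model_dir / "tokenizer.json"
tok = json.loads(tok_path.read_text())

# Rename the entries in added_tokens, keeping their original ids.
for entry in tok["added_tokens"]:
    if entry["content"] in replacements:
        entry["content"] = replacements[entry["content"]]

# If the same strings also live in the base vocabulary (model.vocab),
# they would need renaming there too so the new names map to the old ids.
vocab = tok["model"]["vocab"]
for old, new in replacements.items():
    if old in vocab:
        vocab[new] = vocab.pop(old)

tok_path.write_text(json.dumps(tok, indent=2, ensure_ascii=False))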

Who can help?

@ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

tokenizer.json

{
  "version": "1.0",
  "truncation": null,
  "padding": null,
  "added_tokens": [
    ...
    {
      "id": 10,
      "content": "<|system|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    {
      "id": 11,
      "content": "<|user|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    {
      "id": 12,
      "content": "<|assistant|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    {
      "id": 13,
      "content": "<|eot|>",
      "single_word": false,
      "lstrip": false,
      "rstrip": false,
      "normalized": false,
      "special": true
    },
    ...

tokenizer_config.json

{
  "add_bos_token": true,
  "add_eos_token": false,
  "add_prefix_space": true,
  "added_tokens_decoder": {
    ...
    "10": {
      "content": "<|system|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "11": {
      "content": "<|user|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "12": {
      "content": "<|assistant|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "13": {
      "content": "<|eot|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    ...
}

test code

from pprint import pprint
from transformers import AutoTokenizer

# model_dir points at the edited checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_dir)
pprint(tokenizer.added_tokens_decoder)

output

...
 768: AddedToken("[control_766]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 769: AddedToken("[control_767]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 770: AddedToken("[control_768]", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 32768: AddedToken("<|system|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 32769: AddedToken("<|user|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 32770: AddedToken("<|assistant|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
 32771: AddedToken("<|eot|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True)}
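The ids above 32767 show that the four tokens were appended rather than written over ids 10-13. A quick check makes the growth visible (the expected numbers assume Mistral-v0.3's base vocabulary of 32768 tokens, consistent with the output above):

print(len(tokenizer))                                 # 32772, not 32768
print(tokenizer.convert_tokens_to_ids("<|system|>"))  # 32768, not 10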

Expected behavior

[control_n] tokens can be replaced with arbitrary tokens, keeping their original ids and leaving the vocabulary size unchanged.
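In other words, after a successful replacement the following should hold (sketch, with the token names and the 32768 base vocabulary size assumed as above):

assert tokenizer.convert_tokens_to_ids("<|system|>") == 10
assert tokenizer.convert_tokens_to_ids("<|eot|>") == 13
assert len(tokenizer) == 32768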
