System Info
- `transformers` version: 4.57.1
- Platform: Linux-6.8.0-84-generic-x86_64-with-glibc2.35
- Python version: 3.12.7
- Huggingface_hub version: 0.36.0
- Safetensors version: 0.6.2
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.9.0+cu128 (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Python script for reproduction:
```python
from transformers import AutoConfig, AutoModelForCausalLM, GenerationConfig, AutoTokenizer

model_id = "LiquidAI/LFM2-350M"
config = AutoConfig.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"
input_ids = tokenizer(["Today is a nice day and I am longer", "This is me"], return_tensors="pt", padding=True)
gen_config = GenerationConfig(
    max_new_tokens=30,
    min_new_tokens=30,
    num_beams=2,
)
output_ids = model.generate(**input_ids, generation_config=gen_config)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```
Generation fails with the following error:
```
Traceback (most recent call last):
  File "/home/panas/git/optimum-test/main_repr.py", line 17, in <module>
    output_ids = model.generate(**input_ids, generation_config=gen_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 2564, in generate
    result = decoding_method(
             ^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 3377, in _beam_search
    model_kwargs["past_key_values"].reorder_cache(beam_idx)
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/models/lfm2/modeling_lfm2.py", line 216, in reorder_cache
    self.key_cache[layer_idx] = self.key_cache[layer_idx].index_select(0, beam_idx.to(device))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index out of range in self

Process finished with exit code 1
```
If `num_beams=2` is removed from the config, generation works normally, so the problem appears to be related to beam search.
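For context, the traceback shows `reorder_cache` calling `index_select` along the batch dimension with the `beam_idx` produced by beam search. A minimal sketch in plain PyTorch (not the library code) of how that call fails when the cache tensor's batch dimension has not been expanded to `batch_size * num_beams`:

```python
import torch

# Hypothetical shapes for illustration: beam search with batch_size=2 and
# num_beams=2 produces beam indices in range [0, 4), but if a cache tensor
# still has batch dimension 2, index_select raises the error from the issue.
key_cache = torch.zeros(2, 4, 8)            # batch dim is 2, not 2 * num_beams
beam_idx = torch.tensor([0, 1, 2, 3])       # indices 2 and 3 are out of range

try:
    key_cache.index_select(0, beam_idx)
except IndexError as e:
    print("IndexError:", e)                 # "index out of range in self"

# With the correctly expanded batch dimension, the same call succeeds.
expanded_cache = torch.zeros(4, 4, 8)
reordered = expanded_cache.index_select(0, beam_idx)
print(reordered.shape)                      # torch.Size([4, 4, 8])
```

This is only an illustration of the failure mode; whether the LFM2 cache misses the beam expansion or `beam_idx` is computed for the wrong batch size would need to be confirmed in `modeling_lfm2.py`.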
Expected behavior
`generate()` is expected to not fail and to return the generated tokens.