System Info
- `transformers` version: 4.57.1
- Platform: Linux-6.8.0-84-generic-x86_64-with-glibc2.35
- Python version: 3.12.7
- Huggingface_hub version: 0.36.0
- Safetensors version: 0.6.2
- Accelerate version: not installed
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.9.0+cu128 (NA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Python script for reproduction:
```python
from transformers import AutoConfig, AutoModelForCausalLM, GenerationConfig, AutoTokenizer

model_id = "LiquidAI/LFM2-350M"
config = AutoConfig.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.padding_side = "left"
input_ids = tokenizer(["Today is a nice day and I am longer", "This is me"], return_tensors="pt", padding=True)
gen_config = GenerationConfig(
    max_new_tokens=30,
    min_new_tokens=30,
    num_beams=2,
)
output_ids = model.generate(**input_ids, generation_config=gen_config)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))
```
Generation fails with the following error:
```
Traceback (most recent call last):
  File "/home/panas/git/optimum-test/main_repr.py", line 17, in <module>
    output_ids = model.generate(**input_ids, generation_config=gen_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 2564, in generate
    result = decoding_method(
             ^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 3377, in _beam_search
    model_kwargs["past_key_values"].reorder_cache(beam_idx)
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/models/lfm2/modeling_lfm2.py", line 216, in reorder_cache
    self.key_cache[layer_idx] = self.key_cache[layer_idx].index_select(0, beam_idx.to(device))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index out of range in self

Process finished with exit code 1
```
If `num_beams=2` is removed from the config, generation works normally, so the problem appears to be related to beam search.
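For context, the traceback shows `reorder_cache` calling `index_select` along the batch dimension with the `beam_idx` produced by beam search. A minimal sketch in plain PyTorch (not the library code) of how that call fails when the cache tensor's batch dimension has not been expanded to `batch_size * num_beams`:

```python
import torch

# Hypothetical shapes for illustration: beam search with batch_size=2 and
# num_beams=2 produces beam indices in range [0, 4), but if a cache tensor
# still has batch dimension 2, index_select raises the error from the issue.
key_cache = torch.zeros(2, 4, 8)            # batch dim is 2, not 2 * num_beams
beam_idx = torch.tensor([0, 1, 2, 3])       # indices 2 and 3 are out of range

try:
    key_cache.index_select(0, beam_idx)
except IndexError as e:
    print("IndexError:", e)                 # "index out of range in self"

# With the correctly expanded batch dimension, the same call succeeds.
expanded_cache = torch.zeros(4, 4, 8)
reordered = expanded_cache.index_select(0, beam_idx)
print(reordered.shape)                      # torch.Size([4, 4, 8])
```

This is only an illustration of the failure mode; whether the LFM2 cache misses the beam expansion or `beam_idx` is computed for the wrong batch size would need to be confirmed in `modeling_lfm2.py`.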
Expected behavior
`generate()` is expected to not fail and to return the generated tokens.