Lfm2 fails with beam search #42257

@popovaan

Description

System Info

  • transformers version: 4.57.1
  • Platform: Linux-6.8.0-84-generic-x86_64-with-glibc2.35
  • Python version: 3.12.7
  • Huggingface_hub version: 0.36.0
  • Safetensors version: 0.6.2
  • Accelerate version: not installed
  • Accelerate config: not found
  • DeepSpeed version: not installed
  • PyTorch version (accelerator?): 2.9.0+cu128 (NA)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Python script for reproduction:

from transformers import AutoConfig, AutoModelForCausalLM, GenerationConfig, AutoTokenizer

model_id = "LiquidAI/LFM2-350M"
config = AutoConfig.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
tokenizer = AutoTokenizer.from_pretrained(model_id)

tokenizer.padding_side = "left"
input_ids = tokenizer(["Today is a nice day and I am longer", "This is me"], return_tensors="pt", padding=True)

gen_config = GenerationConfig(
    max_new_tokens=30,
    min_new_tokens=30,
    num_beams=2,
)

output_ids = model.generate(**input_ids, generation_config=gen_config)
print(tokenizer.decode(output_ids[0], skip_special_tokens=False))

generate() fails with the following error:

Traceback (most recent call last):
  File "/home/panas/git/optimum-test/main_repr.py", line 17, in <module>
    output_ids = model.generate(**input_ids, generation_config=gen_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 2564, in generate
    result = decoding_method(
             ^^^^^^^^^^^^^^^^
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/generation/utils.py", line 3377, in _beam_search
    model_kwargs["past_key_values"].reorder_cache(beam_idx)
  File "/home/panas/venv/new_py3_12/lib/python3.12/site-packages/transformers/models/lfm2/modeling_lfm2.py", line 216, in reorder_cache
    self.key_cache[layer_idx] = self.key_cache[layer_idx].index_select(0, beam_idx.to(device))
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index out of range in self

Process finished with exit code 1

If num_beams=2 is removed from the config, generation works normally, so the problem appears to be related to beam search.
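The traceback hints at the likely mismatch: reorder_cache calls index_select(0, beam_idx), where beam_idx holds batch_size * num_beams indices, but the LFM2 cache tensors seem to still only have batch_size rows (i.e. they were never expanded for the beams). A minimal sketch of that mismatch, using plain Python lists as stand-ins for the cache tensors (this is an illustration of the hypothesis, not the actual transformers code):

```python
# Hypothetical reproduction of the index mismatch: the cache holds one
# entry per batch item, but beam search reorders it with
# batch_size * num_beams indices.
batch_size, num_beams = 2, 2

# Cache as it would look if it was never expanded for beam search.
cache = [f"state_{i}" for i in range(batch_size)]          # 2 entries

# beam_idx as beam search produces it: one index per (batch, beam) pair.
beam_idx = list(range(batch_size * num_beams))             # indices 0..3

try:
    reordered = [cache[i] for i in beam_idx]               # mimics index_select(0, beam_idx)
except IndexError as e:
    print(f"IndexError: {e}")
```

Indices 2 and 3 fall outside the 2-entry cache, which matches the "index out of range in self" failure in the traceback.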

Expected behavior

generate() is expected to not fail and to return the generated tokens.
