
@prashantpandeygit

What does this PR do?

This PR fixes a runtime error in cache reordering that occurs when keys and values are on different devices.

While running beam search, I hit the error attached below during the cache operations:

Screenshot 2026-01-08 210423

This happened because beam_idx was being moved across devices without accounting for the fact that keys and values can be on different devices during memory offloading.

The change simply reads the device from keys and values separately, which resolves the mismatch. Hope this helps!
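As a rough illustration of the idea (a standalone sketch, not the actual transformers code), the reordering can move beam_idx to each tensor's own device before indexing. The function name `reorder_cache` and the toy tensors are hypothetical and exist only for this example; the second device is used only if CUDA happens to be available.

```python
import torch


def reorder_cache(keys, values, beam_idx):
    """Hypothetical helper: reorder the batch dimension of keys and values.

    beam_idx is moved to each tensor's own device, so keys and values
    may live on different devices (e.g. when one of them is offloaded).
    """
    keys = keys.index_select(0, beam_idx.to(keys.device))
    values = values.index_select(0, beam_idx.to(values.device))
    return keys, values


# Toy cache with 3 beams and a feature size of 2.
keys = torch.arange(6.0).reshape(3, 2)
values = torch.arange(6.0).reshape(3, 2) * 10

# Assumption: put values on a second device only when one is available.
if torch.cuda.is_available():
    values = values.to("cuda")

beam_idx = torch.tensor([2, 0, 0])
new_keys, new_values = reorder_cache(keys, values, beam_idx)
```

Each output stays on the device of its input tensor, and only the small index tensor is copied between devices.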

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@github-actions
Contributor

github-actions bot commented Jan 8, 2026

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43177&sha=2f1a9a

@prashantpandeygit
Author

@vasqu kindly review when you get a moment, thanks!

Comment on lines -80 to +82
- self.keys = self.keys.index_select(0, beam_idx.to(self.keys.device))
- self.values = self.values.index_select(0, beam_idx.to(self.values.device))
+ # Get device for each tensor and then move beam_idx to that device
+ keys_device = self.keys.device
+ self.keys = self.keys.index_select(0, beam_idx.to(keys_device))
Contributor


Isn't that exactly the same as what's already happening?

self.keys = self.keys.index_select(0, beam_idx.to(self.keys.device))

vs

keys_device = self.keys.device
self.keys = self.keys.index_select(0, beam_idx.to(keys_device))

Do you have a reproducer to check what's going on?
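
For reference, a small standalone check along these lines (a sketch using a made-up tensor on a single device, not the cache class itself) would be expected to show the two forms producing identical results, since both read the device from the same tensor just before indexing:

```python
import torch

# Hypothetical stand-in for self.keys; shape and values are arbitrary.
keys = torch.randn(4, 3)
beam_idx = torch.tensor([3, 1, 1, 0])

# Form 1: device looked up inline.
inline = keys.index_select(0, beam_idx.to(keys.device))

# Form 2: device stored in a local variable first.
keys_device = keys.device
via_local = keys.index_select(0, beam_idx.to(keys_device))

same = torch.equal(inline, via_local)
```

This only covers the single-device case; it does not reproduce the offloading setup from the report.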
