LLaMA output embeddings shape keeps increasing with multiple image inputs unless model is reloaded #558

@simonefelicioni

Description

Hi, first of all, thank you for your amazing work!

We are encountering an issue related to memory usage and the shape of self.model.llama_model.output_embs. Here is the behavior we observe: we initialize the model, then input an image (via img_list) together with a prompt. We extract a result and check the shape of self.model.llama_model.output_embs. If we clear img_list and then input a second image and prompt, we correctly get a description of the new image.
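
For reference, here is a minimal sketch of our loop. This is only a sketch: Chat, CONV_VISION, upload_img, ask, and answer follow the demo code, and the exact names and signatures may differ.

```python
from PIL import Image

# model and vis_processor come from the usual initialization code.
chat = Chat(model, vis_processor, device="cuda:0")

for path in ["image_1.jpg", "image_2.jpg", "image_3.jpg"]:
    chat_state = CONV_VISION.copy()  # fresh conversation state per image
    img_list = []                    # img_list is cleared for each image

    chat.upload_img(Image.open(path).convert("RGB"), chat_state, img_list)
    chat.ask("Describe this image.", chat_state)
    answer = chat.answer(conv=chat_state, img_list=img_list)[0]

    # We inspect this shape after every image:
    print(chat.model.llama_model.output_embs.shape)
```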

However, the shape of self.model.llama_model.output_embs keeps growing with each new image processed, and repeating this process over many images eventually leads to memory issues.

On the other hand, if we fully reinitialize and reload the model before each image, then this issue does not occur, and the shape of output_embs stays consistent and reasonable across inputs. This behavior happens both in the Gradio demo and in our own custom Python script (without Gradio).

Our Questions:
Why does this happen?
Is there a variable or internal state in the model that needs to be reset between inputs, to avoid the accumulation in output_embs?

Our Goal:
Process multiple independent images (not frames of a video, just unrelated images), and save self.model.llama_model.output_embs for each one, without having to reload the entire model every time (which is quite time-consuming).
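
Concretely, we would like to end up with something along these lines. The reset step at the end of the loop is the part we are missing; the attribute manipulation shown is only our guess at what such a reset might look like.

```python
import torch

saved_embs = {}

for path in image_paths:
    # ... run one image through the model as in the sketch above ...
    saved_embs[path] = chat.model.llama_model.output_embs.detach().cpu().clone()

    # Hypothetical reset step we are asking about: clear whatever internal
    # buffer output_embs accumulates into, without reloading the weights.
    # Is something like this safe and sufficient?
    chat.model.llama_model.output_embs = None
    torch.cuda.empty_cache()
```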

We would really appreciate any advice on how to clear the internal state properly between runs. Thanks in advance for your help!
