Retrieving embeddings results in warning that input tokens were not marked as input #14821

vlovich · 2025-07-22T18:05:15Z

I'm trying to use the llama.h library in my code to generate embeddings for a piece of text. When I run it, I get the warning "embeddings required but some input tokens were not marked as outputs", but I don't understand what I'm doing wrong. My steps are: I tokenize my input string (add_special = true), populate my batch with those tokens, call llama_decode on the batch, and then call embeddings_ith(-1) to get the embeddings for the last token.

Perhaps related, perhaps not: I'm also seeing the embedding values change on every run, which suggests I'm using the API incorrectly somehow. I'm pretty sure embeddings are deterministic for a given input and model, and I confirmed that llama-embedding and llama-server do generate deterministic embeddings.
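To make the batch-population step above concrete, here is a minimal sketch in C. Taken at face value, the warning text suggests that each token in the batch carries a per-token "output" flag and that some flags were left unset. The sketch does not use the real llama.h API: `sketch_batch`, `sketch_batch_fill_all_outputs`, and `sketch_batch_count_unmarked` are hypothetical stand-ins invented for illustration. In llama.h the corresponding per-token flag is presumably the `logits` array of `llama_batch`, but that mapping is an assumption, not something confirmed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-token fields of a decode batch.
 * This is NOT the real llama_batch; names and layout are illustrative. */
enum { SKETCH_MAX_TOKENS = 512 };

typedef struct {
    int32_t n_tokens;
    int32_t token[SKETCH_MAX_TOKENS];     /* token ids */
    int32_t pos[SKETCH_MAX_TOKENS];       /* position of each token in the sequence */
    bool    is_output[SKETCH_MAX_TOKENS]; /* per-token "marked as output" flag */
} sketch_batch;

/* Fill the batch with a single sequence and flag every token as an output.
 * Every field is written explicitly so nothing is left uninitialized. */
static void sketch_batch_fill_all_outputs(sketch_batch *b,
                                          const int32_t *tokens,
                                          int32_t n) {
    assert(n >= 0 && n <= SKETCH_MAX_TOKENS);
    b->n_tokens = n;
    for (int32_t i = 0; i < n; i++) {
        b->token[i]     = tokens[i];
        b->pos[i]       = i;
        b->is_output[i] = true;
    }
}

/* Count tokens that are not flagged as outputs. A non-zero result is the
 * condition the warning text appears to describe (an assumption). */
static int32_t sketch_batch_count_unmarked(const sketch_batch *b) {
    int32_t unmarked = 0;
    for (int32_t i = 0; i < b->n_tokens; i++) {
        if (!b->is_output[i]) {
            unmarked++;
        }
    }
    return unmarked;
}
```

Calling `sketch_batch_fill_all_outputs` on a token array and then checking that `sketch_batch_count_unmarked` returns 0 before decoding is one way to rule out unset flags as the cause. Whether that also explains the run-to-run variation in the embedding values is not established here.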
