Skip to content

Conversation

@bpblanken
Copy link
Collaborator

@bpblanken bpblanken commented Nov 21, 2025

Unifies duplicate json prompting and settings definitions.

@bpblanken bpblanken changed the title chore: refactor default settings refactor: clean up default settings Nov 21, 2025
@bpblanken bpblanken changed the title refactor: clean up default settings refactor: clean up default settings & duplicate json prompting. Nov 22, 2025
@github-actions
Copy link

Code Coverage

Package Line Rate Health
. 99%
evagg 89%
evagg.content 99%
evagg.library 100%
evagg.llm 72%
evagg.prompts 100%
evagg.ref 94%
evagg.types 100%
evagg.utils 95%
Summary 95% (2180 / 2298)

Minimum allowed line rate is 50%

pass


class PromptSpec(NamedTuple):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is forward looking and doesn't necessarily belong on this pr. I've got some thoughts on pydantic -> prompt -> pydantic. There's some cleanup work ahead, but I think pydantic prompting should make everything easier to read and maintain.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defer to @theferrit32 to review since he has some experience with pydantic on the vrs-python work.

)

prompts = mock_prompt("{invalid json")
prompts = mock_prompt({})
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'd need to change how the mocks work to do better than just returning {} from the mock call to prompt_json.

self._variant_factory = variant_factory
self._variant_comparator = variant_comparator

async def _run_json_prompt(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function was wholesale duplicated, now only in one place!

async def prompt_file(
self,
user_prompt_file: str,
system_prompt: Optional[str] = None,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hardcoded the system prompt. I can't see why we'd need to support multiple for this.

{"role": "system", "content": "Extract field"},
{
"role": "system",
"content": "You are an intelligent assistant to a genetic analyst. Their task is to identify the genetic variant or variants that\nare causing a patient's disease. One approach they use to solve this problem is to seek out evidence from the academic\nliterature that supports (or refutes) the potential causal role that a given variant is playing in a patient's disease.\n\nAs part of that process, you will assist the analyst in collecting specific details about genetic variants that have\nbeen observed in the literature.\n\nAll of your responses should be provided in the form of a JSON object. These responses should never include long,\nuninterrupted sequences of whitespace characters.",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the one thing that's maybe now broken is the library relevance check. the prompt is currently written to not force json, but the system prompt is changing and the parsing functionality is changing. I think we might be better off just removing the relevance check from this repo though to avoid the maintenance.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't want to slow things down, but I'd like for you to explain this at some point at a future meeting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a good question. I think Ashley will be helpful here. From what I could tell the "rare disease relevance" prompting just flowed through a different (potentially older?) code path that didn't produce JSON and instead parsed a simple string. I think it's better to be consistent with the output format, but wasn't even sure we will need a "rare disease relevance" prompt with a bring your own paper approach.

Copy link

@larrybabb larrybabb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defering to @theferrit32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants