-
Notifications
You must be signed in to change notification settings - Fork 0
refactor: clean up default settings & duplicate json prompting. #4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: benb/remove_embeddings
Are you sure you want to change the base?
Conversation
| pass | ||
|
|
||
|
|
||
| class PromptSpec(NamedTuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is forward looking and doesn't necessarily belong on this pr. I've got some thoughts on pydantic -> prompt -> pydantic. There's some cleanup work ahead, but I think pydantic prompting should make everything easier to read and maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I defer to @theferrit32 to review since he has some experience with pydantic on the vrs-python work.
| ) | ||
|
|
||
| prompts = mock_prompt("{invalid json") | ||
| prompts = mock_prompt({}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'd need to change how the mocks work to do better than just returning {} from the mock call to prompt_json.
| self._variant_factory = variant_factory | ||
| self._variant_comparator = variant_comparator | ||
|
|
||
| async def _run_json_prompt( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function was wholesale duplicated, now only in one place!
| async def prompt_file( | ||
| self, | ||
| user_prompt_file: str, | ||
| system_prompt: Optional[str] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hardcoded the system prompt. I can't see why we'd need to support multiple for this.
| {"role": "system", "content": "Extract field"}, | ||
| { | ||
| "role": "system", | ||
| "content": "You are an intelligent assistant to a genetic analyst. Their task is to identify the genetic variant or variants that\nare causing a patient's disease. One approach they use to solve this problem is to seek out evidence from the academic\nliterature that supports (or refutes) the potential causal role that a given variant is playing in a patient's disease.\n\nAs part of that process, you will assist the analyst in collecting specific details about genetic variants that have\nbeen observed in the literature.\n\nAll of your responses should be provided in the form of a JSON object. These responses should never include long,\nuninterrupted sequences of whitespace characters.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the one thing that's maybe now broken is the library relevance check. the prompt is currently written to not force json, but the system prompt is changing and the parsing functionality is changing. I think we might be better off just removing the relevance check from this repo though to avoid the maintenance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't want to slow things down, but I'd like for you to explain this at some point at a future meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a good question. I think Ashley will be helpful here. From what I could tell the "rare disease relevance" prompting just flowed through a different (potentially older?) code path that didn't produce JSON and instead parsed a simple string. I think it's better to be consistent with the output format, but wasn't even sure we will need a "rare disease relevance" prompt with a bring your own paper approach.
larrybabb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defering to @theferrit32
Unifies duplicate json prompting and settings definitions.