
Have the basics of ollama #4

Open
DanielMarchand wants to merge 39 commits into drudilorenzo:fix-and-improve from DanielMarchand:fix-and-improve_add-ollama

Conversation


@DanielMarchand DanielMarchand commented Jul 14, 2024

The basics work. The problem is that the codebase is not well designed to handle custom prompting that depends on the model. For example, wake-up dates require longer token limits with the llama3 models than with the OpenAI ones. I also had to switch from the system role to the assistant role in the chat completion to get better answers, and there are other subtle differences in how the prompts need to be set up. It would be nice to discuss an overall architecture for this. Otherwise, I think this is a really cool direction: it lets people with decent GPUs (tested on a 3080; I'm sure a 4090 would be even more special) get nice results at no cost.
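One way to centralize those per-model differences could be a small profile table that the prompt-building code looks up by model name. This is only an illustrative sketch of the architecture being proposed, not code from this PR: the class, the profile values, and `build_messages` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    """Hypothetical per-model prompt settings (illustrative values only)."""
    chat_role: str   # role used for the instruction message ("system" vs "assistant")
    max_tokens: int  # token budget; e.g. wake-up-date prompts need more on llama3

# Example profiles; the numbers here are placeholders, not tuned values.
PROFILES = {
    "gpt-3.5-turbo": ModelProfile(chat_role="system", max_tokens=50),
    "llama3":        ModelProfile(chat_role="assistant", max_tokens=200),
}

def build_messages(model_name: str, instruction: str, user_prompt: str):
    """Return (messages, max_tokens) adapted to the target model."""
    profile = PROFILES.get(model_name, PROFILES["gpt-3.5-turbo"])
    messages = [
        {"role": profile.chat_role, "content": instruction},
        {"role": "user", "content": user_prompt},
    ]
    return messages, profile.max_tokens
```

With something like this, the agent code would call `build_messages` once instead of scattering role names and token limits across every prompt site, which is roughly the design question raised above.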

This is heavily based on joonspk-research#155 by ketsapiwiq. I did some aspects differently, but much of the logic is the same.

chowington referenced this pull request in crcresearch/agentic_collab Sep 30, 2024
Support vLLM on EC2 instances
