A FastAPI service to interact with Large Language Models (LLMs). It exposes endpoints to configure the target LLM (endpoint, model, API key) and to send prompts.
- Create and activate a virtual environment:

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Start the server:

  ```bash
  uvicorn main:app --reload
  ```

The server runs at http://localhost:8000.
- `GET /` — Welcome message
- `GET /health` — Health check
- `POST /configure` — Update endpoint, model, and api_key (all optional)
- `POST /configure/endpoint` — Update only endpoint
- `POST /configure/model` — Update only model
- `POST /configure/key` — Update only api_key
- `GET /configure/get` — Fetch current endpoint, model, and api_key
- `POST /ask` — Send a prompt to the configured LLM and get the response
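The partial-update behavior of `POST /configure` (all fields optional, only the supplied ones change) can be sketched in plain Python. The field names follow the endpoint descriptions above; the default values below are illustrative assumptions, not the actual defaults in `main.py`:

```python
# In-memory configuration store; the default values here are
# illustrative assumptions, not the real defaults in main.py.
DEFAULT_CONFIG = {
    "endpoint": "http://127.0.0.1:8011/v1/chat/completions",
    "model": "gpt-4.1",
    "api_key": "",
}

config = dict(DEFAULT_CONFIG)

def configure(endpoint=None, model=None, api_key=None):
    """Apply a partial update: only non-None fields overwrite the store."""
    updates = {"endpoint": endpoint, "model": model, "api_key": api_key}
    for key, value in updates.items():
        if value is not None:
            config[key] = value
    # A snapshot like this is what GET /configure/get would return.
    return dict(config)
```

Because the store is just a module-level dict, restarting the process recreates it from the defaults, consistent with the in-memory configuration note below.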
Update endpoint/model/api_key:

```bash
curl -X POST http://localhost:8000/configure \
  -H "Content-Type: application/json" \
  -d '{"endpoint": "http://127.0.0.1:8011/v1/chat/completions", "model": "gpt-4.1", "api_key": "your-key"}'
```

Ask the LLM:
```bash
curl -X POST http://localhost:8000/ask \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'
```

- The configuration is stored in memory; restarting the server resets all values to the defaults defined in `main.py`.
- Ensure your editor uses the project virtualenv to resolve imports and suppress linter warnings.
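The `/ask` endpoint presumably forwards the prompt to the configured endpoint as an OpenAI-style chat-completions request, since the example endpoint ends in `/v1/chat/completions`. A minimal sketch of building that request body and headers; the exact payload shape used by `main.py` is an assumption:

```python
def build_chat_payload(prompt: str, model: str) -> dict:
    """Wrap a plain prompt in an OpenAI-style chat-completions body.

    Assumed shape; main.py may add fields such as temperature.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def auth_headers(api_key: str) -> dict:
    """Bearer-token headers used by OpenAI-compatible servers."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
```

Posting `build_chat_payload(...)` with `auth_headers(...)` to the stored endpoint, then extracting the assistant message from the response, is the typical flow for a service like this.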