This repository contains a Jupyter Notebook (`LLM_Execution.ipynb`) designed to interact with large language models (LLMs) locally using the Ollama platform.
The notebook provides a simple Python script that:
- Verifies Ollama Server: Checks if the Ollama server is running locally.
- Interactive Prompting: Allows the user to enter prompts in a loop.
- Streaming Responses: Sends the prompts to a specified Ollama model and displays the response in real-time as it's generated (streaming).
- Model Selection: Uses the `gemma3:4b-it-qat` model by default, but this can be easily changed in the script.
- Error Handling: Includes basic error handling for common issues like the Ollama server not running or the specified model not being available locally.
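For orientation before setting things up, here is a minimal sketch of the kind of loop the notebook implements. It is not the notebook's exact code: it assumes the standard Ollama HTTP API on the default port 11434, uses the `requests` library, and its prompt wording is illustrative only.

```python
import json
import os

import requests

OLLAMA_URL = "http://localhost:11434"
# Fall back to the default model from this README if MODEL_NAME is not set.
MODEL_NAME = os.getenv("MODEL_NAME", "gemma3:4b-it-qat")


def server_is_running() -> bool:
    """Return True if the local Ollama server answers on its root endpoint."""
    try:
        return requests.get(OLLAMA_URL, timeout=2).ok
    except requests.ConnectionError:
        return False


def stream_response(prompt: str) -> None:
    """Send a prompt to /api/generate and print the reply as it streams in."""
    payload = {"model": MODEL_NAME, "prompt": prompt, "stream": True}
    with requests.post(f"{OLLAMA_URL}/api/generate", json=payload, stream=True) as resp:
        resp.raise_for_status()
        # Streaming responses arrive as newline-delimited JSON chunks.
        for line in resp.iter_lines():
            if line:
                chunk = json.loads(line)
                print(chunk.get("response", ""), end="", flush=True)
    print()


if __name__ == "__main__":
    if not server_is_running():
        raise SystemExit("Ollama server is not running; start it with `ollama serve`.")
    while True:
        prompt = input("Enter your prompt (or 'exit' to quit): ")
        if prompt.strip().lower() == "exit":
            break
        try:
            stream_response(prompt)
        except requests.HTTPError as err:
            # A 404 here usually means the model has not been pulled locally yet.
            print(f"Request failed: {err}")
```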
- Install Ollama: Follow the official instructions at ollama.com or use the install script:

  ```sh
  curl -fsSL https://ollama.com/install.sh | sh
  ```
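  To confirm the install succeeded, checking the CLI version is a quick sanity test:

  ```sh
  # Prints the installed Ollama version if the CLI is on your PATH.
  ollama --version
  ```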
- Run Ollama Server: Before running the notebook, ensure the Ollama server is running in your terminal:

  ```sh
  ollama serve
  ```

  Keep this terminal window open while using the notebook.
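  If you want to double-check that the server is reachable, you can query it directly (this assumes Ollama's default port, 11434):

  ```sh
  # Expect a short confirmation message (e.g. "Ollama is running") if the server is up.
  curl http://localhost:11434
  ```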
- Set the MODEL_NAME Variable: Before trying to pull the model, ensure that the `MODEL_NAME` environment variable is set:

  ```sh
  echo $MODEL_NAME
  ```

  If the variable is empty, source the project's `.env` file in the same terminal and check again:

  ```sh
  . .env
  ```
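  The `.env` file is expected to define the model tag. A minimal sketch of what it might contain, assuming the default model from this README, is shown below; the project's actual `.env` may differ:

  ```sh
  # Hypothetical .env contents; adjust the tag if you want a different model.
  export MODEL_NAME=gemma3:4b-it-qat
  ```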
- Pull the LLM Model: The notebook is configured to use `gemma3:4b-it-qat`. Download it using the Ollama CLI:

  ```sh
  ollama pull $MODEL_NAME
  ```

  Note: This model requires approximately 3.5 GiB of available system memory. You can change the `MODEL_NAME` variable in the second code cell of the notebook to use a different model available through Ollama (e.g. `llama3`, `mistral`, `phi3`). Make sure to pull the desired model first using `ollama pull <model_name>`.
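  The second code cell presumably defines the model name along the lines of the sketch below; the actual cell may differ, but the idea is to read `MODEL_NAME` from the environment and fall back to the default tag:

  ```python
  import os

  # Hypothetical version of the notebook's second cell: pick up MODEL_NAME from the
  # environment (e.g. the sourced .env file) and fall back to the default model.
  MODEL_NAME = os.getenv("MODEL_NAME", "gemma3:4b-it-qat")
  print(f"Using model: {MODEL_NAME}")
  ```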
- Run the Notebook:
  - Open the `LLM_Execution.ipynb` file.
  - Run the code cells sequentially.
  - Type your prompt and press Enter. The model's response will stream below.
  - Type `exit` and press Enter to exit the script.
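An interactive session might look roughly like this; the exact prompt text printed by the notebook may differ, and the model's answer is streamed token by token:

```text
Enter your prompt (or 'exit' to quit): Summarize what Ollama does in one sentence.
<the model's answer streams here as it is generated>
Enter your prompt (or 'exit' to quit): exit
```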