A comprehensive guide for connecting locally deployed Large Language Models (LLMs) with local datasets using the Model Context Protocol (MCP).
This repository provides step-by-step instructions for:
- Setting up a local development environment with Conda
- Installing and configuring llama-cpp-python
- Running local language models (demonstrated with Microsoft Phi-4)
- Connecting models with local data sources via MCP (coming soon)
Prerequisites:
- Ubuntu 22.04+ (tested on 22.04.5 LTS)
- 8GB+ RAM (16GB recommended for larger models)
- Internet connection for initial setup
- Basic familiarity with command line operations
Download Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Install Miniconda:
bash Miniconda3-latest-Linux-x86_64.sh
Follow the interactive prompts. Accept the license and default installation location unless you have specific requirements.
Reload your shell configuration:
source ~/.bashrc
Create a dedicated conda environment:
conda create -n llama python=3.10
Activate the environment:
conda activate llama
Add conda-forge channel for better package availability:
conda config --add channels conda-forge
Verify your setup:
conda env list
conda config --show channels
Install llama-cpp-python:
conda install llama-cpp-python
Note: If you have a CUDA-capable GPU, consider installing the CUDA-enabled build for better performance:
CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python
(older llama-cpp-python releases used -DLLAMA_CUBLAS=on instead)
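To confirm the installation before moving on, a quick check from Python is enough (recent llama-cpp-python releases expose a __version__ attribute; if yours does not, a clean import with no error is already a good sign):

import llama_cpp
# A successful import confirms the package and its native library are installed.
print("llama-cpp-python version:", llama_cpp.__version__)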
Download a model (Microsoft Phi-4 example):
mkdir -p models
cd models
wget https://huggingface.co/microsoft/phi-4-gguf/resolve/main/phi-4-q4.gguf
cd ..
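As an alternative to wget, the download can be scripted. This is a minimal sketch assuming the huggingface_hub package is installed (pip install huggingface_hub); the repository and file name are reused from the URL above:

from huggingface_hub import hf_hub_download

# Download phi-4-q4.gguf into ./models
path = hf_hub_download(
    repo_id="microsoft/phi-4-gguf",
    filename="phi-4-q4.gguf",
    local_dir="models",
)
print("Model saved to:", path)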
Create a file called test_model.py:
from llama_cpp import Llama
# Initialize the model
model = Llama(
    model_path="./models/phi-4-q4.gguf",
    n_ctx=2048,      # Context window
    n_threads=4,     # Number of CPU threads
    verbose=False
)
# Generate text
prompt = "Q: Explain the de Sitter thermodynamics using the Painlevé-Gullstrand (PG) coordinates"
output = model(
    prompt,
    max_tokens=1000,
    stop=["Q:", "\n\n"],
    echo=False,
    temperature=0.7
)
print("Response:")
print(output['choices'][0]['text'])

Run the script:
python test_model.py

Interactive chat example:
from llama_cpp import Llama
model = Llama(model_path="./models/phi-4-q4.gguf", verbose=False)
print("Chat with Phi-4 (type 'quit' to exit)")
print("-" * 40)
while True:
    user_input = input("You: ")
    if user_input.lower() == 'quit':
        break
    response = model(f"Human: {user_input}\nAssistant:",
                     max_tokens=500,
                     stop=["Human:", "\n\n"],
                     temperature=0.7)
print(f"Assistant: {response['choices'][0]['text'].strip()}")- Memory Usage: Quantized models (Q4, Q8) balance performance and memory usage
Performance tips:
- Memory Usage: Quantized models (Q4, Q8) balance performance and memory usage
- Context Length: Adjust n_ctx based on your use case and available RAM
- Threading: Set n_threads to your CPU core count for optimal performance
- GPU Acceleration: Use CUDA or Metal builds for significant speed improvements (see the sketch after this list)
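Putting these tips together, here is a sketch of a tuned initialization. The values are illustrative, not recommendations, and n_gpu_layers only takes effect when llama-cpp-python was built with CUDA or Metal support:

import os
from llama_cpp import Llama

model = Llama(
    model_path="./models/phi-4-q4.gguf",
    n_ctx=4096,                   # larger context window; costs more RAM
    n_threads=os.cpu_count(),     # match the number of CPU cores
    n_gpu_layers=-1,              # offload all layers on CUDA/Metal builds
    verbose=False,
)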
Troubleshooting:

Installation fails on conda install:
# Try pip installation instead
pip install llama-cpp-python

Model loading errors:
- Verify the model file path is correct
- Ensure sufficient RAM is available
- Check model file integrity with ls -la models/ (a quick check script follows this list)
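If the model still refuses to load, a short Python check can tell you whether the problem is the path, a truncated download, or the loader itself (illustrative sketch):

import os
from llama_cpp import Llama

path = "./models/phi-4-q4.gguf"
if not os.path.isfile(path):
    raise SystemExit(f"Model file not found: {path}")

# A truncated download is usually far smaller than the published file size.
print(f"File size: {os.path.getsize(path) / 1e9:.2f} GB")

try:
    model = Llama(model_path=path, n_ctx=512, verbose=True)  # verbose=True prints loader diagnostics
except Exception as exc:
    raise SystemExit(f"Loader error: {exc}")
print("Model loaded successfully")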
Slow inference:
- Reduce model size (try Q2_K or Q4_K variants)
- Increase the n_threads parameter
- Consider GPU acceleration
Roadmap:
- Basic local LLM setup
- Model inference examples
- MCP integration for local data sources (see the preview sketch after this list)
- Advanced prompting techniques
- Performance optimization guide
- Docker containerization
- Web interface development
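MCP integration is still on the roadmap; as a preview, the sketch below shows roughly what a minimal MCP server exposing local files could look like with the official Python SDK (the mcp package's FastMCP helper). The server name and the read_local_file tool are illustrative choices, not code from this repository:

from mcp.server.fastmcp import FastMCP

# Hypothetical minimal MCP server that exposes local text files as a tool.
mcp = FastMCP("local-data")

@mcp.tool()
def read_local_file(path: str) -> str:
    """Return the contents of a local text file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

if __name__ == "__main__":
    mcp.run()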
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments:
- llama.cpp for the efficient C++ implementation
- Microsoft for the Phi-4 model
- Hugging Face for model hosting