mikegonzo10/FineTuneLLM

πŸš€ FineTune

A complete, beginner-friendly system for fine-tuning Ollama-compatible language models using datasets in JSON, CSV, or Excel format.

✨ Features

  • πŸ€– Model Selection: Choose from popular models (LLaMA 2, Mistral, Phi 3, etc.)
  • πŸ“ Multi-Format Support: Upload JSON, CSV, or Excel datasets
  • βš™οΈ Easy Configuration: Interactive column mapping and parameter tuning
  • πŸ”₯ Efficient Training: Uses Unsloth for fast, memory-efficient fine-tuning
  • πŸš€ Ollama Integration: Automatic model deployment to Ollama
  • πŸ–₯️ Dual Interface: Both CLI and web-based interfaces
  • πŸ§ͺ Built-in Testing: Test your models with custom prompts

πŸš€ Quick Start

Prerequisites

  1. Install Ollama (if not already installed):

    curl -fsSL https://ollama.com/install.sh | sh
    ollama serve  # Start Ollama server
  2. Clone and set up FineTune:

    git clone <your-repo-url>
    cd FineTune
    pip install -r requirements.txt

Option 1: Web Interface (Recommended for Beginners)

streamlit run finetune_app.py

Then open your browser to http://localhost:8501 and follow the guided interface!

Option 2: Command Line Interface

python finetune_cli.py

Follow the interactive prompts to:

  1. Select your dataset
  2. Configure column mappings
  3. Choose a model
  4. Start training
  5. Deploy to Ollama

πŸ“– Step-by-Step Guide

1. Prepare Your Dataset

Your dataset can be in any of these formats:

JSON Format

[
  {
    "question": "What is AI?",
    "answer": "AI stands for Artificial Intelligence..."
  },
  {
    "question": "How does ML work?", 
    "answer": "Machine Learning works by..."
  }
]

CSV Format

prompt,response
"Hello, how are you?","I'm doing well, thank you!"
"What's the weather like?","I don't have access to current weather data..."

Excel Format

Create an Excel file with columns like:

  • Task | Context | Expected_Output
  • Input | Target
  • Question | Answer
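Whatever the format, each row should reduce to an input/output pair. A minimal sketch of how such files could be normalized with pandas (the function names and the `input`/`output` keys here are illustrative, not the project's actual API; the real column mapping is configured interactively):

```python
from pathlib import Path

import pandas as pd


def load_dataset(path: str) -> pd.DataFrame:
    """Load a JSON, CSV, or Excel dataset into a DataFrame."""
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        return pd.read_json(path)
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in (".xlsx", ".xls"):
        return pd.read_excel(path)
    raise ValueError(f"Unsupported format: {suffix}")


def to_pairs(df: pd.DataFrame, input_col: str, output_col: str) -> list[dict]:
    """Map two dataset columns to input/output training pairs."""
    return [
        {"input": row[input_col], "output": row[output_col]}
        for _, row in df.iterrows()
    ]
```

The same two-column mapping works for all three formats, which is why the interface only asks you which column is the input and which is the output.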

2. Select Your Model

Choose from these pre-configured models:

  • LLaMA 2 7B - General purpose, good balance
  • Mistral 7B - Fast and efficient
  • Phi 3 Mini - Lightweight, great for beginners
  • Code Llama 7B - Specialized for code
  • Gemma 7B - Google's efficient model

3. Configure Training

Basic Settings (good defaults provided):

  • Max steps: 60 (adjust based on dataset size)
  • Learning rate: 2e-4
  • Batch size: 2 (increase if you have more GPU memory)

Advanced Settings:

  • LoRA rank: 16 (higher = more parameters, longer training)
  • Sequence length: 2048 tokens
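A hypothetical sketch of how these settings might be grouped in code; the key names below are illustrative, and the project's actual parameters live in config.yaml and src/trainer.py:

```python
# Illustrative hyperparameter bundle; real keys may differ.
training_config = {
    "max_steps": 60,         # total optimizer steps; scale with dataset size
    "learning_rate": 2e-4,   # a common starting point for LoRA fine-tuning
    "batch_size": 2,         # examples per step; raise if GPU memory allows
    "lora_rank": 16,         # adapter rank: higher = more trainable params
    "max_seq_length": 2048,  # tokens per training example
}
```

Defaults like these favor a quick first run; once the pipeline works end to end, raising `max_steps` and `lora_rank` is the usual next experiment.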

4. Train Your Model

Training will:

  • Process your dataset into the correct format
  • Load the model with LoRA adapters
  • Train efficiently using Unsloth
  • Save the fine-tuned model

Training Time Estimates:

  • Small dataset (100 examples): ~5-10 minutes
  • Medium dataset (1000 examples): ~30-60 minutes
  • Large dataset (10000 examples): ~2-5 hours

5. Deploy to Ollama

The system will:

  • Generate a Modelfile for Ollama
  • Register your model with Ollama
  • Make it available for chat/API use
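Ollama describes a model with a Modelfile built from FROM, PARAMETER, TEMPLATE, and SYSTEM directives. A generated Modelfile might look roughly like this (the file path, stop token, and system prompt are illustrative):

```
FROM ./outputs/your-model-name.gguf
PARAMETER temperature 0.7
PARAMETER stop "<|end|>"
SYSTEM "You are a helpful assistant fine-tuned on custom data."
```

Registering it is then a single command: `ollama create your-model-name -f Modelfile`.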

6. Test Your Model

# Test with CLI
python test_model.py --model your-model-name --interactive

# Test with Ollama directly
ollama run your-model-name "Hello, how are you?"

πŸ“Š Example Datasets

The project includes sample datasets in the datasets/ folder:

  • sample_qa.json - Question & Answer pairs
  • sample_chat.csv - Conversational responses
  • sample_instruction.xlsx - Task-based instructions

Use these to test the system before using your own data!

πŸ› οΈ Advanced Usage

CLI Options

# Interactive mode
python finetune_cli.py

# Test a deployed model
python test_model.py --model custom-llama2 --prompt "Hello!"

# Compare multiple models
python test_model.py --compare model1 model2 --file test_prompts.json

# Benchmark performance
python test_model.py --model custom-llama2 --benchmark

Configuration

Edit config.yaml to customize:

  • Default training parameters
  • Supported models list
  • File paths
  • Ollama settings

Custom Templates

For multi-column inputs, use templates:

"Question: {question} Context: {context}"

This combines multiple columns into a single input.
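One way to implement such a template is plain Python string formatting: each row's column values are substituted into the matching placeholders. A minimal sketch (the column names come from your dataset):

```python
template = "Question: {question} Context: {context}"

# One dataset row, keyed by column name (values are made up for illustration)
row = {
    "question": "What is LoRA?",
    "context": "LoRA adds small trainable adapters to a frozen model.",
}

# str.format replaces each {placeholder} with the matching column value
prompt = template.format(**row)
print(prompt)
# Question: What is LoRA? Context: LoRA adds small trainable adapters to a frozen model.
```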

πŸ”§ Troubleshooting

Common Issues

1. CUDA Out of Memory

  • Reduce batch size to 1
  • Enable 4-bit loading
  • Use a smaller model (Phi 3 Mini)

2. Ollama Connection Failed

  • Make sure Ollama is running: ollama serve
  • Check if port 11434 is available
  • Try restarting Ollama

3. Training Too Slow

  • Use fewer training steps
  • Increase batch size (if memory allows)
  • Try gradient accumulation
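Gradient accumulation simulates a larger batch without extra GPU memory: gradients from several small forward/backward passes are summed before each optimizer step. The batch size the optimizer effectively sees is just the product:

```python
batch_size = 2                   # what fits in GPU memory per pass
gradient_accumulation_steps = 4  # backward passes per optimizer update

# Effective batch size seen by each optimizer step
effective_batch_size = batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 8
```

This trades wall-clock time for memory: each optimizer step now takes four passes, but training stability matches that of the larger batch.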

4. Poor Model Performance

  • Increase training steps
  • Use more/better training data
  • Try different learning rates
  • Ensure data quality

Getting Help

  1. Check the example datasets for formatting
  2. Review the configuration in config.yaml
  3. Use the web interface for guided setup
  4. Test with small datasets first

πŸ“ Project Structure

FineTune/
β”œβ”€β”€ src/                     # Core modules
β”‚   β”œβ”€β”€ config.py           # Configuration management
β”‚   β”œβ”€β”€ dataset_processor.py # Dataset handling
β”‚   β”œβ”€β”€ model_manager.py    # Model loading/management
β”‚   β”œβ”€β”€ trainer.py          # Training engine (Unsloth)
β”‚   └── ollama_integration.py # Ollama deployment
β”œβ”€β”€ datasets/               # Sample datasets
β”œβ”€β”€ models/                 # Downloaded models
β”œβ”€β”€ outputs/                # Training outputs
β”œβ”€β”€ config.yaml            # Main configuration
β”œβ”€β”€ finetune_app.py        # Streamlit web interface
β”œβ”€β”€ finetune_cli.py        # Command line interface
β”œβ”€β”€ test_model.py          # Model testing script
β”œβ”€β”€ test_prompts.json      # Sample test prompts
β”œβ”€β”€ requirements.txt       # Dependencies
└── README.md              # This file

🀝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Test thoroughly
  5. Submit a pull request

πŸ“„ License

This project is open source and available under the MIT License.

πŸ™ Acknowledgments

  • Unsloth - For efficient fine-tuning
  • Ollama - For local model deployment
  • Hugging Face - For model ecosystem
  • Streamlit - For the web interface

Happy Fine-tuning! πŸŽ‰

Need help? Check the examples, try the web interface, or create an issue on GitHub.
