A modular Rust library for Retrieval-Augmented Generation (RAG) with Ollama integration
GenRAGrs is a production-ready, modular RAG library designed for building intelligent applications that can understand and query your documents. It combines vector similarity search with large language models to provide contextually accurate responses based on your data.
- 🚀 Ollama Integration: Seamless integration with local Ollama models
- 🧩 Modular Architecture: Pluggable components for embeddings, storage, retrieval, and chat
- 📚 Multi-format Support: Text files, Markdown, and code files
- 💬 Smart Chunking: Intelligent document splitting with configurable overlap
- 🔍 Advanced Retrieval: Semantic search with optional reranking and hybrid retrieval
- 🎯 Flexible Prompting: Customizable prompt templates for different use cases
- ⚡ Async Performance: Built with Tokio for high-performance async operations
- 🛠️ CLI Tool: Ready-to-use command-line interface
```bash
# Install Rust (if not already installed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source ~/.cargo/env
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh
# Pull required models
ollama pull nomic-embed-text # For embeddings
ollama pull qwen3:0.6b # For chat (default model)
# Clone and build GenRAGrs
git clone <your-repo-url>
cd genragrs
cargo build --release
```

```bash
# Process documents and start interactive chat
cargo run --example rag_cli -- --rag-files ./README.md --rag-files ./src --recursive
# With specific file types
cargo run --example rag_cli -- --rag-files ./docs --recursive --extensions md,txt,rs
# Using different models
cargo run --example rag_cli -- --rag-files ./docs --chat-model llama2:7b --embed-model nomic-embed-text
# Show help
cargo run --example rag_cli -- --help
```

```text
🚀 Welcome to RAG Chat!
💬 Ask questions about your documents
📋 Commands: 'exit'/'quit' to exit, 'help' for help, 'context <question>' to see sources
======================================================================
💬 You: What is GenRAGrs?
🤖 Assistant: GenRAGrs is a modular Rust library for Retrieval-Augmented Generation (RAG) with Ollama integration. It's designed to be production-ready and allows you to build intelligent applications that can understand and query your documents...
💬 You: context What is GenRAGrs?
📚 Context for: What is GenRAGrs?
==================================================
📄 Source 1: README.md
🎯 Score: 0.923
📝 Content: GenRAGrs is a modular Rust library for Retrieval-Augmented Generation (RAG) with Ollama integration. It combines vector similarity search with large language models...
------------------------------
💬 You: exit
👋 Goodbye!
```
GenRAGrs follows a modular design with these key components:
```text
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Documents │───▶│ Embeddings │───▶│ Vector Store │
│ │ │ │ │ │
│ • TextLoader │ │ • OllamaEmbedder│ │ • InMemoryStore │
│ • MarkdownLoader│ │ • Batch Support │ │ • Cosine Sim │
│ • TextSplitter │ │ │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Chat │◀───│ Retriever │◀───│ │
│ │ │ │ │ │
│ • SimpleChat │ │ • SimpleRetriever│ │ │
│ • Orchestrator │ │ • HybridRetriever│ │ │
│ • Sessions │ │ • Reranking │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│
▼
┌─────────────────┐ ┌─────────────────┐
│ Models │ │ Prompts │
│ │ │ │
│ • OllamaModel │ │ • Templates │
│ • ModelConfig │ │ • QA, Code, etc │
│ • MockModel │ │ • Custom vars │
└─────────────────┘ └─────────────────┘
```
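For finer control than the `RagSystem` facade shown in the next example, the same components can be wired together by hand. The following is a minimal sketch based on the constructor signatures listed in the API reference further down; exact ownership and trait-object wrapping (for example, whether `SimpleRetriever::new` takes owned values or `Arc`s) may differ in the actual crate.

```rust
use std::sync::Arc;
use genragrs::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Embedder and vector store (constructors as documented below)
    let embedder = OllamaEmbedder::new(None, Some("nomic-embed-text".to_string()));
    let store = InMemoryVectorStore::new();

    // Semantic retriever over the store; `None` keeps the default RetrievalConfig
    let retriever = SimpleRetriever::new(embedder, store, None);

    // Ollama chat model with an explicit model name
    let model = OllamaModel::new(None, None).with_model("qwen3:0.6b".to_string());

    // SimpleChat takes trait objects for the retriever and the model
    let chat = SimpleChat::new(Arc::new(retriever), Arc::new(model), None);

    // No documents are indexed yet; see the loader and vector store sections below
    let answer = chat.ask("What does the architecture look like?").await?;
    println!("{}", answer);

    Ok(())
}
```

Manual wiring like this is mainly useful when swapping in custom `Retriever` or `VectorStore` implementations (see the extension traits later in this document); for most applications the `RagSystem` facade is enough.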
```rust
use genragrs::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize with default configuration
    let mut rag = RagSystem::default().await?;

    // Add documents
    rag.add_text_document("Rust is a systems programming language.").await?;
    rag.add_text_file("path/to/document.txt").await?;
    rag.add_markdown_file("README.md").await?;

    // Query the system
    let response = rag.query("What is Rust?").await?;
    println!("Answer: {}", response);

    // Start a conversation
    let response = rag.chat("Tell me more about systems programming").await?;
    println!("Response: {}", response);

    Ok(())
}
```

```rust
use genragrs::prelude::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = RagConfig::new()
        .with_ollama_url("http://localhost:11434".to_string())
        .with_chat_model("llama2:7b".to_string())
        .with_embedding_model("nomic-embed-text".to_string())
        .with_chunk_settings(512, 100)
        .with_prompt_template(PromptTemplates::code_assistant_template());

    let mut rag = RagSystem::new(config).await?;
    // Rest of your code...
    Ok(())
}
```

`RagSystem` is the main entry point for the RAG functionality:

```rust
impl RagSystem {
    // Construction
    pub async fn new(config: RagConfig) -> Result<Self>;
    pub async fn default() -> Result<Self>;

    // Document management
    pub async fn add_text_document(&mut self, content: &str) -> Result<()>;
    pub async fn add_text_file(&mut self, file_path: &str) -> Result<()>;
    pub async fn add_markdown_file(&mut self, file_path: &str) -> Result<()>;
    pub async fn clear(&mut self) -> Result<()>;
    pub async fn has_documents(&self) -> bool;

    // Querying
    pub async fn query(&self, question: &str) -> Result<String>;
    pub async fn chat(&self, message: &str) -> Result<String>;
    pub async fn get_context(&self, query: &str) -> Result<Vec<SearchResult>>;
}
```
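Beyond `query` and `chat`, the inspection helpers above can be used to check what is indexed and to look at the retrieved sources directly. A small sketch, assuming `rag` is a mutable, initialized `RagSystem` with documents added; since the fields of `SearchResult` are not spelled out here, only the result count is printed:

```rust
if rag.has_documents().await {
    // Retrieve the raw context that would back an answer
    let sources = rag.get_context("What is GenRAGrs?").await?;
    println!("Retrieved {} source chunks", sources.len());
}

// Drop everything from the vector store and start over
rag.clear().await?;
```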
Configuration for the RAG system:

```rust
#[derive(Debug, Clone)]
pub struct RagConfig {
    pub ollama_base_url: String,
    pub embedding_model: String,
    pub chat_model: String,
    pub chunk_size: usize,
    pub chunk_overlap: usize,
    pub retrieval_config: RetrievalConfig,
    pub model_config: ModelConfig,
    pub prompt_template: PromptTemplate,
}

impl RagConfig {
    pub fn new() -> Self;
    pub fn with_ollama_url(self, url: String) -> Self;
    pub fn with_embedding_model(self, model: String) -> Self;
    pub fn with_chat_model(self, model: String) -> Self;
    pub fn with_chunk_settings(self, size: usize, overlap: usize) -> Self;
    pub fn with_prompt_template(self, template: PromptTemplate) -> Self;
    pub fn with_retrieval_config(self, config: RetrievalConfig) -> Self;
}
```

`TextSplitter` provides intelligent text chunking with overlap support:

```rust
impl TextSplitter {
    pub fn new(chunk_size: usize, chunk_overlap: usize) -> Self;
    pub fn with_separators(self, separators: Vec<String>) -> Self;
    pub fn split_text(&self, text: &str) -> Vec<String>;
}
```
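A quick sketch of the splitter on its own; the separator list here is illustrative, not necessarily the crate's default:

```rust
// 1000-character chunks with 200 characters of overlap
let splitter = TextSplitter::new(1000, 200)
    .with_separators(vec!["\n\n".to_string(), "\n".to_string(), " ".to_string()]);

let long_document_text = "GenRAGrs splits documents into overlapping chunks. ".repeat(100);
let chunks = splitter.split_text(&long_document_text);
println!("Produced {} chunks", chunks.len());
```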
```rust
// Text files
let loader = TextFileLoader::new(Some(text_splitter));
let documents = loader.load("path/to/file.txt").await?;

// Markdown files with header extraction
let loader = MarkdownLoader::new(Some(text_splitter));
let documents = loader.load("path/to/README.md").await?;
```

```rust
impl OllamaEmbedder {
    pub fn new(base_url: Option<String>, model: Option<String>) -> Self;
    pub fn with_model(self, model: String) -> Self;
}

#[async_trait]
impl Embedder for OllamaEmbedder {
    async fn embed(&self, text: &str) -> Result<Embedding>;
    async fn embed_batch(&self, texts: &[String]) -> Result<Vec<Embedding>>;
}
```
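A minimal sketch of single and batch embedding using the trait methods above (run inside an async context):

```rust
let embedder = OllamaEmbedder::new(None, None).with_model("nomic-embed-text".to_string());

// Single text
let _embedding = embedder.embed("Rust is a systems programming language.").await?;

// Batch of texts in one call
let texts = vec!["first chunk".to_string(), "second chunk".to_string()];
let embeddings = embedder.embed_batch(&texts).await?;
assert_eq!(embeddings.len(), texts.len());
```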
The built-in in-memory vector store implements the `VectorStore` trait:

```rust
impl InMemoryVectorStore {
    pub fn new() -> Self;
    pub fn is_empty(&self) -> bool;
    pub fn len(&self) -> usize;
}

#[async_trait]
impl VectorStore for InMemoryVectorStore {
    async fn add_document(&mut self, document: Document) -> Result<()>;
    async fn add_documents(&mut self, documents: Vec<Document>) -> Result<()>;
    async fn search(&self, query_embedding: &Embedding, top_k: usize) -> Result<Vec<SearchResult>>;
    async fn get_document(&self, id: &str) -> Result<Option<Document>>;
    async fn delete_document(&mut self, id: &str) -> Result<bool>;
    async fn clear(&mut self) -> Result<()>;
}
```

```rust
#[derive(Debug, Clone)]
pub struct RetrievalConfig {
    pub top_k: usize,                 // Number of documents to retrieve (default: 5)
    pub score_threshold: Option<f32>, // Minimum similarity score
    pub rerank: bool,                 // Enable reranking (default: false)
}
```
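Retrieval behaviour is tuned by building a `RetrievalConfig` and handing it to the system via `RagConfig::with_retrieval_config`; a short sketch:

```rust
let retrieval_config = RetrievalConfig {
    top_k: 8,
    score_threshold: Some(0.6),
    rerank: true,
};

let config = RagConfig::new().with_retrieval_config(retrieval_config);
let mut rag = RagSystem::new(config).await?;
```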
```rust
// Simple semantic retrieval
let retriever = SimpleRetriever::new(embedder, vector_store, Some(config));

// Hybrid retrieval (semantic + keyword)
let hybrid_retriever = HybridRetriever::new(semantic_retriever, Some(0.3));
```

```rust
// Q&A template (default)
let template = PromptTemplates::qa_template();
// Code assistant
let template = PromptTemplates::code_assistant_template();
// Research assistant
let template = PromptTemplates::research_template();
// Summarization
let template = PromptTemplates::summarization_template();
```

```rust
let template = PromptTemplate::new()
    .with_system_prompt("You are a helpful assistant.".to_string())
    .with_user_template("Context: {context}\n\nQuestion: {question}\n\nAnswer:".to_string())
    .with_context_template("Source: {source}\n{content}".to_string());
```

```rust
let builder = PromptBuilder::new(template)
    .set_variable("domain".to_string(), "medical".to_string())
    .set_variable("style".to_string(), "formal".to_string());

let messages = builder.build("What is diabetes?", &search_results)?;
```

```rust
impl OllamaModel {
    pub fn new(base_url: Option<String>, default_config: Option<ModelConfig>) -> Self;
    pub fn with_model(self, model_name: String) -> Self;
}

#[async_trait]
impl LanguageModel for OllamaModel {
    async fn generate(&self, messages: ChatMessages) -> Result<ModelResponse>;
    async fn generate_with_config(&self, messages: ChatMessages, config: &ModelConfig) -> Result<ModelResponse>;
}
```
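Putting the prompt and model pieces together, a hedged sketch that assumes the `messages` value built by the `PromptBuilder` example above is the same `ChatMessages` type that the model accepts; the fields of `ModelResponse` are not documented here, so the response is only bound:

```rust
let model = OllamaModel::new(None, None).with_model("qwen3:0.6b".to_string());

// Override generation settings for this single call
let call_config = ModelConfig::new("qwen3:0.6b".to_string()).with_temperature(0.2);
let _response = model.generate_with_config(messages, &call_config).await?;
```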
```rust
#[derive(Debug, Clone)]
pub struct ModelConfig {
    pub model_name: String,
    pub temperature: Option<f32>,
    pub max_tokens: Option<u32>,
    pub top_p: Option<f32>,
    pub stream: bool,
}

impl ModelConfig {
    pub fn new(model_name: String) -> Self;
    pub fn with_temperature(self, temperature: f32) -> Self;
    pub fn with_max_tokens(self, max_tokens: u32) -> Self;
    pub fn with_streaming(self, stream: bool) -> Self;
}
```
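A short sketch of building a `ModelConfig` with the builder methods above and installing it as the model's default at construction time, per the `OllamaModel::new` signature earlier:

```rust
let default_config = ModelConfig::new("llama2:7b".to_string())
    .with_temperature(0.3)
    .with_max_tokens(1024)
    .with_streaming(false);

// Used as the default for every generate() call on this model
let model = OllamaModel::new(None, Some(default_config));
```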
```rust
impl SimpleChat {
    pub fn new(retriever: Arc<dyn Retriever>, model: Arc<dyn LanguageModel>, config: Option<ChatConfig>) -> Self;
    pub async fn ask(&self, question: &str) -> Result<String>;
    pub async fn chat(&self, message: &str) -> Result<String>;
    pub async fn get_context(&self, query: &str) -> Result<Vec<SearchResult>>;
}
```

For advanced chat management with sessions:

```rust
impl ChatOrchestrator {
    pub fn new(retriever: Arc<dyn Retriever>, model: Arc<dyn LanguageModel>, config: Option<ChatConfig>) -> Self;
    pub async fn create_session(&self) -> String;
    pub async fn chat(&self, session_id: &str, message: &str) -> Result<ModelResponse>;
    pub async fn query(&self, question: &str) -> Result<ModelResponse>;
    pub async fn get_session(&self, session_id: &str) -> Option<ChatSession>;
    pub async fn delete_session(&self, session_id: &str) -> bool;
}
```
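A sketch of session-based chat, assuming `retriever` and `model` are already wrapped in `Arc`s as in the `SimpleChat` signature above; the fields of the returned `ModelResponse` are not documented here, so the replies are only bound:

```rust
let orchestrator = ChatOrchestrator::new(retriever, model, None);

// Each session keeps its own conversation history
let session_id = orchestrator.create_session().await;
let _first = orchestrator.chat(&session_id, "What does GenRAGrs do?").await?;
let _follow_up = orchestrator.chat(&session_id, "How is chunking configured?").await?;

// One-off question without a session
let _answer = orchestrator.query("List the supported file types").await?;

// Clean up when the conversation is over
orchestrator.delete_session(&session_id).await;
```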
To plug in your own retrieval logic, implement the `Retriever` trait:

```rust
use async_trait::async_trait;

struct CustomRetriever {
    // Your custom fields
}

#[async_trait]
impl Retriever for CustomRetriever {
    async fn retrieve(&self, query: &str) -> Result<Vec<SearchResult>> {
        // Your custom retrieval logic
        todo!()
    }

    async fn retrieve_with_config(&self, query: &str, config: &RetrievalConfig) -> Result<Vec<SearchResult>> {
        // Your custom retrieval with config
        todo!()
    }
}
```

```rust
use async_trait::async_trait;
struct CustomVectorStore {
    // Your custom storage implementation
}

#[async_trait]
impl VectorStore for CustomVectorStore {
    async fn add_document(&mut self, document: Document) -> Result<()> {
        // Your storage logic
        todo!()
    }

    async fn search(&self, query_embedding: &Embedding, top_k: usize) -> Result<Vec<SearchResult>> {
        // Your search logic
        todo!()
    }

    // ... implement other required methods
}
```

```rust
use genragrs::prelude::*;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut rag = RagSystem::default().await?;

    // Process multiple files
    let files = vec!["doc1.txt", "doc2.md", "doc3.py"];
    for file in files {
        if file.ends_with(".md") {
            rag.add_markdown_file(file).await?;
        } else {
            rag.add_text_file(file).await?;
        }
    }

    // Batch queries
    let questions = vec![
        "What is the main topic?",
        "How does this work?",
        "What are the key features?",
    ];
    for question in questions {
        let answer = rag.query(question).await?;
        println!("Q: {}\nA: {}\n", question, answer);
    }

    Ok(())
}
```

```bash
# Basic usage - process files and start chat
rag-cli --rag-files <PATH> [OPTIONS]
# Process only without chat
rag-cli --rag-files <PATH> process-only
# Show configuration
rag-cli config --show
```

Options:

```text
  -r, --rag-files <PATH>     RAG files or directories to process
  -R, --recursive            Recursively process directories
  -e, --extensions <LIST>    File extensions to include (comma-separated)
      --ollama-url <URL>     Ollama base URL [default: http://localhost:11434]
      --embed-model <MODEL>  Embedding model [default: nomic-embed-text]
      --chat-model <MODEL>   Chat model [default: qwen3:0.6b]
  -v, --verbose              Enable verbose logging
  -h, --help                 Print help
```

```bash
# Process current directory with common file types
rag-cli --rag-files . --recursive
# Process specific files
rag-cli --rag-files README.md --rag-files src/lib.rs
# Process with custom extensions
rag-cli --rag-files ./docs --recursive --extensions md,txt,rst
# Use different models
rag-cli --rag-files ./code --chat-model llama2:7b --embed-model nomic-embed-text
# Verbose logging
rag-cli --rag-files ./docs --verbose
```

When in interactive chat mode:
- `exit` or `quit` - Exit chat
- `help` - Show help
- `context <question>` - Show source documents for a question
- Any other text - Send message to the assistant
| Extension | Description | Processor | Features |
|---|---|---|---|
| `.txt` | Plain text | TextFileLoader | Basic chunking |
| `.md` | Markdown | MarkdownLoader | Header extraction, metadata |
| `.py` | Python | TextFileLoader | Syntax-aware chunking |
| `.rs` | Rust | TextFileLoader | Syntax-aware chunking |
| `.js` | JavaScript | TextFileLoader | Syntax-aware chunking |
| `.ts` | TypeScript | TextFileLoader | Syntax-aware chunking |
| `.java` | Java | TextFileLoader | Syntax-aware chunking |
| `.cpp`, `.c`, `.h` | C/C++ | TextFileLoader | Syntax-aware chunking |
```rust
// For code files - smaller chunks for precise retrieval
let config = RagConfig::new().with_chunk_settings(512, 100);
// For documentation - larger chunks for context
let config = RagConfig::new().with_chunk_settings(1500, 300);
// For mixed content - balanced approach
let config = RagConfig::new().with_chunk_settings(1000, 200); // default
```

```rust
let retrieval_config = RetrievalConfig {
    top_k: 10,                  // More documents for complex queries
    score_threshold: Some(0.7), // Filter low-relevance results
    rerank: true,               // Enable reranking for better quality
};
```

Fast Models (Low Resource)
- `qwen3:0.6b` (default) - Fast, good for simple Q&A
- `phi:2.7b` - Balanced speed and quality
Quality Models (More Resources)
- `llama2:7b` - High quality responses
- `mistral:7b` - Good for technical content
- `codellama:7b` - Specialized for code
Embedding Models
- `nomic-embed-text` (default) - General purpose
- `all-minilm` - Fast, smaller embeddings
```rust
use genragrs::prelude::*;

match rag.query("test").await {
    Ok(response) => println!("Response: {}", response),
    Err(RagError::Http(e)) => eprintln!("Network error: {}", e),
    Err(RagError::Embedding(e)) => eprintln!("Embedding error: {}", e),
    Err(RagError::Model(e)) => eprintln!("Model error: {}", e),
    Err(e) => eprintln!("Other error: {}", e),
}
```

"Failed to initialize RAG system"

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags
# Start Ollama if needed
ollama serve
```

"Embedding error"

```bash
# Pull the embedding model
ollama pull nomic-embed-text
# Check available models
ollama list
```

"Model error"

```bash
# Pull the chat model
ollama pull qwen3:0.6b
# Use a different model
rag-cli --chat-model llama2:7b --rag-files ./docs
```

Low Quality Responses

- Try enabling reranking, and use `--verbose` to see retrieval scores
- Adjust chunk size for your content type
- Use a higher quality model such as `llama2:7b`
- Increase `top_k` for complex queries
```bash
# Enable debug logging
RUST_LOG=debug cargo run --example rag_cli -- --rag-files ./docs --verbose
```

- Fork the repository
- Create a feature branch: `git checkout -b feature/new-feature`
- Make your changes and add tests
- Run tests: `cargo test`
- Run clippy: `cargo clippy`
- Format code: `cargo fmt`
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.