
ragit

A RAG (retrieval-augmented generation) toolkit for Python: document loading, chunking, vector search, and LLM integration.

Installation

pip install ragit

Quick Start

You must provide an embedding source: a custom function, Ollama, or any other provider.

Custom Embedding Function

from ragit import RAGAssistant

def my_embed(text: str) -> list[float]:
    # Use any embedding API: OpenAI, Cohere, HuggingFace, etc.
    return embedding_vector

assistant = RAGAssistant("docs/", embed_fn=my_embed)
results = assistant.retrieve("search query")
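As a concrete stand-in for the stub above, here is a toy character-frequency embedding. It is not a real embedding model and is useful only to illustrate the `(str) -> list[float]` contract that `embed_fn` must satisfy:

```python
# Toy embedding for illustration only: a 26-dim letter-frequency vector.
# A real deployment would call an embedding API (OpenAI, Cohere, etc.).
def my_embed(text: str) -> list[float]:
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]  # normalized frequencies
```

Any function with this signature can be passed as `embed_fn`.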

With LLM for Q&A

def my_embed(text: str) -> list[float]:
    return embedding_vector

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response

assistant = RAGAssistant("docs/", embed_fn=my_embed, generate_fn=my_generate)
answer = assistant.ask("How does authentication work?")

With Ollama (nomic-embed-text)

from ragit import RAGAssistant
from ragit.providers import OllamaProvider

# Uses nomic-embed-text for embeddings (768d)
assistant = RAGAssistant("docs/", provider=OllamaProvider())
results = assistant.retrieve("search query")

Core API

assistant = RAGAssistant(
    documents,           # Path, list of Documents, or list of Chunks
    embed_fn=...,        # Embedding function: (str) -> list[float]
    generate_fn=...,     # LLM function: (prompt, system_prompt) -> str
    provider=...,        # Or use a provider instead of functions
    chunk_size=512,
    chunk_overlap=50
)

results = assistant.retrieve(query, top_k=3)      # [(Chunk, score), ...]
context = assistant.get_context(query, top_k=3)   # Formatted string
answer = assistant.ask(question, top_k=3)         # Requires generate_fn/LLM
code = assistant.generate_code(request)           # Requires generate_fn/LLM
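`retrieve` returns chunks ranked by vector similarity to the query. A minimal standalone sketch of cosine-similarity ranking (illustrating the idea, not ragit's actual implementation) looks like:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: dot product divided by the product of magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, chunk_vecs, top_k=3):
    # Score every chunk vector against the query and keep the best top_k,
    # mirroring the [(chunk, score), ...] shape that retrieve() returns.
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]
```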

Index Persistence

Save and load indexes to avoid re-computing embeddings:

# Save index to disk
assistant.save_index("./my_index")

# Load index later (much faster than re-indexing)
loaded = RAGAssistant.load_index("./my_index", provider=OllamaProvider())
results = loaded.retrieve("query")
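The save/load pattern amounts to serializing chunk texts together with their embeddings so they need not be recomputed. A minimal JSON sketch of that idea (ragit's actual on-disk format is not specified here):

```python
import json

def save_embeddings(path: str, chunks: list[str], vectors: list[list[float]]) -> None:
    # Persist (chunk, vector) pairs so embeddings survive restarts.
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"chunks": chunks, "vectors": vectors}, f)

def load_embeddings(path: str) -> tuple[list[str], list[list[float]]]:
    # Loading is cheap compared to re-embedding every chunk.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["chunks"], data["vectors"]
```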

Thread Safety

RAGAssistant is thread-safe. Multiple threads can safely read while another writes:

import threading

assistant = RAGAssistant("docs/", provider=OllamaProvider())

# Safe: concurrent reads and writes
threading.Thread(target=lambda: assistant.retrieve("query")).start()
threading.Thread(target=lambda: assistant.add_documents([new_doc])).start()
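One common way to provide this guarantee is to guard the index with a lock so no reader ever observes a half-updated index. The sketch below shows the pattern only; ragit's internal mechanism may differ:

```python
import threading

class SafeIndex:
    # A lock serializes writers against readers; this is the general
    # pattern, not ragit's actual internals.
    def __init__(self):
        self._lock = threading.Lock()
        self._chunks: list[str] = []

    def add(self, chunk: str) -> None:
        with self._lock:
            self._chunks.append(chunk)

    def search(self, term: str) -> list[str]:
        with self._lock:
            return [c for c in self._chunks if term in c]
```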

Resource Management

Use context managers for automatic cleanup:

from ragit.providers import OllamaProvider

with OllamaProvider() as provider:
    response = provider.generate("Hello", model="llama3")
# Session automatically closed
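The behavior above relies on Python's ordinary context-manager protocol (`__enter__`/`__exit__`). A hypothetical provider-like class (not ragit code) showing how cleanup runs even when the body raises:

```python
class StubProvider:
    # Illustrates the __enter__/__exit__ protocol a provider relies on.
    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self  # the object bound by "with ... as provider"

    def __exit__(self, exc_type, exc, tb):
        self.closed = True  # cleanup runs even if the body raised
        return False        # do not swallow exceptions
```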

Document Loading

from ragit import load_text, load_directory, chunk_text

doc = load_text("file.md")
docs = load_directory("docs/", "*.md")
chunks = chunk_text(text, chunk_size=512, chunk_overlap=50, doc_id="id")
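Conceptually, chunking slides a fixed-size window across the text with some overlap between neighbours. A standalone character-based sketch of that windowing (ragit's actual unit of measure is not specified above):

```python
def sliding_chunks(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so adjacent chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - chunk_overlap, 1), step)]
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.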

Hyperparameter Optimization

from ragit import RagitExperiment, Document, BenchmarkQuestion

def my_embed(text: str) -> list[float]:
    return embedding_vector

def my_generate(prompt: str, system_prompt: str = "") -> str:
    return llm_response

docs = [Document(id="1", content="...")]
benchmark = [BenchmarkQuestion(question="...", ground_truth="...")]

experiment = RagitExperiment(
    docs, benchmark,
    embed_fn=my_embed,
    generate_fn=my_generate
)
results = experiment.run(max_configs=20)
print(results[0])  # Best config
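An experiment loop of this kind amounts to scoring each candidate configuration against the benchmark and sorting best-first. A minimal sketch of that search with a hypothetical `score_fn` (not `RagitExperiment`'s internals):

```python
from itertools import product

def grid_search(chunk_sizes, overlaps, score_fn, max_configs=20):
    # Enumerate (chunk_size, overlap) pairs up to max_configs, score each
    # on the benchmark via score_fn, and return configs sorted best-first.
    configs = list(product(chunk_sizes, overlaps))[:max_configs]
    scored = [((cs, ov), score_fn(cs, ov)) for cs, ov in configs]
    return sorted(scored, key=lambda s: s[1], reverse=True)
```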

License

Apache-2.0 - RODMENA LIMITED