Ragbot follows a fundamental principle in software engineering: separation of concerns. Your personal data should never be mixed with application code. This is more than a best practice: it is essential for privacy, security, and flexibility.
Ragbot uses a multi-repository architecture where each context (personal, company, client) has its own repository:

    ragbot/                          # Application code (public)
    ├── src/ragbot/                  # Core library
    ├── web/                         # React frontend
    └── api/                         # FastAPI backend

    ai-knowledge-{name}/             # Content repositories (private)
    ├── source/                      # Human-edited content (authoritative)
    │   ├── instructions/            # WHO - identity and persona
    │   ├── runbooks/                # HOW - task procedures
    │   └── datasets/                # WHAT - reference knowledge
    ├── compiled/                    # Auto-generated output
    │   └── {project}/
    │       └── instructions/        # LLM-specific instructions (local compilation)
    ├── all-knowledge.md             # Concatenated knowledge (CI/CD via GitHub Actions)
    └── compile-config.yaml          # Compilation settings

AI knowledge is organized into three conceptual categories:
| Folder | Purpose | Question Answered |
|---|---|---|
| instructions/ | Identity and persona | WHO is the agent? |
| runbooks/ | Task procedures | HOW does the agent do things? |
| datasets/ | Reference knowledge | WHAT does the agent know? |
instructions/ holds system-level behavioral guidance that defines the agent's identity:
- Communication style and tone
- Core principles and values
- Response preferences
runbooks/ holds task-specific procedures for autonomous AI execution:
- Content creation guidelines
- Automation workflows
- Prompting techniques
datasets/ holds reference knowledge organized by category:
- Personal information
- Professional background
- Domain expertise
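
As a rough illustration of the WHO/HOW/WHAT layout, the sketch below checks a content repository for the three folders and creates any that are missing. It is not part of Ragbot; the function names `missing_source_folders` and `scaffold_source_folders` are hypothetical, and only the folder names come from the layout described above.

```python
from pathlib import Path

# Folder names taken from the layout described above.
REQUIRED_FOLDERS = ("instructions", "runbooks", "datasets")


def missing_source_folders(repo_root: Path) -> list[str]:
    """Return the WHO/HOW/WHAT folders that are absent under <repo_root>/source."""
    source_dir = repo_root / "source"
    return [name for name in REQUIRED_FOLDERS if not (source_dir / name).is_dir()]


def scaffold_source_folders(repo_root: Path) -> None:
    """Create any missing folders; existing folders and files are left untouched."""
    for name in missing_source_folders(repo_root):
        (repo_root / "source" / name).mkdir(parents=True)
```

Calling `scaffold_source_folders(Path.home() / "ai-knowledge" / "ai-knowledge-personal")` would have the same effect as creating the three folders by hand.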
AI Knowledge repos follow a hierarchy with inheritance:

    ai-knowledge-{templates}    ← Public templates (root)
            ↓
    ai-knowledge-{person}       ← Personal identity
            ↓
    ai-knowledge-{company}      ← Company knowledge
            ↓
    ai-knowledge-{client}       ← Client-specific content

Each child repo inherits content from its parent. This enables:
- Layered identity: Personal context + company context + client context
- Privacy control: Each repo only contains content appropriate for its access level
- Flexible compilation: Compile with or without inheritance
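
To make "compile with or without inheritance" more concrete, here is a minimal sketch of how the input files for a compilation could be gathered. It is not Ragbot's compiler. The function name `collect_sources` is hypothetical, and the sketch assumes that content files are Markdown and that the repositories are passed root first.

```python
from pathlib import Path


def collect_sources(chain: list[Path], with_inheritance: bool = True) -> list[Path]:
    """Return Markdown files under source/ for the repos that take part in compilation.

    `chain` is ordered root first (templates, person, company, client); the last
    entry is the repo being compiled. Assumption: content files end in .md.
    """
    # Without inheritance, only the repo being compiled contributes files.
    repos = chain if with_inheritance else chain[-1:]
    files: list[Path] = []
    for repo in repos:
        # Sort within each repo so the output order is stable across runs.
        files.extend(sorted((repo / "source").rglob("*.md")))
    return files
```

Listing parents first means the most general context comes first and client-specific content comes last, which is one plausible reading of "layered identity".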
Just as your operating system (macOS, Linux, Windows) separates:
- System files (the OS itself) from user files (your documents)
- Applications (software) from data (what you create)
- Configuration (settings) from secrets (passwords)
Ragbot separates:
- Application code (ragbot/) from your knowledge (ai-knowledge-*/)
- The AI engine from your context and identity
- Generic examples from personal information
The Library Analogy:
- Ragbot is the librarian (constant, helpful, knowledgeable about systems)
- Your ai-knowledge repos are the books on the shelves (your unique knowledge)
- Instructions are how you want the librarian to help you (your preferences)
The Assistant Analogy:
- Ragbot is your assistant (the person with skills and tools)
- Your ai-knowledge content is the briefing materials (context about your life/work)
- Instructions are the working relationship (how you collaborate)
What stays private:
- Your personal information (in private ai-knowledge repos)
- Work/client data (in separate repos per client)
- Your AI instructions (your "secret sauce")
What's public:
- The application code (open source)
- Generic examples and templates
- Prompting techniques and frameworks
Multiple Contexts:
- Personal repo for personal use
- Company repo for work projects
- Client repos for client-specific work
- Each compiled independently or with inheritance
Version Control:
- Update Ragbot code without affecting your knowledge
- Rollback knowledge changes independently
- Branch knowledge for experiments
- Share repos selectively
Your knowledge travels with you:
- Same repos work on any machine
- Easy backup (git push to remote)
- Migrate to new machine (git clone)
- Share setup without sharing content
You can share:
- The application (public ragbot repo)
- Generic templates (examples directory)
- Anonymized techniques
You keep private:
- Personal data
- Client information
- Your customizations

    ragbot/
    ├── src/
    ├── my-personal-data/    # DANGER: Easy to accidentally commit
    ├── my-instructions/
    └── client-secrets/      # DANGER: Might leak

Problems:
- High risk of committing sensitive data
- Can't share code without exposing data
- One .gitignore mistake = privacy breach

    ragbot/
    └── src/    # No customization capability

Problems:
- AI has no context about you
- You repeat yourself in every conversation
- Responses are generic, not personalized

    ragbot/ (public)        ai-knowledge-*/ (private)
    ├── src/                ├── source/
    ├── web/                │   ├── instructions/
    ├── api/                │   ├── runbooks/
    └── examples/           │   └── datasets/
                            ├── compiled/
                            └── compile-config.yaml

Benefits:
- Clear separation of concerns
- Privacy by design
- Flexible inheritance model
- Easy to share application, not data
Quick Start:

    # 1. Clone Ragbot
    git clone https://github.com/synthesisengineering/ragbot.git
    cd ragbot
    pip install -e .

    # 2. Create your personal ai-knowledge repo
    mkdir -p ~/ai-knowledge/ai-knowledge-personal
    cd ~/ai-knowledge/ai-knowledge-personal
    mkdir -p source/instructions source/runbooks source/datasets

    # 3. Add your content
    # Edit files in source/

    # 4. Compile
    ragbot compile --repo ~/ai-knowledge/ai-knowledge-personal

    # 5. Chat
    ragbot chat --workspace personal

Multiple Contexts with Inheritance:

    # my-projects.yaml in your personal repo
    version: 1
    projects:
      personal:
        local_path: ~/ai-knowledge/ai-knowledge-personal
        inherits_from: []
      company:
        local_path: ~/ai-knowledge/ai-knowledge-company
        inherits_from:
          - personal
      client-a:
        local_path: ~/ai-knowledge/ai-knowledge-client-a
        inherits_from:
          - company

Compile with inheritance:

    ragbot compile --project client-a --with-inheritance

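
The `inherits_from` entries imply an order in which projects are layered. The sketch below shows one way such an order could be resolved, with parents listed before children and cycles rejected. It is an illustration under assumptions, not Ragbot's implementation: `resolve_order` is a hypothetical helper, and the `PROJECTS` dict mirrors the YAML by hand so that no YAML parser is needed.

```python
# Hand-written mirror of the `projects` mapping in my-projects.yaml.
PROJECTS = {
    "personal": {"inherits_from": []},
    "company": {"inherits_from": ["personal"]},
    "client-a": {"inherits_from": ["company"]},
}


def resolve_order(project: str, projects: dict) -> list[str]:
    """Return `project` and all of its ancestors, parents before children."""
    order: list[str] = []
    in_progress: set[str] = set()

    def visit(name: str) -> None:
        if name in order:
            return  # already placed via another inheritance path
        if name in in_progress:
            raise ValueError(f"inheritance cycle involving {name!r}")
        if name not in projects:
            raise KeyError(f"unknown project {name!r}")
        in_progress.add(name)
        for parent in projects[name].get("inherits_from", []):
            visit(parent)
        in_progress.discard(name)
        order.append(name)

    visit(project)
    return order


print(resolve_order("client-a", PROJECTS))  # ['personal', 'company', 'client-a']
```

A depth-first walk keeps the sketch short and also handles a project that lists more than one parent.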
- Keep repos separate - One repo per context (personal, company, client)
- Use inheritance wisely - Personal → Company → Client hierarchy
- Version control your knowledge - Git provides history and backup
- Use the WHO/HOW/WHAT structure

      source/
      ├── instructions/    # WHO
      ├── runbooks/        # HOW
      └── datasets/        # WHAT

- Edit source/ directly - Knowledge concatenation is automatic via CI/CD
- Don't commit secrets - No API keys in content files
- Don't mix public and private - Keep ragbot/ and ai-knowledge-*/ separate
- Don't skip CI/CD - all-knowledge.md is auto-generated by GitHub Actions

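
The "no API keys in content files" rule can be backed by a simple scan before committing. The following is a minimal sketch, not a Ragbot feature: `find_secrets` is a hypothetical helper, and the three patterns are illustrative examples that will miss many real key formats.

```python
import re
from pathlib import Path

# Illustrative patterns only; extend them for the providers you use.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub personal access token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "generic api_key assignment": re.compile(
        r"\bapi[_-]?key\b\s*[:=]\s*\S{16,}", re.IGNORECASE
    ),
}


def find_secrets(source_dir: Path) -> list[tuple[Path, int, str]]:
    """Return (file, line number, pattern label) for each suspicious line."""
    hits = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        # Undecodable bytes are skipped so binary files do not abort the scan.
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_no, line in enumerate(text.splitlines(), start=1):
            for label, pattern in SECRET_PATTERNS.items():
                if pattern.search(line):
                    hits.append((path, line_no, label))
    return hits
```

Running `find_secrets(Path("source"))` from the root of a content repo and refusing to commit when the list is non-empty is one way to apply the rule. A dedicated secret scanner will cover far more cases.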
Ragbot's separation of code and data follows proven patterns from:
- Unix dotfiles
- Infrastructure as Code
- Twelve-Factor App methodology
- Security best practices
The result:
- Privacy by design
- Flexibility for multiple contexts
- Inheritance for layered identity
- Easy to update and maintain
Bottom line: Your knowledge is yours. The application is shared. This separation keeps both better.
- Compilation Guide - How the compiler works
- Project Documentation Convention - Project folder structure
- The Twelve-Factor App - Configuration principles