Process repeats until successful or max attempts reached

---

## 📊 Adding to the Vector Database

The system uses vector embeddings to find similar projects and error examples, which helps improve code generation quality. Here's how to add your own examples:

### 🔧 Creating Vector Collections

First, you need to create the necessary collections in Qdrant using these curl commands:

```bash
# Create the project_examples collection with 1536 dimensions (default)
curl -X PUT "http://localhost:6333/collections/project_examples" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 1536,
      "distance": "Cosine"
    }
  }'

# Create the error_examples collection with 1536 dimensions (default)
curl -X PUT "http://localhost:6333/collections/error_examples" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 1536,
      "distance": "Cosine"
    }
  }'
```
Note: If you've configured a different embedding size via the `LLM_EMBED_SIZE` environment variable, replace 1536 with that value.
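
The same requests can be issued from Python using only the standard library. The sketch below also reads the dimension from `LLM_EMBED_SIZE` instead of hard-coding 1536; the helper names are illustrative, not part of the project:

```python
import json
import os
import urllib.request


def embed_size(default=1536):
    """Embedding dimension: honor LLM_EMBED_SIZE when set, else the default."""
    return int(os.getenv("LLM_EMBED_SIZE", default))


def collection_request(name, host="http://localhost:6333"):
    """Build the same PUT request as the curl commands above."""
    body = json.dumps({"vectors": {"size": embed_size(), "distance": "Cosine"}})
    return urllib.request.Request(
        f"{host}/collections/{name}",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Usage (requires a running Qdrant instance):
#   for name in ("project_examples", "error_examples"):
#       urllib.request.urlopen(collection_request(name))
```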

### Method 1: Using the Python API Directly

#### For Project Examples
```python
from app.llm_client import LlamaEdgeClient
from app.vector_store import QdrantStore

# Initialize the components
llm_client = LlamaEdgeClient()
vector_store = QdrantStore()

# Ensure the collection exists
vector_store.create_collection("project_examples")

# 1. Prepare your data
project_data = {
    "query": "A command-line calculator in Rust",
    "example": "Your full project example with code here...",
    "project_files": {
        "src/main.rs": "fn main() {\n println!(\"Hello, calculator!\");\n}",
        "Cargo.toml": "[package]\nname = \"calculator\"\nversion = \"0.1.0\"\nedition = \"2021\"\n\n[dependencies]"
    }
}

# 2. Get the embedding for the query text
embedding = llm_client.get_embeddings([project_data["query"]])[0]

# 3. Add to the vector database
vector_store.add_item(
    collection_name="project_examples",
    vector=embedding,
    item=project_data
)
```
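
After inserting, you can sanity-check that the point is retrievable by querying Qdrant's REST search endpoint with the same query embedding. A standard-library sketch (the exact response shape can vary across Qdrant versions):

```python
import json
import urllib.request


def search_request(collection, vector, limit=3, host="http://localhost:6333"):
    """Build a POST to Qdrant's points/search endpoint for the given vector."""
    body = json.dumps({"vector": vector, "limit": limit, "with_payload": True})
    return urllib.request.Request(
        f"{host}/collections/{collection}/points/search",
        data=body.encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (requires a running Qdrant instance and the app package):
#   from app.llm_client import LlamaEdgeClient
#   vector = LlamaEdgeClient().get_embeddings(["A command-line calculator in Rust"])[0]
#   with urllib.request.urlopen(search_request("project_examples", vector)) as resp:
#       for hit in json.load(resp)["result"]:
#           print(hit["score"], hit["payload"].get("query"))
```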

#### For Error Examples
```python
from app.llm_client import LlamaEdgeClient
from app.vector_store import QdrantStore

# Initialize the components
llm_client = LlamaEdgeClient()
vector_store = QdrantStore()

# Ensure the collection exists
vector_store.create_collection("error_examples")

# 1. Prepare your error data
error_data = {
    "error": "error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable",
    "solution": "Ensure mutable and immutable borrows don't overlap by using separate scopes",
    "context": "This error occurs when you try to borrow a value mutably while an immutable borrow exists",
    "example": "// Before (error)\nfn main() {\n let mut v = vec![1, 2, 3];\n let first = &v[0];\n v.push(4); // Error: cannot borrow `v` as mutable\n println!(\"{}\", first);\n}\n\n// After (fixed)\nfn main() {\n let mut v = vec![1, 2, 3];\n {\n let first = &v[0];\n println!(\"{}\", first);\n } // immutable borrow ends here\n v.push(4); // Now it's safe to borrow mutably\n}"
}

# 2. Get the embedding for the error message
embedding = llm_client.get_embeddings([error_data["error"]])[0]

# 3. Add to the vector database
vector_store.add_item(
    collection_name="error_examples",
    vector=embedding,
    item=error_data
)
```
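
When you have several error examples, note that `get_embeddings` accepts a list, so all error messages can be embedded in one batch before storing. A sketch composed only of the calls shown above (the helper name and the pass-the-clients-in style are illustrative choices, not project API):

```python
def add_error_examples(examples, llm_client, vector_store):
    """Embed each example's error message in one batch, then store every payload.

    Pass in a LlamaEdgeClient() and a QdrantStore(), as in the snippets above.
    """
    vector_store.create_collection("error_examples")

    # One embedding call for all error messages at once
    embeddings = llm_client.get_embeddings([ex["error"] for ex in examples])

    for vector, item in zip(embeddings, examples):
        vector_store.add_item(
            collection_name="error_examples",
            vector=vector,
            item=item,
        )
    return len(examples)
```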

### Method 2: Adding Multiple Examples from JSON Files
Place JSON files in the appropriate directories:

- Project examples: `project_examples`
- Error examples: `error_examples`

Format for project examples (with an optional `project_files` field):
```json
{
  "query": "Description of the project",
  "example": "Full example code or description",
  "project_files": {
    "src/main.rs": "// File content here",
    "Cargo.toml": "// File content here"
  }
}
```
Format for error examples:
```json
{
  "error": "Rust compiler error message",
  "solution": "How to fix the error",
  "context": "Additional explanation (optional)",
  "example": "// Code example showing the fix (optional)"
}
```
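
Because only `error` and `solution` are required for error examples (`context` and `example` are optional), a small pre-flight check can catch malformed files before loading. This helper is hypothetical, not part of the project:

```python
REQUIRED_KEYS = {"error", "solution"}
ALLOWED_KEYS = REQUIRED_KEYS | {"context", "example"}


def validate_error_example(item):
    """Return a list of problems with an error-example dict; empty means valid."""
    problems = []
    missing = REQUIRED_KEYS - item.keys()
    if missing:
        problems.append(f"missing required keys: {sorted(missing)}")
    unknown = item.keys() - ALLOWED_KEYS
    if unknown:
        problems.append(f"unexpected keys: {sorted(unknown)}")
    # Every field in the documented format is a plain string
    problems.extend(
        f"value for {key!r} should be a string"
        for key, value in item.items()
        if not isinstance(value, str)
    )
    return problems
```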
Then run the data loading script:
```bash
python -c "from app.load_data import load_project_examples, load_error_examples; load_project_examples(); load_error_examples()"
```
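
If you generate these JSON files from code, a helper along these lines writes an example dict into the right directory before you run the loading script. The slug-based file naming is an arbitrary choice; the loaders should only require valid JSON files in the directory:

```python
import json
from pathlib import Path


def save_example(item, directory="project_examples"):
    """Write an example dict as pretty-printed JSON, named after its query text."""
    out_dir = Path(directory)
    out_dir.mkdir(exist_ok=True)
    # Derive a filesystem-safe file name from the query text
    slug = "".join(c if c.isalnum() else "_" for c in item["query"].lower())[:60]
    path = out_dir / f"{slug}.json"
    path.write_text(json.dumps(item, indent=2))
    return path
```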

### Method 3: Using the `parse_and_save_qna.py` Script
For bulk importing from a Q&A-format text file:

1. Place your Q&A pairs in a text file formatted like `QnA_pair.txt`
2. Modify the `parse_and_save_qna.py` script to point to your file
3. Run the script:
```bash
python parse_and_save_qna.py
```

## ⚙️ Environment Variables for Vector Search
The `SKIP_VECTOR_SEARCH` environment variable controls whether the system uses vector search:

- `SKIP_VECTOR_SEARCH=true` disables vector search
- `SKIP_VECTOR_SEARCH=false` (or not set) enables vector search

If your `.env` file contains:
```
SKIP_VECTOR_SEARCH=true
```
This means vector search is currently disabled. To enable it:
- Change the value to `false` or remove the line completely
- Ensure you have a running Qdrant instance (via Docker Compose or standalone)
- Create the collections as shown above
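
Boolean environment variables are easy to mis-read, so it helps to see how such a flag is typically parsed. A sketch of the convention described above; the project's actual parsing may differ:

```python
import os


def vector_search_enabled():
    """Vector search is on unless SKIP_VECTOR_SEARCH is set to a truthy string."""
    return os.getenv("SKIP_VECTOR_SEARCH", "false").strip().lower() not in ("1", "true", "yes")
```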

## 🤝 Contributing
Contributions are welcome! This project uses the Developer Certificate of Origin (DCO) to certify that contributors have the right to submit their code. Sign off each commit, for example with `git commit -s`, which adds a `Signed-off-by` line to the commit message.

This certifies that you wrote or have the right to submit the code you're contributing.
## 📜 License
Licensed under [GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html).