Commit 5cb1ae8

Merge pull request #15 from Acuspeedster/main
2 parents c2f78a2 + fe723d9 commit 5cb1ae8

1 file changed: +154 -0 lines changed

README.md

Lines changed: 154 additions & 0 deletions
@@ -436,6 +436,157 @@ Process repeats until successful or max attempts reached

---

## 📊 Adding to the Vector Database

The system uses vector embeddings to find similar projects and error examples, which helps improve code generation quality. Here's how to add your own examples:
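
To make "finding similar examples" concrete, here is a minimal sketch of cosine similarity, the distance metric the collections in this guide are configured with. The vectors are toy 3-dimensional stand-ins for real embeddings, and `cosine_similarity` and `most_similar` are hypothetical helper names used only for illustration; Qdrant performs this comparison internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def most_similar(query_vector, stored):
    """Return (label, score) for the stored vector closest to the query."""
    scored = ((label, cosine_similarity(query_vector, vec)) for label, vec in stored.items())
    return max(scored, key=lambda pair: pair[1])


# Toy vectors standing in for real embeddings (assumption for illustration).
stored = {
    "cli calculator": [0.9, 0.1, 0.0],
    "web server": [0.0, 0.2, 0.9],
}
best = most_similar([0.8, 0.2, 0.1], stored)
```

Here `best[0]` is the label of the closest stored example; a real lookup would compare the query embedding against every vector in the collection.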

### 🔧 Creating Vector Collections

First, create the necessary collections in Qdrant using these curl commands:

```bash
# Create project_examples collection with 1536 dimensions (default)
curl -X PUT "http://localhost:6333/collections/project_examples" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 1536,
      "distance": "Cosine"
    }
  }'

# Create error_examples collection with 1536 dimensions (default)
curl -X PUT "http://localhost:6333/collections/error_examples" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {
      "size": 1536,
      "distance": "Cosine"
    }
  }'
```

Note: If you've configured a different embedding size via the `LLM_EMBED_SIZE` environment variable, replace 1536 with that value.
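
As a rough sketch of the note above, the JSON body for the collection request can be derived from `LLM_EMBED_SIZE` so the vector size never drifts from the configured value. The helper name `collection_config` is hypothetical, and the fallback to 1536 simply mirrors the default stated here.

```python
import json
import os


def collection_config(env=None, default_size=1536):
    """Build the collection settings, honoring LLM_EMBED_SIZE when it is set."""
    env = os.environ if env is None else env
    size = int(env.get("LLM_EMBED_SIZE", default_size))
    return {"vectors": {"size": size, "distance": "Cosine"}}


if __name__ == "__main__":
    # Print the JSON body to pass to curl's -d option for each collection.
    for name in ("project_examples", "error_examples"):
        print(name, json.dumps(collection_config()))
```

The printed JSON can be pasted into the `-d` argument of the curl commands in place of the hard-coded body.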

### Method 1: Using Python API Directly

#### For Project Examples

```python
from app.llm_client import LlamaEdgeClient
from app.vector_store import QdrantStore

# Initialize the components
llm_client = LlamaEdgeClient()
vector_store = QdrantStore()

# Ensure collection exists
vector_store.create_collection("project_examples")

# 1. Prepare your data
project_data = {
    "query": "A command-line calculator in Rust",
    "example": "Your full project example with code here...",
    "project_files": {
        "src/main.rs": "fn main() {\n    println!(\"Hello, calculator!\");\n}",
        "Cargo.toml": "[package]\nname = \"calculator\"\nversion = \"0.1.0\"\nedition = \"2021\"\n\n[dependencies]"
    }
}

# 2. Get embedding for the query text
embedding = llm_client.get_embeddings([project_data["query"]])[0]

# 3. Add to vector database
vector_store.add_item(
    collection_name="project_examples",
    vector=embedding,
    item=project_data
)
```

#### For Error Examples

```python
from app.llm_client import LlamaEdgeClient
from app.vector_store import QdrantStore

# Initialize the components
llm_client = LlamaEdgeClient()
vector_store = QdrantStore()

# Ensure collection exists
vector_store.create_collection("error_examples")

# 1. Prepare your error data
error_data = {
    "error": "error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable",
    "solution": "Ensure mutable and immutable borrows don't overlap by using separate scopes",
    "context": "This error occurs when you try to borrow a value mutably while an immutable borrow exists",
    "example": "// Before (error)\nfn main() {\n    let mut v = vec![1, 2, 3];\n    let first = &v[0];\n    v.push(4); // Error: cannot borrow `v` as mutable\n    println!(\"{}\", first);\n}\n\n// After (fixed)\nfn main() {\n    let mut v = vec![1, 2, 3];\n    {\n        let first = &v[0];\n        println!(\"{}\", first);\n    } // immutable borrow ends here\n    v.push(4); // Now it's safe to borrow mutably\n}"
}

# 2. Get embedding for the error message
embedding = llm_client.get_embeddings([error_data["error"]])[0]

# 3. Add to vector database
vector_store.add_item(
    collection_name="error_examples",
    vector=embedding,
    item=error_data
)
```
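
Both snippets above follow the same embed-then-store pattern, so adding several examples at once could be wrapped in a small helper. The following is only a sketch: `add_examples` is a hypothetical function, and it assumes `get_embeddings` returns one vector per input text in the same order (the single-item calls above suggest this, but it is not confirmed here). It uses only the `get_embeddings` and `add_item` calls shown in this section.

```python
def add_examples(llm_client, vector_store, collection_name, items, text_key):
    """Embed items[text_key] in one batch and store each item with its vector.

    Returns the number of items stored.
    """
    if not items:
        return 0
    texts = [item[text_key] for item in items]
    # Assumption: one embedding per text, returned in input order.
    embeddings = llm_client.get_embeddings(texts)
    if len(embeddings) != len(items):
        raise ValueError("expected one embedding per item")
    for item, vector in zip(items, embeddings):
        vector_store.add_item(
            collection_name=collection_name,
            vector=vector,
            item=item,
        )
    return len(items)
```

With the objects from the snippets above, a call might look like `add_examples(llm_client, vector_store, "error_examples", [error_data], "error")`, or use `"query"` as the text key for project examples.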

### Method 2: Adding Multiple Examples from JSON Files

Place JSON files in the appropriate directories:

- Project examples: `project_examples`
- Error examples: `error_examples`

Format for project examples (with optional `project_files` field):

```json
{
  "query": "Description of the project",
  "example": "Full example code or description",
  "project_files": {
    "src/main.rs": "// File content here",
    "Cargo.toml": "// File content here"
  }
}
```

Format for error examples:

```json
{
  "error": "Rust compiler error message",
  "solution": "How to fix the error",
  "context": "Additional explanation (optional)",
  "example": "// Code example showing the fix (optional)"
}
```

Then run the data loading script:

```bash
python -c "from app.load_data import load_project_examples, load_error_examples; load_project_examples(); load_error_examples()"
```
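
Before running the loader, it can help to check that each JSON file matches the formats described in this method. The sketch below is not part of the project: `check_example` and `check_file` are hypothetical helpers, and they assume that every field not marked optional above is required.

```python
import json

# Keys taken from the two formats above; "required" is an assumption
# (any field not explicitly marked optional is treated as required).
SCHEMAS = {
    "project": {"required": {"query", "example"}, "optional": {"project_files"}},
    "error": {"required": {"error", "solution"}, "optional": {"context", "example"}},
}


def check_example(kind, data):
    """Return a list of problems found in one example dict (empty if none)."""
    schema = SCHEMAS[kind]
    keys = set(data)
    problems = [f"missing key: {key}" for key in sorted(schema["required"] - keys)]
    unknown = keys - schema["required"] - schema["optional"]
    problems += [f"unknown key: {key}" for key in sorted(unknown)]
    return problems


def check_file(kind, path):
    """Load one JSON file and check it against the schema for `kind`."""
    with open(path, encoding="utf-8") as handle:
        return check_example(kind, json.load(handle))
```

An empty list from `check_file("error", path)` means the file has the expected keys; anything else lists what to fix before loading.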

### Method 3: Using the `parse_and_save_qna.py` Script

For bulk importing from a Q&A format text file:

1. Place your Q&A pairs in a text file with a format similar to `QnA_pair.txt`
2. Modify the `parse_and_save_qna.py` script to point to your file
3. Run the script:

```bash
python parse_and_save_qna.py
```
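
The layout of `QnA_pair.txt` is not shown in this section, so the following is purely an illustration of what parsing a Q&A text file can involve. It assumes, hypothetically, that each pair is written as a line starting with `Q:` followed by an answer starting with `A:` that may span several lines. The function name `parse_qna` and the `question`/`answer` keys are invented for this sketch and may not match what `parse_and_save_qna.py` actually does.

```python
def parse_qna(text):
    """Parse 'Q:' / 'A:' blocks into a list of {"question", "answer"} dicts.

    Assumed format (hypothetical): a 'Q:' line, then an 'A:' line whose
    answer may continue on the following lines until the next 'Q:'.
    """
    pairs = []
    question_lines = []
    answer_lines = []
    mode = None  # "q" while reading a question, "a" while reading an answer

    def flush():
        # Store the pair collected so far, if both parts are present.
        if question_lines and answer_lines:
            pairs.append({
                "question": " ".join(question_lines).strip(),
                "answer": "\n".join(answer_lines).strip(),
            })

    for line in text.splitlines():
        if line.startswith("Q:"):
            flush()
            question_lines[:] = [line[2:].strip()]
            answer_lines.clear()
            mode = "q"
        elif line.startswith("A:"):
            answer_lines[:] = [line[2:].strip()]
            mode = "a"
        elif mode == "q" and line.strip():
            question_lines.append(line.strip())
        elif mode == "a":
            answer_lines.append(line.rstrip())
    flush()
    return pairs
```

If your file uses a different layout, only the two `startswith` checks would need to change.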

## ⚙️ Environment Variables for Vector Search

The `SKIP_VECTOR_SEARCH` environment variable controls whether the system uses vector search:

- `SKIP_VECTOR_SEARCH=true`: disables vector search
- `SKIP_VECTOR_SEARCH=false` (or not set): enables vector search

If your `.env` file contains:

```
SKIP_VECTOR_SEARCH=true
```

This means vector search is disabled. To enable it:

- Change this value to `false` or remove the line completely
- Ensure you have a running Qdrant instance (via Docker Compose or standalone)
- Create the collections as shown above
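
The flag semantics described in this section (only `true` disables the search; `false` or an unset variable enables it) can be sketched as follows. `vector_search_enabled` is a hypothetical helper, and the case-insensitive comparison and whitespace trimming are assumptions; the project's own parsing may differ.

```python
import os


def vector_search_enabled(env=None):
    """Return True unless SKIP_VECTOR_SEARCH is set to 'true'."""
    env = os.environ if env is None else env
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    value = env.get("SKIP_VECTOR_SEARCH", "false")
    return value.strip().lower() != "true"
```

Calling `vector_search_enabled()` with no arguments reads the current process environment, which is a quick way to confirm what a running shell would pass to the application.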

## 🤝 Contributing

Contributions are welcome! This project uses the Developer Certificate of Origin (DCO) to certify that contributors have the right to submit their code. Follow these steps:

@@ -458,3 +609,6 @@ This certifies that you wrote or have the right to submit the code you're contri
## 📜 License

Licensed under [GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html).