84 changes: 83 additions & 1 deletion CLAUDE.md
@@ -11,9 +11,10 @@ Nova is an AI research and personal assistant written in Python that provides:
- Chat history saved to markdown files
- **Multi-provider AI integration** (OpenAI, Anthropic, Ollama)
- **Custom prompt templating system** with built-in templates and user-defined prompts
- **Enhanced web search with intelligent query optimization** using YAKE keyword extraction and semantic analysis
- Modular architecture for extensibility

**Current Status:** Phase 4 complete (Tools Integration), supports OpenAI, Anthropic, and Ollama with custom prompt templates and comprehensive tools system with profile-based configuration.
**Current Status:** Enhanced Search implemented with intelligent query optimization; supports OpenAI, Anthropic, and Ollama with custom prompt templates and a comprehensive tools system with profile-based configuration.

## Package Management Commands

@@ -59,6 +60,87 @@ Use these commands:
- Override specific settings per profile (permission mode, enabled modules, etc.)
- Use "Global" or "Custom" tools configuration per profile

## Enhanced Web Search Commands

Nova includes intelligent search with context-aware query enhancement through both tools and chat commands.

### /search Command (Enhanced)

**Basic Usage:**
```bash
/search <query> # Uses your configured default enhancement
/s <query> # Short form
```

**Advanced Usage:**
```bash
/search <query> --enhancement fast # YAKE keyword extraction (~50ms)
/search <query> --enhancement semantic # KeyBERT semantic analysis (~200-500ms)
/search <query> --enhancement hybrid # Combined YAKE + KeyBERT (~300-600ms)
/search <query> --enhancement disabled # Direct search without enhancement

/search <query> --provider google --max 3 # Existing options still work
/search <query> --technical-level expert # Adjust query complexity
/search <query> --timeframe recent # Prefer recent results
```

### Tool Usage (Alternative)

```bash
/tool web_search query="Python async programming" enhancement="fast"
/tool web_search query="machine learning deployment" enhancement="semantic"
```

### Search Enhancement Modes

- **auto**: Automatically choose best enhancement (YAKE + context)
- **disabled**: No enhancement, direct search
- **fast**: YAKE-only enhancement (~50ms) - **Default**
- **semantic**: KeyBERT semantic analysis (~200-500ms, requires additional dependencies)
- **hybrid**: Combined YAKE + KeyBERT (~300-600ms)

### Search Enhancement Configuration

Configure default search behavior in your configuration file:

```yaml
search:
# Basic search settings
enabled: true
default_provider: "duckduckgo"
max_results: 5
use_ai_answers: true

# Enhancement defaults (users can override per search)
default_enhancement: "fast" # auto, disabled, fast, semantic, hybrid
enable_conversation_context: true # Use chat history for context
default_technical_level: "intermediate" # beginner, intermediate, expert
default_timeframe: "any" # recent, past_year, any

# Performance settings
performance_mode: true # Prioritize speed over accuracy
enhancement_timeout: 30.0 # Query enhancement timeout (seconds)
request_timeout: 10.0 # HTTP request timeout (seconds)

# Advanced: Enable semantic analysis (optional)
enable_keybert: false # Set to true for KeyBERT
extraction_backend: "yake_only" # yake_only, keybert_only, hybrid
```

### Performance Guidance

- Use **fast** or default for most queries (optimal speed/accuracy balance)
- Use **semantic** for complex technical topics or research (requires: `uv add keybert sentence-transformers`)
- Use **disabled** for exact phrase searches or when speed is critical
- The system automatically uses conversation context to improve search relevance

### Timeout Configuration

- **enhancement_timeout**: Controls how long the AI-powered query enhancement phase can take before falling back to the original query (default: 30 seconds)
- **request_timeout**: Sets the HTTP timeout for individual search engine requests (default: 10 seconds)
- If query enhancement times out, the search will continue with the original query and display a warning
- Increase `enhancement_timeout` if you have slow AI responses but want more thorough enhancement
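The timeout-and-fallback behavior described above can be sketched with `asyncio.wait_for`; the `enhance_query` coroutine below is a hypothetical stand-in for Nova's real enhancement step, not its actual API:

```python
import asyncio

async def enhance_query(query: str) -> str:
    # Hypothetical stand-in for the AI-powered enhancement step.
    await asyncio.sleep(0.01)
    return f"{query} tutorial best practices"

async def search_query(query: str, enhancement_timeout: float = 30.0) -> str:
    """Return the enhanced query, or the original one on timeout."""
    try:
        return await asyncio.wait_for(enhance_query(query), timeout=enhancement_timeout)
    except asyncio.TimeoutError:
        # The search continues with the original query and a warning, as described above.
        print("Warning: query enhancement timed out; using original query")
        return query
```

With a very small `enhancement_timeout` the sketch prints the warning and returns the original query unchanged.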

## Testing Commands

- Run all tests: `uv run pytest`
98 changes: 98 additions & 0 deletions SEARCH_ENHANCEMENT_SIMPLIFICATION_PLAN.md
@@ -0,0 +1,98 @@
# Search Enhancement Simplification Plan

## Current Problems - Brittleness Analysis

### 1. **Over-Complex Pipeline Architecture**
- Three-stage pipeline: NLP extraction → LLM planning → JSON execution
- Each stage can fail independently, requiring complex error handling
- Far too complex for basic search query enhancement

### 2. **Multiple Fragile Dependencies**
- spaCy (requires model download: `en_core_web_sm`)
- YAKE keyword extraction
- KeyBERT + sentence-transformers (optional but complex)
- Multiple fallback chains when dependencies fail

### 3. **Brittle LLM JSON Parsing**
- Relies on LLM returning perfect JSON format
- Complex parsing logic to handle markdown code blocks
- Single malformed response breaks the entire enhancement
- JSON schema validation adds unnecessary complexity

### 4. **Configuration Explosion**
- 6 enhancement modes (auto, disabled, fast, semantic, hybrid, adaptive)
- 4 extraction backends (yake_only, keybert_only, hybrid, adaptive)
- Technical levels, timeframes, performance modes, timeout configs
- Each combination can behave differently and fail in unique ways

### 5. **Cascading Failure Points**
```
spaCy fails → KeyBERT fails → AI client fails → LLM timeouts → JSON parsing fails
```
Each failure requires its own fallback logic, creating a maintenance nightmare.
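A hypothetical sketch of what this cascade forces the code to look like, with every stage wrapped in its own handler (the extractor functions are illustrative stubs, not Nova's real API):

```python
def spacy_extract(query: str) -> list[str]:
    raise RuntimeError("spaCy model 'en_core_web_sm' not installed")

def keybert_extract(query: str) -> list[str]:
    raise RuntimeError("KeyBERT optional dependencies missing")

def yake_extract(query: str) -> list[str]:
    raise RuntimeError("YAKE extraction failed")

def extract_keywords(query: str) -> list[str]:
    # Each stage needs its own try/except; a real implementation would
    # also need logging and tests for every failure combination.
    try:
        return spacy_extract(query)
    except Exception:
        try:
            return keybert_extract(query)
        except Exception:
            try:
                return yake_extract(query)
            except Exception:
                return query.split()  # last-resort fallback

print(extract_keywords("python async programming"))
# → ['python', 'async', 'programming']
```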

### 6. **Heavy Resource Overhead**
- Loading multiple ML models on startup
- Complex caching mechanisms
- Multiple AI API calls per search

## Proposed Simpler, More Resilient Design

### **Single-Step Approach**
Replace entire pipeline with:
1. Simple prompt: "Suggest 2-3 alternative search queries for: {original_query}"
2. Parse response as plain text (not JSON)
3. Use original query if anything fails
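The three steps above could be sketched as follows; `FakeClient` stands in for whatever AI client Nova already has, and the method name `complete` is an assumption for illustration only:

```python
import asyncio

class FakeClient:
    # Hypothetical stand-in for Nova's existing AI client.
    async def complete(self, prompt: str) -> str:
        return "- python asyncio tutorial\n- async await python guide"

async def simple_ai_enhance(query: str, ai_client) -> list[str]:
    """Ask the AI for alternative queries; parse the reply as plain text."""
    prompt = f"Suggest 2-3 alternative search queries for: {query}"
    response = await ai_client.complete(prompt)
    # Plain-text parsing: one suggestion per line, no JSON schema to break.
    suggestions = [line.strip("-* \t") for line in response.splitlines()]
    return [s for s in suggestions if s] or [query]

print(asyncio.run(simple_ai_enhance("python async", FakeClient())))
```

If the response is empty or the call raises, the caller falls back to `[query]`, so the original query always survives.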

### **Eliminate Dependencies**
- Remove spaCy, YAKE, KeyBERT entirely
- Use basic string processing if keyword extraction needed
- Let the AI handle all intelligence

### **Two-Mode Configuration**
- `enhanced`: Use AI to suggest alternatives (default)
- `disabled`: Use original query only
- Remove all other complexity
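Under this plan, the search section of the config could shrink to something like this (hypothetical keys, shown only to illustrate the two-mode idea):

```yaml
search:
  enabled: true
  default_provider: "duckduckgo"
  max_results: 5
  enhancement: "enhanced"   # "enhanced" or "disabled" - nothing else
```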

### **Robust Fallback**
```python
try:
    enhanced_queries = await simple_ai_enhance(query)
except Exception:
    enhanced_queries = [query]  # Always fall back to the original query
```

## Key Benefits of Simplified Approach

1. **Massive reduction in complexity** - 90% less code
2. **Fewer failure points** - Single try/catch vs cascading failures
3. **No external ML dependencies** - Just use existing AI client
4. **Easier to debug and maintain**
5. **More predictable behavior**
6. **Faster startup time** - No model loading
7. **Better resource usage** - No background ML processes

## Implementation Strategy

1. Create new simple enhancement module alongside existing one
2. Add feature flag to switch between old/new systems
3. Test new system thoroughly
4. Gradually migrate users to new system
5. Remove old complex system once proven

## Files to Modify/Remove

**Remove entirely:**
- `nova/search/enhancement/extractors.py`
- `nova/search/enhancement/classifier.py`
- Most of `nova/search/enhancement/enhancer.py`

**Simplify:**
- `nova/search/models.py` - Remove complex models
- `nova/tools/built_in/web_search.py` - Simplify parameters
- Configuration - Reduce options to just `enhanced`/`disabled`

**Create:**
- `nova/search/enhancement/simple_enhancer.py` - New minimal implementation

The current system is a classic case of over-engineering: the complexity far exceeds the value delivered. A much simpler approach would be more reliable, more maintainable, and ultimately more resilient.
52 changes: 47 additions & 5 deletions config/default.yaml
@@ -15,28 +15,28 @@ profiles:
max_tokens: 2000
temperature: 0.7
# api_key will be set via environment variables

gpt4:
name: "gpt4"
provider: "openai"
model_name: "gpt-4"
max_tokens: 4000
temperature: 0.7

claude:
name: "claude"
provider: "anthropic"
model_name: "claude-sonnet-4-20250514"
max_tokens: 4000
temperature: 0.7

claude-opus:
name: "claude-opus"
provider: "anthropic"
model_name: "claude-opus-4-20250514"
max_tokens: 4000
temperature: 0.7

llama:
name: "llama"
provider: "ollama"
@@ -48,6 +48,44 @@
# Active profile (defaults to "default" if not specified)
active_profile: "default"

# Enhanced Web Search Configuration
search:
enabled: true
default_provider: "duckduckgo"
max_results: 5
use_ai_answers: true

# Enhancement Configuration
default_enhancement: "fast" # auto, disabled, fast, semantic, hybrid
enable_conversation_context: true # Use chat history for context
context_messages_count: 5 # Number of recent messages to use
default_technical_level: "intermediate" # beginner, intermediate, expert
default_timeframe: "any" # recent, past_year, any

# Performance Settings
performance_mode: true # Prioritize speed over accuracy
enhancement_timeout: 30.0 # Query enhancement timeout (seconds)
request_timeout: 10.0 # HTTP request timeout (seconds)

# Keyword Extraction Configuration
extraction_backend: "yake_only" # yake_only, keybert_only, hybrid, adaptive
enable_keybert: false # Set to true for KeyBERT (requires optional deps)
yake_max_keywords: 10
keybert_max_keywords: 6
keybert_model: "all-MiniLM-L6-v2"

# Provider API configurations
google: {} # Add api_key and search_engine_id
bing: {} # Add api_key

# Tools configuration
tools:
enabled: true
enabled_built_in_modules: ["file_ops", "web_search", "conversation"]
permission_mode: "prompt"
execution_timeout: 30
max_concurrent_tools: 3

# Memory Management Features:
# - Conversation summarization for long chats (/summarize command)
# - Smart context optimization with token limit awareness
@@ -61,4 +99,8 @@ active_profile: "default"
# - OPENAI_API_KEY: OpenAI-specific API key
# - ANTHROPIC_API_KEY: Anthropic-specific API key
# - NOVA_PROFILE: Override active profile
# - OLLAMA_HOST: Ollama server URL override
# - OLLAMA_HOST: Ollama server URL override

# For Enhanced Search with KeyBERT (optional):
# Install semantic search dependencies: uv add keybert sentence-transformers
# Set search.enable_keybert: true and search.extraction_backend: "hybrid"