Pain Point
When files are indexed via ctx_index(path), the index becomes stale silently if the file changes afterward. Users get outdated search results with no warning, leading to incorrect decisions based on old code/data.
Painful scenarios:
- Long coding sessions where source files change frequently
- Config files evolving during refactoring
- git pull/rebase changing many files at once between sessions
- Current 14-day pruning is too coarse — a file indexed 1 hour ago can be completely wrong after a rebase
Strategy 1: File Watcher (real-time, mid-session)
Opt-in fs.watch on indexed file paths.
- On ctx_index(path, source) → register path in watch list
- On file change (debounced 2-5s) → auto-reindex that source
- Config: "autoReindex": true | false (default off)
- Ignore patterns: node_modules, dist, .git
- Clean up watchers on session end
Pro: Zero-effort freshness — search always current
Con: Memory overhead from watchers (cap max watch count)
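A minimal sketch of how the watcher list above could behave, assuming a Node runtime. The names `createReindexWatcher` and `isIgnored`, the `reindex` callback, and the default cap of 200 watchers are hypothetical and not part of the existing ctx_* API. It shows the three pieces the bullets describe: ignore patterns, per-path debouncing, and cleanup on session end.

```typescript
import { watch } from "node:fs";

type Watcher = ReturnType<typeof watch>;

// Path segments that should never be watched (from the ignore list above).
const IGNORED_SEGMENTS = new Set(["node_modules", "dist", ".git"]);

/** True if any segment of the path matches an ignore pattern. */
function isIgnored(filePath: string): boolean {
  return filePath.split(/[\\/]/).some((seg) => IGNORED_SEGMENTS.has(seg));
}

/**
 * Hypothetical helper: watches indexed paths and calls `reindex` once per
 * burst of change events. `debounceMs` and `maxWatchers` are assumed knobs.
 */
function createReindexWatcher(
  reindex: (filePath: string) => void,
  debounceMs = 2000,
  maxWatchers = 200
) {
  const watchers = new Map<string, Watcher>();
  const timers = new Map<string, ReturnType<typeof setTimeout>>();

  // Restart the timer on every event so rapid saves collapse into one reindex.
  function schedule(filePath: string): void {
    const pending = timers.get(filePath);
    if (pending !== undefined) clearTimeout(pending);
    timers.set(
      filePath,
      setTimeout(() => {
        timers.delete(filePath);
        reindex(filePath);
      }, debounceMs)
    );
  }

  return {
    /** Returns false if the path is ignored, already watched, or over the cap. */
    add(filePath: string): boolean {
      if (isIgnored(filePath) || watchers.has(filePath)) return false;
      if (watchers.size >= maxWatchers) return false;
      watchers.set(filePath, watch(filePath, () => schedule(filePath)));
      return true;
    },
    /** Close every watcher and cancel pending reindexes (call on session end). */
    close(): void {
      watchers.forEach((w) => w.close());
      timers.forEach((t) => clearTimeout(t));
      watchers.clear();
      timers.clear();
    },
  };
}
```

Refusing new paths once the cap is reached keeps memory bounded. Those files would then rely on one of the other strategies for freshness.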
Strategy 2: Hash-based Stale Detection (lightweight, at search time)
SHA-256 hash check when returning search results.
- On ctx_index(path) → store file hash + path in sources table
- On ctx_search() → recompute hash for file-based results
- If hash differs → mark as ⚠️ stale or auto-refresh
ALTER TABLE sources ADD COLUMN content_hash TEXT;
ALTER TABLE sources ADD COLUMN file_path TEXT;
Example output:
⚠️ [stale] Title of chunk
Indexed: 2h ago | File modified: 5m ago
Pro: Near-zero overhead when unchanged (~1-5ms hash per file)
Con: Reactive — discover staleness only at search time
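The search-time check could look roughly like the sketch below, assuming Node's built-in crypto and fs modules. `sha256Hex`, `checkFreshness`, and the `Freshness` type are hypothetical names. `storedHash` stands for the value that would be read from the proposed content_hash column.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

type Freshness = "fresh" | "stale" | "missing";

/** Hex-encoded SHA-256 digest of the given content. */
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

/**
 * Compare the file's current hash against the hash stored at index time.
 * A file that can no longer be read is reported as "missing" rather than stale.
 */
function checkFreshness(filePath: string, storedHash: string): Freshness {
  let current: string;
  try {
    current = sha256Hex(readFileSync(filePath));
  } catch {
    return "missing";
  }
  return current === storedHash ? "fresh" : "stale";
}
```

A caller in ctx_search() could prefix results with the ⚠️ [stale] marker when this returns "stale", or trigger a reindex of that source instead.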
Strategy 3: Git-aware Batch Reindex (between sessions)
git diff-driven reindex on session start.
- Store last_reindex_commit SHA in content DB
- On session start → git diff --name-only <last>..HEAD
- Cross-reference with indexed source paths → batch reindex changed files only
- Handle deletes (remove from index) and renames (git diff --name-status)
- Non-git fallback: mtime comparison
ctx_reindex # reindex git-changed files
ctx_reindex --all # force reindex everything
ctx_reindex --dry-run # preview what would reindex
Pro: Smart — only reindexes what changed
Con: Git-only without fallback, doesn't help mid-session
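As a rough illustration of the cross-reference step, the sketch below turns `git diff --name-status <last>..HEAD` output into a reindex plan. `planReindex` and `ReindexPlan` are hypothetical names, and the set of indexed paths is assumed to come from the sources table. It also assumes tab-separated output with unquoted paths. Running git with `-z` and splitting on NUL would be sturdier for unusual filenames.

```typescript
interface ReindexPlan {
  reindex: string[]; // paths whose content should be (re)indexed
  remove: string[]; // paths that should be dropped from the index
}

/**
 * Parse `git diff --name-status` output and keep only entries that touch
 * already-indexed files. Status letters: M modified, A added, D deleted,
 * R<score> renamed (old path, then new path).
 */
function planReindex(nameStatus: string, indexed: Set<string>): ReindexPlan {
  const plan: ReindexPlan = { reindex: [], remove: [] };
  for (const line of nameStatus.split("\n")) {
    if (line.trim() === "") continue;
    const fields = line.split("\t");
    const code = fields[0].charAt(0);
    if (code === "D") {
      if (indexed.has(fields[1])) plan.remove.push(fields[1]);
    } else if (code === "R") {
      // A rename of an indexed file: drop the old path, index the new one.
      if (indexed.has(fields[1])) {
        plan.remove.push(fields[1]);
        plan.reindex.push(fields[2]);
      }
    } else {
      // M, T, A, C: the affected path is the last field.
      const path = fields[fields.length - 1];
      if (indexed.has(path)) plan.reindex.push(path);
    }
  }
  return plan;
}
```

For example, with `src/a.ts`, `src/b.ts`, and `src/c.ts` indexed, the input `"M\tsrc/a.ts\nD\tsrc/b.ts\nR100\tsrc/c.ts\tsrc/d.ts"` yields a plan that reindexes `src/a.ts` and `src/d.ts` and removes `src/b.ts` and `src/c.ts`. A `--dry-run` mode could print this plan without applying it.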
Synergy
| Strategy | When it helps | Cost |
| --- | --- | --- |
| File watcher | Real-time, mid-session | Medium (watchers) |
| Hash check | On-demand at search time | Low (hash compare) |
| Git reindex | Between sessions, after pull/rebase | Low (one-time batch) |
All three together cover staleness mid-session, at search time, and between sessions. Each can be implemented independently.
Suggested Implementation Order
1. Hash check (smallest change, biggest trust improvement)
2. Git reindex (covers between-session gap)
3. File watcher (full real-time, opt-in)