@@ -88,7 +90,7 @@ memesh
```bash
memesh-mcp
```
-MCP protocol (auto-configured)
+MCP tools + Claude Code hooks
|
@@ -96,6 +98,7 @@ MCP protocol (auto-configured)
**Any HTTP Client**
```bash
curl localhost:3737/v1/recall \
+ -H "Content-Type: application/json" \
-d '{"query":"auth"}'
```
`memesh serve` (REST API)
@@ -116,29 +119,30 @@ Paste tools into any API call
---
-## Why Not Just Use Mem0 / Zep?
+## Why Not OpenMemory, Cursor Memories, Mem0, Or Zep?
-| | **MeMesh** | Mem0 | Zep |
-|---|---|---|---|
-| **Install time** | 5 seconds | 30-60 minutes | 30+ minutes |
-| **Setup** | `npm i -g` — done | Neo4j + VectorDB + API keys | Neo4j + config |
-| **Storage** | Single SQLite file | Neo4j + Qdrant | Neo4j |
-| **Works offline** | Yes, always | No | No |
-| **Dashboard** | Built-in (7 tabs + analytics) | None | None |
-| **Dependencies** | 6 | 20+ | 10+ |
-| **Price** | Free forever | Free tier / Paid | Free tier / Paid |
+| | **MeMesh** | OpenMemory | Cursor Memories | Mem0 | Zep / Graphiti |
+|---|---|---|---|---|---|
+| **Best fit** | Local memory for coding agents | Local/cross-client MCP memory | Cursor-native project memory | Managed app/agent memory | Temporal knowledge graphs |
+| **Install shape** | `npm install -g @pcircle/memesh` | Local app/server flow | Built into Cursor | Cloud API / SDK / MCP | Service/framework setup |
+| **Storage** | One local SQLite file | Local memory stack | Cursor-managed rules/memories | Hosted or self-hosted stack | Graph database |
+| **Cloud required** | No | No for local mode | Depends on Cursor account/settings | Yes for platform | Cloud, or self-hosted |
+| **Claude Code hooks** | First-class | MCP tools | No | MCP tools | Not Claude Code-specific |
+| **Dashboard** | Built in | Built in | Cursor settings | Platform dashboard | Platform/graph tooling |
+| **Tradeoff** | Simple local wedge, not enterprise scale | Broader local app footprint | Locked to Cursor | Strong managed platform, less local-first | Strong graph model, heavier setup |
-**MeMesh trades:** enterprise-scale multi-tenant features for **instant setup, zero infrastructure, and 100% privacy**.
+**MeMesh trades enterprise-scale managed infrastructure for instant local setup, inspectable storage, and coding-agent workflow hooks.**
---
-## What Happens Automatically
+## What Happens Automatically In Claude Code
-You don't need to manually remember everything. MeMesh has **4 hooks** that capture knowledge without you doing anything:
+You don't need to manually remember everything. MeMesh has **5 hooks** that capture and inject knowledge while you work:
| When | What MeMesh does |
|------|------------------|
| **Every session start** | Loads your most relevant memories + proactive warnings from past lessons |
+| **Before editing files** | Recalls memories tied to the file or project before Claude writes code |
| **After every `git commit`** | Records what you changed, with diff stats |
| **When Claude stops** | Captures files edited, errors fixed, and auto-generates structured lessons from failures |
| **Before context compaction** | Saves knowledge before it's lost to context limits |
@@ -167,7 +171,7 @@ You don't need to manually remember everything. MeMesh has **4 hooks** that capt
**🧠 Smart Search** — Search "login security" and find memories about "OAuth PKCE". MeMesh expands queries with related terms using your configured LLM.
-**📊 Scored Ranking** — Results ranked by relevance (35%) + how recently you used it (25%) + how often (20%) + confidence (15%) + whether the info is still current (5%).
+**📊 Scored Ranking** — Results ranked by relevance (30%) + recency (25%) + frequency (15%) + confidence (15%) + recall impact (10%) + temporal validity (5%).
**🔄 Knowledge Evolution** — Decisions change. `forget` archives old memories (never deletes). `supersedes` relations link old → new. Your AI always sees the latest version.
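As a rough sketch, the blended score above is a weighted sum of normalized factors. The factor names and 0–1 normalization below are illustrative assumptions, not MeMesh's actual implementation:

```javascript
// Illustrative weighted blend of the documented ranking factors.
// Field names and 0..1 normalization are assumptions for this sketch.
const WEIGHTS = {
  relevance: 0.30,
  recency: 0.25,
  frequency: 0.15,
  confidence: 0.15,
  recallImpact: 0.10,
  temporalValidity: 0.05,
};

function blendedScore(factors) {
  // Missing factors default to 0 so partial data still ranks.
  return Object.entries(WEIGHTS).reduce(
    (sum, [name, weight]) => sum + weight * (factors[name] ?? 0),
    0
  );
}
```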
@@ -177,7 +181,7 @@ You don't need to manually remember everything. MeMesh has **4 hooks** that capt
---
-## Real-World Usage
+## Example Usage
> "MeMesh remembered that we chose PKCE over implicit flow three weeks ago. When I asked Claude about auth again, it already knew — no re-explaining needed."
> — **Solo developer, building a SaaS**
@@ -192,7 +196,7 @@ You don't need to manually remember everything. MeMesh has **4 hooks** that capt
## Unlock Smart Mode (Optional)
-MeMesh works fully offline out of the box. Add an LLM API key to unlock smarter search:
+MeMesh works offline by default. Add an LLM API key only if you want query expansion, smarter extraction, and compression:
```bash
memesh config set llm.provider anthropic
@@ -255,7 +259,7 @@ Core is framework-agnostic. Same logic runs from terminal, HTTP, or MCP.
```bash
git clone https://github.com/PCIRCLE-AI/memesh-llm-memory
cd memesh-llm-memory && npm install && npm run build
-npm test -- --run # 413 tests
+npm test # 445 tests
```
Dashboard: `cd dashboard && npm install && npm run dev`
diff --git a/docs/platforms/README.md b/docs/platforms/README.md
new file mode 100644
index 00000000..c57b3ecf
--- /dev/null
+++ b/docs/platforms/README.md
@@ -0,0 +1,156 @@
+# MeMesh Integration Guide
+
+MeMesh is designed for local coding-agent memory first, with portable integration through MCP, HTTP, and CLI modes. Choose the mode that matches your client.
+
+---
+
+## 🎯 Quick Platform Guide
+
+| Client | Best Mode | Setup | Guide |
+|--------|-----------|-------|-------|
+| **Claude Code / Claude Desktop** | MCP Server | Add `memesh-mcp` to your MCP config | See root [README](../../README.md) |
+| **MCP-compatible coding agents** | MCP Server | Point the client at `memesh-mcp` | See root [README](../../README.md) |
+| **Custom apps / scripts** | HTTP API | Run `memesh serve` and call `/v1/*` | [universal.md](./universal.md) |
+| **ChatGPT / Custom GPT experiments** | HTTP API | Use a local connector/proxy that can reach localhost | [chatgpt.md](./chatgpt.md) |
+| **Google Gemini experiments** | HTTP API | Use a local connector/proxy that can reach localhost | [gemini.md](./gemini.md) |
+
+---
+
+## 📊 Integration Modes Comparison
+
+### 🟢 HTTP API Mode (Universal)
+**Best for**: custom apps, scripts, local tools, and AI clients that can make HTTP requests to localhost
+
+**Pros**:
+- Works with any app or client that can reach the local server
+- No special client support needed
+- Easy to test manually with curl
+
+**Cons**:
+- Requires server to be running
+- Need to paste system prompt into AI settings
+
+**Setup**:
+```bash
+npm install -g @pcircle/memesh
+memesh serve --port 3737
+curl http://localhost:3737/v1/health
+```
+
+---
+
+### 🟡 MCP Server Mode (Native)
+**Best for**: Claude Code, Cursor (if MCP-enabled)
+
+**Pros**:
+- Native tool integration (cleanest UX)
+- Structured inputs/outputs
+- Auto-discovery of capabilities
+
+**Cons**:
+- Only works with MCP-compatible clients
+- Requires MCP config setup
+
+**Setup**:
+```bash
+npm install -g @pcircle/memesh
+memesh-mcp
+# Add this command to your MCP client's server config.
+```
+
+---
+
+### 🔴 CLI Mode (Advanced)
+**Best for**: Terminal-based workflows, scripting, CI/CD
+
+**Pros**:
+- Works without server
+- Can be scripted
+- Direct database access
+
+**Cons**:
+- Requires AI to invoke shell commands
+- Less interactive
+
+**Setup**:
+```bash
+npm install -g @pcircle/memesh
+memesh remember --name "test" --type note --obs "Hello"
+memesh recall "test"
+```
+
+---
+
+## 🚀 Quick Start (Any Platform)
+
+### 1. Install MeMesh
+```bash
+npm install -g @pcircle/memesh
+```
+
+### 2. Start the server
+```bash
+memesh serve
+# Server running at http://localhost:3737
+# Dashboard at http://localhost:3737/dashboard
+```
+
+### 3. Test the HTTP API
+```bash
+curl http://localhost:3737/v1/health
+curl -X POST http://localhost:3737/v1/recall \
+ -H 'Content-Type: application/json' \
+ -d '{"query":"test"}'
+```
+
+### 4. Connect your client
+Use MCP mode when the client supports MCP. Use HTTP mode when you control a local app, script, or connector that can call `localhost`.
+
+---
+
+## 📚 Platform-Specific Guides
+
+- **[ChatGPT / Custom GPTs](./chatgpt.md)** - HTTP API with custom instructions
+- **[Google Gemini](./gemini.md)** - HTTP API with system instructions
+- **[Universal Guide](./universal.md)** - For any other AI platform
+
+---
+
+## 🔍 How to Choose
+
+**Use MCP Mode if**:
+- Your platform explicitly supports MCP (Model Context Protocol)
+- You want the cleanest, most native experience
+- You're using Claude Code or Cursor
+
+**Use HTTP API Mode if**:
+- You're bridging ChatGPT, Gemini, Ollama, or another client through an app or connector
+- You want maximum compatibility
+- You're okay with copy-pasting system instructions
+
+**Use CLI Mode if**:
+- You're building scripts or automation
+- You need direct database access
+- You're integrating MeMesh into CI/CD
+
+---
+
+## 🛠️ Troubleshooting
+
+**"Connection refused" error**:
+- Make sure `memesh serve` is running
+- Check the port (default: 3737)
+- Try `curl http://localhost:3737/v1/health`
+
+**"No memories found"**:
+- Create a test memory: `memesh remember --name test --type note --obs "Hello"`
+- Check dashboard: http://localhost:3737/dashboard
+
+**MCP client not seeing tools**:
+- Verify the client is configured to run `memesh-mcp`
+- Run `memesh status` to confirm local database and capabilities
+- Check the client logs for MCP server startup errors
+
+---
+
+**Need help?** Open an issue: https://github.com/PCIRCLE-AI/memesh-llm-memory/issues
diff --git a/docs/platforms/chatgpt.md b/docs/platforms/chatgpt.md
new file mode 100644
index 00000000..178024fb
--- /dev/null
+++ b/docs/platforms/chatgpt.md
@@ -0,0 +1,95 @@
+# MeMesh With ChatGPT / Custom GPTs
+
+ChatGPT can use MeMesh only when you provide a connector, action, proxy, or local bridge that can call your MeMesh HTTP server. Custom instructions alone cannot make ChatGPT call `localhost`.
+
+## Supported Shape
+
+Use this guide when you control one of these:
+
+- A local app or script that calls both ChatGPT and MeMesh
+- A Custom GPT Action exposed through a reachable HTTPS endpoint
+- A private proxy/tunnel that forwards approved requests to local MeMesh
+
+If you only use the normal ChatGPT web UI with custom instructions, MeMesh can still be used manually through the CLI, but ChatGPT will not call the API by itself.
+
+## Start MeMesh
+
+```bash
+npm install -g @pcircle/memesh
+memesh serve
+```
+
+Default endpoints:
+
+```text
+API: http://localhost:3737/v1
+Dashboard: http://localhost:3737/dashboard
+```
+
+Verify:
+
+```bash
+curl http://localhost:3737/v1/health
+```
+
+## HTTP Operations
+
+Remember:
+
+```bash
+curl -X POST http://localhost:3737/v1/remember \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "database-decision",
+ "type": "decision",
+ "observations": ["Use PostgreSQL for ACID transactions"],
+ "tags": ["project:myapp", "topic:database"]
+ }'
+```
+
+Recall:
+
+```bash
+curl -X POST http://localhost:3737/v1/recall \
+ -H "Content-Type: application/json" \
+ -d '{"query":"database decisions","limit":5}'
+```
+
+Learn from a bug:
+
+```bash
+curl -X POST http://localhost:3737/v1/learn \
+ -H "Content-Type: application/json" \
+ -d '{
+ "error": "API timeout",
+ "fix": "Increased connection pool size",
+ "rootCause": "Pool exhaustion",
+ "severity": "major"
+ }'
+```
+
+## Connector Prompt
+
+Use this with your local bridge or Custom GPT Action:
+
+```markdown
+You have access to MeMesh persistent memory through an HTTP connector.
+
+Use memory for:
+- architecture decisions and rationale
+- bug fixes, root causes, and prevention notes
+- project conventions and coding patterns
+- user preferences that should persist
+
+Before answering project-specific questions, recall relevant memories.
+After important decisions, fixes, or lessons, store concise memories with tags such as project:, topic:, and tech:.
+Do not invent memory results. If the connector is unavailable, say so and continue without memory.
+```
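A minimal request-builder for such a bridge might look like this. This is a sketch only: the `/v1/recall` path and payload shape mirror the curl examples above, while the function name and return shape are assumptions for illustration:

```javascript
// Sketch: build the fetch arguments a local ChatGPT bridge would use
// to recall MeMesh memories before calling the model.
function buildRecallRequest(baseUrl, query, limit = 5) {
  return {
    url: `${baseUrl}/v1/recall`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, limit }),
    },
  };
}

// Usage (assumes `memesh serve` is running locally):
// const { url, options } = buildRecallRequest('http://localhost:3737', 'database decisions');
// const memories = await fetch(url, options).then((r) => r.json());
```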
+
+## Security Notes
+
+- Do not expose `memesh serve` directly to the public internet.
+- Put authentication in front of any HTTPS bridge or tunnel.
+- Keep MeMesh local unless you intentionally build a controlled proxy.
+
+See [Universal Integration Guide](./universal.md) and [API Reference](../api/API_REFERENCE.md).
diff --git a/docs/platforms/gemini.md b/docs/platforms/gemini.md
new file mode 100644
index 00000000..c600126c
--- /dev/null
+++ b/docs/platforms/gemini.md
@@ -0,0 +1,78 @@
+# MeMesh With Google Gemini
+
+Gemini can use MeMesh when your application code or local bridge calls the MeMesh HTTP API. Gemini web or AI Studio system instructions alone do not automatically call `localhost`.
+
+## Supported Shape
+
+Use this guide when you control one of these:
+
+- A Gemini API application
+- A local tool wrapper around Gemini
+- A private connector/proxy that can reach local MeMesh
+
+For direct Gemini web chat, use MeMesh manually through the CLI unless you have a connector.
+
+## Start MeMesh
+
+```bash
+npm install -g @pcircle/memesh
+memesh serve
+```
+
+Default endpoints:
+
+```text
+API: http://localhost:3737/v1
+Dashboard: http://localhost:3737/dashboard
+```
+
+Verify:
+
+```bash
+curl http://localhost:3737/v1/health
+```
+
+## HTTP Operations
+
+Remember:
+
+```bash
+curl -X POST http://localhost:3737/v1/remember \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "fastapi-backend-decision",
+ "type": "decision",
+ "observations": ["Use FastAPI for automatic OpenAPI docs"],
+ "tags": ["project:api", "tech:fastapi", "topic:backend"]
+ }'
+```
+
+Recall:
+
+```bash
+curl -X POST http://localhost:3737/v1/recall \
+ -H "Content-Type: application/json" \
+ -d '{"query":"backend framework","limit":5}'
+```
+
+## Gemini System Instruction
+
+Use this in a Gemini API app that has tools or application code wired to MeMesh:
+
+```markdown
+You have access to MeMesh persistent memory through application-provided tools.
+
+Recall relevant memories before making project-specific recommendations.
+Store durable decisions, bug lessons, architectural constraints, and coding patterns.
+Keep memories concise and tagged by project, topic, and technology.
+If the memory connector is unavailable, do not pretend memory was checked.
+```
+
+## Minimal App Flow
+
+1. User asks a project question.
+2. Your app calls `POST /v1/recall` with the user's query.
+3. Your app includes the returned memories in the Gemini request.
+4. After Gemini identifies a durable decision or lesson, your app calls `POST /v1/remember` or `POST /v1/learn`.
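Step 3 of the flow above can be sketched as follows. This is a hypothetical helper: the `{ name, observations }` result shape is an assumption — check the API reference for the actual response format:

```javascript
// Sketch: fold recalled MeMesh memories into the Gemini request text.
// The memory shape { name, observations } is assumed for illustration.
function promptWithMemories(question, memories) {
  if (!memories.length) return question;
  const context = memories
    .map((m) => `- ${m.name}: ${m.observations.join('; ')}`)
    .join('\n');
  return `Relevant project memories:\n${context}\n\nQuestion: ${question}`;
}
```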
+
+See [Universal Integration Guide](./universal.md) and [API Reference](../api/API_REFERENCE.md).
diff --git a/docs/platforms/universal.md b/docs/platforms/universal.md
new file mode 100644
index 00000000..500fa22b
--- /dev/null
+++ b/docs/platforms/universal.md
@@ -0,0 +1,133 @@
+# MeMesh Universal Integration Guide
+
+Use MeMesh with any local app, script, or AI client that can call HTTP endpoints or run CLI commands.
+
+## Install
+
+```bash
+npm install -g @pcircle/memesh
+```
+
+## Option A: HTTP API
+
+Start the server:
+
+```bash
+memesh serve
+```
+
+Default base URL:
+
+```text
+http://localhost:3737
+```
+
+Health check:
+
+```bash
+curl http://localhost:3737/v1/health
+```
+
+Remember:
+
+```bash
+curl -X POST http://localhost:3737/v1/remember \
+ -H "Content-Type: application/json" \
+ -d '{
+ "name": "project-decision-2026",
+ "type": "decision",
+ "observations": ["We chose PostgreSQL"],
+ "tags": ["project:myapp", "topic:database"]
+ }'
+```
+
+Recall:
+
+```bash
+curl -X POST http://localhost:3737/v1/recall \
+ -H "Content-Type: application/json" \
+ -d '{"query":"database decisions","limit":5}'
+```
+
+Learn from a fix:
+
+```bash
+curl -X POST http://localhost:3737/v1/learn \
+ -H "Content-Type: application/json" \
+ -d '{
+ "error": "API timeout",
+ "fix": "Increased connection pool",
+ "rootCause": "Pool exhaustion",
+ "severity": "major"
+ }'
+```
+
+Archive:
+
+```bash
+curl -X POST http://localhost:3737/v1/forget \
+ -H "Content-Type: application/json" \
+ -d '{"name":"outdated-decision"}'
+```
+
+## Option B: CLI
+
+Use this when your workflow can run shell commands:
+
+```bash
+memesh remember --name "test" --type note --obs "Hello World"
+memesh recall "test" --json
+memesh learn --error "CORS error" --fix "Added CORS middleware" --root-cause "Missing headers"
+```
+
+## Option C: MCP
+
+Use MCP mode for clients that support Model Context Protocol:
+
+```bash
+memesh-mcp
+```
+
+Add that command to your MCP client's server configuration.
+
+## Prompt Template For Custom Integrations
+
+```markdown
+You have access to MeMesh persistent memory through the host application.
+
+Use recall before project-specific work where prior decisions, bug fixes, or conventions may matter.
+Use remember for durable decisions, architecture choices, project conventions, and recurring patterns.
+Use learn for mistakes, root causes, fixes, and prevention notes.
+Keep memories concise. Include tags such as project:, topic:, and tech:.
+Do not claim memory was checked or written unless the tool/API call succeeded.
+```
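When wiring this template into an app, it can help to normalize tags before calling `remember`. This is a hypothetical helper — MeMesh itself only sees the final tag strings; the `project:`/`topic:`/`tech:` prefixes follow the template above:

```javascript
// Sketch: build consistently-prefixed tags for /v1/remember calls.
function buildTags({ project, topic, tech = [] }) {
  const tags = [];
  if (project) tags.push(`project:${project}`);
  if (topic) tags.push(`topic:${topic}`);
  for (const t of tech) tags.push(`tech:${t}`);
  return tags;
}
```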
+
+## Operational Notes
+
+- Default database: `~/.memesh/knowledge-graph.db`
+- Default HTTP host: `127.0.0.1`
+- Default HTTP port: `3737`
+- Dashboard: `http://localhost:3737/dashboard`
+- API docs: [API_REFERENCE.md](../api/API_REFERENCE.md)
+
+## Troubleshooting
+
+Connection refused:
+
+```bash
+memesh serve
+curl http://localhost:3737/v1/health
+```
+
+Port conflict:
+
+```bash
+MEMESH_HTTP_PORT=8080 memesh serve
+```
+
+No memories found:
+
+```bash
+memesh remember --name test --type note --obs "Hello"
+memesh recall "test"
+```
diff --git a/scripts/hooks/pre-edit-recall.js b/scripts/hooks/pre-edit-recall.js
index e501563e..5975c768 100644
--- a/scripts/hooks/pre-edit-recall.js
+++ b/scripts/hooks/pre-edit-recall.js
@@ -6,12 +6,14 @@
import { createRequire } from 'module';
import { homedir } from 'os';
-import { join, basename } from 'path';
+import { dirname, join, basename } from 'path';
import { existsSync, readFileSync, writeFileSync, mkdirSync } from 'fs';
const require = createRequire(import.meta.url);
-const THROTTLE_FILE = join(homedir(), '.memesh', 'session-recalled-files.json');
+const dbPath = process.env.MEMESH_DB_PATH || join(homedir(), '.memesh', 'knowledge-graph.db');
+const memeshDir = process.env.MEMESH_DB_PATH ? dirname(process.env.MEMESH_DB_PATH) : join(homedir(), '.memesh');
+const THROTTLE_FILE = join(memeshDir, 'session-recalled-files.json');
const MAX_RESULTS = 3;
let input = '';
@@ -47,8 +49,6 @@ process.stdin.on('end', () => {
return pass();
}
- // Find database
- const dbPath = process.env.MEMESH_DB_PATH || join(homedir(), '.memesh', 'knowledge-graph.db');
if (!existsSync(dbPath)) return pass();
const Database = require('better-sqlite3');
@@ -163,7 +163,6 @@ function recordSeen(seenFiles, fileKey) {
seenFiles.push(fileKey);
// Cap at 100 to prevent unbounded growth
if (seenFiles.length > 100) seenFiles = seenFiles.slice(-50);
- const memeshDir = join(homedir(), '.memesh');
if (!existsSync(memeshDir)) mkdirSync(memeshDir, { recursive: true });
writeFileSync(THROTTLE_FILE, JSON.stringify(seenFiles), 'utf8');
} catch {
diff --git a/scripts/hooks/session-start.js b/scripts/hooks/session-start.js
index e513874e..7100fe66 100755
--- a/scripts/hooks/session-start.js
+++ b/scripts/hooks/session-start.js
@@ -2,13 +2,16 @@
import { createRequire } from 'module';
import { homedir } from 'os';
-import { join, basename } from 'path';
+import { dirname, join, basename } from 'path';
import { existsSync, writeFileSync, mkdirSync, unlinkSync } from 'fs';
-import { dirname } from 'path';
import { fileURLToPath } from 'url';
const require = createRequire(import.meta.url);
+const dbPath = process.env.MEMESH_DB_PATH || join(homedir(), '.memesh', 'knowledge-graph.db');
+const memeshDir = process.env.MEMESH_DB_PATH ? dirname(process.env.MEMESH_DB_PATH) : join(homedir(), '.memesh');
+const throttlePath = join(memeshDir, 'session-recalled-files.json');
+
let input = '';
process.stdin.setEncoding('utf8');
process.stdin.on('data', (chunk) => { input += chunk; });
@@ -19,7 +22,6 @@ process.stdin.on('end', async () => {
// Clear pre-edit recall throttle from previous session
try {
- const throttlePath = join(homedir(), '.memesh', 'session-recalled-files.json');
if (existsSync(throttlePath)) {
unlinkSync(throttlePath);
}
@@ -27,8 +29,6 @@ process.stdin.on('end', async () => {
// Non-critical
}
- // Find database
- const dbPath = process.env.MEMESH_DB_PATH || join(homedir(), '.memesh', 'knowledge-graph.db');
if (!existsSync(dbPath)) {
output('MeMesh: No database found. Memories will be created as you work.');
return;
@@ -188,7 +188,6 @@ process.stdin.on('end', async () => {
});
if (allInjected.length > 0) {
- const memeshDir = join(homedir(), '.memesh');
const sessionsDir = join(memeshDir, 'sessions');
if (!existsSync(sessionsDir)) mkdirSync(sessionsDir, { recursive: true });
diff --git a/src/core/embedder.ts b/src/core/embedder.ts
index 3068ffaf..11a4185a 100644
--- a/src/core/embedder.ts
+++ b/src/core/embedder.ts
@@ -15,6 +15,8 @@ let onnxPipelineInstance: any = null;
let onnxPipelineLoading: Promise<any> | null = null;
let onnxAvailableChecked = false;
let onnxAvailableResult = false;
+const MAX_VECTOR_DISTANCE = 1;
+const pendingEmbeddingWrites = new Set<Promise<void>>();
// --- Public API ---
@@ -42,6 +44,37 @@ export function resetEmbeddingState(): void {
onnxPipelineLoading = null;
}
+export function scheduleEmbedAndStore(entityId: number, text: string): void {
+ const pending = embedAndStore(entityId, text);
+ const tracked = pending.finally(() => {
+ pendingEmbeddingWrites.delete(tracked);
+ });
+ pendingEmbeddingWrites.add(tracked);
+}
+
+export async function flushPendingEmbeddings(): Promise<void> {
+ while (pendingEmbeddingWrites.size > 0) {
+ await Promise.allSettled([...pendingEmbeddingWrites]);
+ }
+}
+
+function toVectorRowId(entityId: number): bigint {
+ if (!Number.isSafeInteger(entityId) || entityId <= 0) {
+ throw new Error(`Invalid entity id for vector storage: ${entityId}`);
+ }
+ return BigInt(entityId);
+}
+
+function toVectorBlob(embedding: Float32Array): Buffer {
+ return Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength);
+}
+
+function isDatabaseLifecycleError(err: unknown): boolean {
+ if (!err || typeof err !== 'object' || !('message' in err)) return false;
+ const message = String(err.message);
+ return message === 'Database not opened' || message.includes('database connection is not open');
+}
+
/**
* Generate an embedding for the given text.
* Tries providers in order: OpenAI API → Ollama → ONNX → null.
@@ -65,10 +98,10 @@ export async function embedText(text: string): Promise<Float32Array | null> {
* Validates dimension matches before writing to prevent silent failures.
*/
export async function embedAndStore(entityId: number, text: string): Promise<void> {
- const embedding = await embedText(text);
- if (!embedding) return;
-
try {
+ const embedding = await embedText(text);
+ if (!embedding) return;
+
const db = getDatabase();
// CRITICAL: Validate embedding dimension matches DB schema
@@ -91,10 +124,28 @@ export async function embedAndStore(entityId: number, text: string): Promise<void> {
+    const rowId = toVectorRowId(entityId);
+    const writeVector = db.transaction(() => {
+      // sqlite-vec does not reliably honor INSERT OR REPLACE for vec0 primary keys.
+      db.prepare('DELETE FROM entities_vec WHERE rowid = ?').run(rowId);
+      db.prepare('INSERT INTO entities_vec (rowid, embedding) VALUES (?, ?)').run(
+        rowId,
+        toVectorBlob(embedding)
+      );
+    });
+    writeVector();
} catch (err) {
+ if (isDatabaseLifecycleError(err)) return;
+
// DB write failed — log and skip
if (err && typeof err === 'object' && 'message' in err) {
process.stderr.write(`MeMesh: Vector write failed for entity ${entityId}: ${err.message}\n`);
@@ -111,14 +162,15 @@ export function vectorSearch(
): Array<{ id: number; distance: number }> {
try {
const db = getDatabase();
- return db
+ const rows = db
.prepare(
'SELECT rowid AS id, distance FROM entities_vec WHERE embedding MATCH ? ORDER BY distance LIMIT ?'
)
.all(
- Buffer.from(queryEmbedding.buffer),
+ toVectorBlob(queryEmbedding),
limit
) as Array<{ id: number; distance: number }>;
+ return rows.filter((hit) => hit.distance < MAX_VECTOR_DISTANCE);
} catch {
return [];
}
diff --git a/src/core/operations.ts b/src/core/operations.ts
index 3174ccfd..9e04dcf6 100644
--- a/src/core/operations.ts
+++ b/src/core/operations.ts
@@ -14,7 +14,7 @@ import { KnowledgeGraph } from '../knowledge-graph.js';
import { expandQuery, isExpansionAvailable } from './query-expander.js';
import { rankEntities } from './scoring.js';
import { createExplicitLesson } from './lesson-engine.js';
-import { embedAndStore, isEmbeddingAvailable, embedText, vectorSearch } from './embedder.js';
+import { embedAndStore, isEmbeddingAvailable, embedText, scheduleEmbedAndStore, vectorSearch } from './embedder.js';
import { autoTagAndApply } from './auto-tagger.js';
import { detectCapabilities } from './config.js';
import path from 'path';
@@ -29,6 +29,10 @@ import type {
Entity,
} from './types.js';
+function recallTagFilter(args: RecallInput): string | undefined {
+ return args.cross_project ? undefined : args.tag;
+}
+
/**
* Store knowledge as an entity with observations, tags, and relations.
* If entity exists, appends observations and dedupes tags.
@@ -75,7 +79,7 @@ export function remember(args: RememberInput): RememberResult {
// Fire-and-forget: generate embedding asynchronously (don't block sync remember)
if (isEmbeddingAvailable() && args.observations?.length) {
const text = `${args.name} ${args.observations.join(' ')}`;
- embedAndStore(entityId, text).catch(() => {});
+ scheduleEmbedAndStore(entityId, text);
}
// Fire-and-forget: auto-generate tags if none provided and LLM is configured
@@ -111,7 +115,7 @@ export function recall(args: RecallInput): Entity[] {
// cross_project=true means don't filter by project tag — pass no tag to search all projects
const entities = kg.search(args.query, {
- tag: args.cross_project ? undefined : args.tag,
+ tag: recallTagFilter(args),
limit: args.limit,
includeArchived: args.include_archived,
namespace: args.namespace,
@@ -149,7 +153,7 @@ export async function recallEnhanced(args: RecallInput): Promise<Entity[]> {
const termRelevance = i === 0 ? 1.0 : 0.7;
// cross_project=true means don't filter by project tag
const results = kg.search(expandedTerms[i], {
- tag: args.cross_project ? undefined : args.tag,
+ tag: recallTagFilter(args),
limit: args.limit,
includeArchived: args.include_archived,
namespace: args.namespace,
@@ -178,8 +182,9 @@ export async function recallEnhanced(args: RecallInput): Promise<Entity[]> {
const kg2 = new KnowledgeGraph(db);
const hitIds = vectorHits.map(h => h.id);
const hitEntities = kg2.getEntitiesByIds(hitIds, {
- includeArchived: args.include_archived,
+ includeArchived: args.include_archived === true,
namespace: args.namespace,
+ tag: recallTagFilter(args),
});
for (let i = 0; i < hitEntities.length; i++) {
const entity = hitEntities[i];
@@ -207,7 +212,7 @@ export async function recallEnhanced(args: RecallInput): Promise<Entity[]> {
// Level 0: regular FTS5 search (no LLM expansion)
// cross_project=true means don't filter by project tag
const entities = kg.search(args.query, {
- tag: args.cross_project ? undefined : args.tag,
+ tag: recallTagFilter(args),
limit: args.limit,
includeArchived: args.include_archived,
namespace: args.namespace,
@@ -230,8 +235,9 @@ export async function recallEnhanced(args: RecallInput): Promise<Entity[]> {
const kg2 = new KnowledgeGraph(db);
const hitIds = vectorHits.map(h => h.id);
const hitEntities = kg2.getEntitiesByIds(hitIds, {
- includeArchived: args.include_archived,
+ includeArchived: args.include_archived === true,
namespace: args.namespace,
+ tag: recallTagFilter(args),
});
for (let i = 0; i < hitEntities.length; i++) {
const entity = hitEntities[i];
diff --git a/src/knowledge-graph.ts b/src/knowledge-graph.ts
index aaebaf40..84d39843 100644
--- a/src/knowledge-graph.ts
+++ b/src/knowledge-graph.ts
@@ -158,7 +158,7 @@ export class KnowledgeGraph {
getEntitiesByIds(
ids: number[],
- opts?: { includeArchived?: boolean; namespace?: string }
+ opts?: { includeArchived?: boolean; namespace?: string; tag?: string }
): Entity[] {
if (ids.length === 0) return [];
@@ -243,6 +243,7 @@ export class KnowledgeGraph {
const observations = obsMap.get(id) ?? [];
const tags = tagMap.get(id) ?? [];
const relations = relMap.get(id) ?? [];
+ if (opts?.tag && !tags.includes(opts.tag)) continue;
results.push({
id: row.id,
@@ -535,7 +536,7 @@ export class KnowledgeGraph {
try {
this.db
.prepare('DELETE FROM entities_vec WHERE rowid = ?')
- .run(row.id);
+ .run(BigInt(row.id));
} catch {
// Vector entry may not exist if embeddings not enabled — ignore
}
diff --git a/src/transports/cli/cli.ts b/src/transports/cli/cli.ts
index f2fb7d1d..c0abafa1 100644
--- a/src/transports/cli/cli.ts
+++ b/src/transports/cli/cli.ts
@@ -8,6 +8,7 @@ import { openDatabase, closeDatabase, getDatabase } from '../../db.js';
import { remember, recallEnhanced, forget, consolidate, exportMemories, importMemories, learn, reindex } from '../../core/operations.js';
import { KnowledgeGraph } from '../../knowledge-graph.js';
import { readConfig, updateConfig, maskApiKey, detectCapabilities } from '../../core/config.js';
+import { flushPendingEmbeddings } from '../../core/embedder.js';
const packageJsonPath = path.resolve(
path.dirname(fileURLToPath(import.meta.url)),
@@ -31,7 +32,7 @@ program
.option('--tags ', 'Tags (space-separated)')
.option('--namespace ', 'Namespace: personal, team, or global (default: personal)')
.option('--json', 'Output as JSON')
- .action((opts) => {
+ .action(async (opts) => {
openDatabase();
try {
const result = remember({
@@ -46,6 +47,7 @@ program
} else {
console.log(`✅ Stored "${result.name}" (${result.observations} observations, ${result.tags} tags)`);
}
+ await flushPendingEmbeddings();
} finally {
closeDatabase();
}
diff --git a/tests/core/embedder.test.ts b/tests/core/embedder.test.ts
index c155f552..f8d4fb89 100644
--- a/tests/core/embedder.test.ts
+++ b/tests/core/embedder.test.ts
@@ -1,15 +1,40 @@
-import { describe, it, expect, beforeEach } from 'vitest';
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import {
isEmbeddingAvailable,
resetEmbeddingState,
getEmbeddingDimension,
+ vectorSearch,
} from '../../src/core/embedder.js';
+import { closeDatabase, getDatabase, openDatabase } from '../../src/db.js';
+import fs from 'fs';
+import path from 'path';
+import os from 'os';
describe('Embedder', () => {
+ let testDir: string | undefined;
+
beforeEach(() => {
resetEmbeddingState();
});
+ afterEach(() => {
+ try { closeDatabase(); } catch {}
+ if (testDir) {
+ fs.rmSync(testDir, { recursive: true, force: true });
+ testDir = undefined;
+ }
+ });
+
+ function openTempDb() {
+ testDir = path.join(
+ os.tmpdir(),
+ `memesh-embedder-test-${Date.now()}-${Math.random().toString(36).slice(2)}`
+ );
+ fs.mkdirSync(testDir, { recursive: true });
+ openDatabase(path.join(testDir, 'test.db'));
+ return getDatabase();
+ }
+
it('isEmbeddingAvailable returns boolean', () => {
const result = isEmbeddingAvailable();
expect(typeof result).toBe('boolean');
@@ -47,4 +72,21 @@ describe('Embedder', () => {
const dim = getEmbeddingDimension();
expect([384, 768, 1536]).toContain(dim);
});
+
+ it('vectorSearch returns entity rowids stored in sqlite-vec', () => {
+ const db = openTempDb();
+ const dim = getEmbeddingDimension();
+ const embedding = new Float32Array(dim);
+ embedding.fill(0.01);
+ embedding[0] = 1;
+
+ db.prepare(
+ 'INSERT INTO entities_vec (rowid, embedding) VALUES (?, ?)'
+ ).run(123n, Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength));
+
+ const hits = vectorSearch(embedding, 1);
+ expect(hits).toHaveLength(1);
+ expect(hits[0].id).toBe(123);
+ expect(hits[0].distance).toBe(0);
+ });
});
diff --git a/tests/db.test.ts b/tests/db.test.ts
index 2e457914..77cfe6f9 100644
--- a/tests/db.test.ts
+++ b/tests/db.test.ts
@@ -3,6 +3,7 @@ import { openDatabase, closeDatabase, getDatabase } from '../src/db.js';
import fs from 'fs';
import path from 'path';
import os from 'os';
+import { getEmbeddingDimension } from '../src/core/config.js';
describe('Feature: Database Management', () => {
let testDir: string;
@@ -137,6 +138,45 @@ describe('Feature: Database Management', () => {
).all();
expect(tables).toHaveLength(1);
});
+
+ it('should accept explicit entity rowids for sqlite-vec storage', () => {
+ const db = openDatabase(testDbPath);
+ const embedding = new Float32Array(getEmbeddingDimension());
+ embedding.fill(0.01);
+ embedding[0] = 1;
+
+ expect(() => {
+ db.prepare(
+ 'INSERT INTO entities_vec (rowid, embedding) VALUES (?, ?)'
+ ).run(1n, Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength));
+ }).not.toThrow();
+ });
+
+ it('should support replacing an entity vector via delete then insert', () => {
+ const db = openDatabase(testDbPath);
+ const first = new Float32Array(getEmbeddingDimension());
+ first.fill(0.01);
+ first[0] = 1;
+ const second = new Float32Array(getEmbeddingDimension());
+ second.fill(0.02);
+ second[1] = 1;
+
+ const writeVector = (embedding: Float32Array) => {
+ db.prepare('DELETE FROM entities_vec WHERE rowid = ?').run(1n);
+ db.prepare('INSERT INTO entities_vec (rowid, embedding) VALUES (?, ?)').run(
+ 1n,
+ Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength)
+ );
+ };
+
+ expect(() => {
+ writeVector(first);
+ writeVector(second);
+ }).not.toThrow();
+
+ const count = db.prepare('SELECT count(*) AS count FROM entities_vec').get() as { count: number };
+ expect(count.count).toBe(1);
+ });
});
describe('Scenario: Scoring and temporal columns migration (v2.14 -> v2.15)', () => {
diff --git a/tests/hooks/pre-edit-recall.test.ts b/tests/hooks/pre-edit-recall.test.ts
index c58638da..c22d9757 100644
--- a/tests/hooks/pre-edit-recall.test.ts
+++ b/tests/hooks/pre-edit-recall.test.ts
@@ -28,7 +28,7 @@ describe('Feature: Pre-Edit Recall Hook', () => {
try {
return execFileSync('node', [hookPath], {
input: jsonInput,
- env: { ...process.env, MEMESH_DB_PATH: dbPath, HOME: testDir },
+ env: { ...process.env, MEMESH_DB_PATH: dbPath },
encoding: 'utf8',
timeout: 10000,
}).trim();
@@ -116,6 +116,21 @@ describe('Feature: Pre-Edit Recall Hook', () => {
expect(result2).toBe('');
});
+ it('should scope throttle state to MEMESH_DB_PATH directory', () => {
+ const db = createTestDb();
+ db.prepare('INSERT INTO entities (name, type) VALUES (?, ?)').run('auth-decision', 'decision');
+ const row = db.prepare('SELECT id FROM entities WHERE name = ?').get('auth-decision') as any;
+ db.prepare('INSERT INTO observations (entity_id, content) VALUES (?, ?)').run(row.id, 'Use OAuth 2.0');
+ db.prepare('INSERT INTO tags (entity_id, tag) VALUES (?, ?)').run(row.id, 'file:auth');
+ const projectName = path.basename(process.cwd());
+ db.prepare('INSERT INTO tags (entity_id, tag) VALUES (?, ?)').run(row.id, `project:${projectName}`);
+ db.close();
+
+ runHook({ tool_input: { file_path: '/src/auth.ts' } });
+
+ expect(fs.existsSync(path.join(testDir, 'session-recalled-files.json'))).toBe(true);
+ });
+
it('should return empty when no file_path in tool_input', () => {
createTestDb().close();
const result = runHook({ tool_input: { command: 'ls' } });
diff --git a/tests/hooks/session-start.test.ts b/tests/hooks/session-start.test.ts
index 394429ab..c21b88a7 100644
--- a/tests/hooks/session-start.test.ts
+++ b/tests/hooks/session-start.test.ts
@@ -202,6 +202,21 @@ describe('Feature: Session Start Hook', () => {
expect(typeof parsed.systemMessage).toBe('string');
});
+ it('Scenario: Clears pre-edit throttle state beside MEMESH_DB_PATH', () => {
+ const db = createTestDb();
+ db.prepare('INSERT INTO entities (name, type) VALUES (?, ?)').run('auth-decision', 'decision');
+ db.prepare('INSERT INTO observations (entity_id, content) VALUES (?, ?)').run(1, 'Use OAuth 2.0');
+ db.prepare('INSERT INTO tags (entity_id, tag) VALUES (?, ?)').run(1, 'project:anyproject');
+ db.close();
+
+ const throttlePath = path.join(testDir, 'session-recalled-files.json');
+ fs.writeFileSync(throttlePath, JSON.stringify(['/src/auth.ts']), 'utf8');
+
+ runHook({ cwd: '/tmp/anyproject' });
+
+ expect(fs.existsSync(throttlePath)).toBe(false);
+ });
+
it('Scenario: Scoring — top entities by score are listed first', () => {
const db = createScoringDb();
// Low-score entity (never accessed, low confidence)
diff --git a/tests/knowledge-graph.test.ts b/tests/knowledge-graph.test.ts
index 75c61ba8..573f70a1 100644
--- a/tests/knowledge-graph.test.ts
+++ b/tests/knowledge-graph.test.ts
@@ -2,6 +2,7 @@ import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { openDatabase, closeDatabase } from '../src/db.js';
import { KnowledgeGraph } from '../src/knowledge-graph.js';
import type { CreateEntityInput } from '../src/knowledge-graph.js';
+import { getEmbeddingDimension } from '../src/core/config.js';
import Database from 'better-sqlite3';
import fs from 'fs';
import path from 'path';
@@ -290,6 +291,24 @@ describe('Feature: Knowledge Graph', () => {
expect(results).toEqual([]);
});
+ it('should remove archived entity from vector index', () => {
+ const id = kg.createEntity('OldDesign', 'decision', {
+ observations: ['Use REST API'],
+ });
+ const embedding = new Float32Array(getEmbeddingDimension());
+ embedding.fill(0.01);
+ embedding[0] = 1;
+
+ db.prepare(
+ 'INSERT INTO entities_vec (rowid, embedding) VALUES (?, ?)'
+ ).run(BigInt(id), Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength));
+
+ kg.archiveEntity('OldDesign');
+
+ const count = db.prepare('SELECT count(*) AS count FROM entities_vec').get() as { count: number };
+ expect(count.count).toBe(0);
+ });
+
it('should return { archived: false } for non-existent entity', () => {
const result = kg.archiveEntity('Ghost');
expect(result).toEqual({ archived: false });