A self-hosted, API-compatible reimplementation of Supermemory — a memory layer for AI applications. Store documents, embed them automatically, and search by semantic similarity. Runs entirely in Docker, optionally behind Tailscale so nothing is exposed to the public internet.
Supermemory is a great product, but the backend is closed-source. The public repo only ships the frontend and client SDKs, and their official self-hosting option is enterprise-only (Cloudflare Workers).
This project reimplements the /v3 and /v4 API endpoints from scratch, reverse-engineered from the TypeScript SDK contract. Existing clients — including the official supermemory npm package — can point at your instance with no code changes.
| Component | Role |
|---|---|
| Hono | HTTP framework (Node.js) |
| Postgres 17 + pgvector | Document storage and vector search |
| Novita AI | Embedding generation (qwen/qwen3-embedding-8b, swappable) |
| Tailscale | Optional private networking (tailnet-only access) |
```
┌─────────────────────────────────────────────────────┐
│                   Docker Compose                    │
│                                                     │
│  ┌───────────┐    shared network    ┌────────────┐  │
│  │ tailscale │◄────────────────────►│    api     │  │
│  │ (optional)│      (port 8787)     │   :8787    │  │
│  └───────────┘                      └─────┬──────┘  │
│   100.x.x.x                               │         │
│  (tailnet only)                           │         │
│                                       ┌────▼──────┐ │
│                                       │ postgres  │ │
│                                       │ pgvector  │ │
│                                       └───────────┘ │
│                                       (internal)    │
└─────────────────────────────────────────────────────┘
```
With Tailscale enabled, the API container shares the Tailscale container's network stack (`network_mode: service:tailscale`). Port 8787 is reachable only via the Tailscale IP — not on localhost, not on LAN. Postgres is internal to the Docker network with no exposed ports.
Without Tailscale, you can expose port 8787 directly (see Running without Tailscale).
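The tailnet-only wiring described above reduces to a small amount of compose configuration. A sketch, illustrative only — service names, image tags, and extra Tailscale settings (such as `cap_add` or `/dev/net/tun`) may differ from the repo's actual `docker-compose.yml`:

```yaml
# Illustrative sketch of tailnet-only networking, not the repo's exact file.
services:
  tailscale:
    image: tailscale/tailscale:latest
    environment:
      - TS_AUTHKEY=${TS_AUTHKEY}
    volumes:
      - tailscale-state:/var/lib/tailscale

  supermemory-api:
    build: .
    network_mode: service:tailscale   # share tailscale's network namespace
    # note: no `ports:` mapping — 8787 appears only on the tailnet IP
```

Because `supermemory-api` has no network namespace of its own, whatever it listens on is reachable only through the Tailscale node's interfaces.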
- Docker and Docker Compose
- A Tailscale auth key (reusable recommended) — skip if not using Tailscale
- A Novita AI API key (free tier available), or any OpenAI-compatible embedding provider
```
git clone https://github.com/s11ngh/supermemory-selfhosted.git
cd supermemory-selfhosted
cp .env.example .env
```

Edit `.env` with your keys:

```
TS_AUTHKEY=tskey-auth-XXXXX    # Tailscale auth key
NOVITA_API_KEY=sk_XXXXX        # Novita AI API key
SUPERMEMORY_API_KEY=           # Optional: require Bearer token auth
```

Then start the stack:

```
docker compose up -d
```

First run pulls images and builds the API container (~1 min). Database migrations run automatically on every startup and are idempotent.
With Tailscale:
```
docker compose exec tailscale tailscale ip -4
# → 100.x.x.x
```

Your API is at `http://100.x.x.x:8787`.

Without Tailscale: `http://localhost:8787` (see Running without Tailscale).
Health check:

```
curl http://<YOUR_API_URL>:8787/health
# → {"status":"ok","version":"1.0.0"}
```

Add a document:

```
curl -X POST http://<API_URL>:8787/v3/documents \
  -H "Content-Type: application/json" \
  -d '{"content": "The project uses Postgres with pgvector for embeddings"}'
```

```json
{"id": "e8920426-...", "status": "processed", "message": "Document added successfully"}
```

Search:

```
curl -X POST http://<API_URL>:8787/v3/search \
  -H "Content-Type: application/json" \
  -d '{"q": "what database do we use?", "limit": 5}'
```

```json
{
  "results": [
    {
      "id": "e8920426-...",
      "content": "The project uses Postgres with pgvector for embeddings",
      "score": 0.757,
      "containerTag": "default",
      "createdAt": "2026-02-23T23:57:11.810Z"
    }
  ],
  "count": 1
}
```

Since this implements the same API contract, the official SDK works out of the box:
```typescript
import Supermemory from "supermemory";

const client = new Supermemory({
  apiKey: "your-SUPERMEMORY_API_KEY-if-set",
  baseURL: "http://<API_URL>:8787",
});

await client.add({ content: "Remember this." });
const results = await client.search.documents({ q: "what should I remember?" });
```

All endpoints match the supermemory SDK contract. If `SUPERMEMORY_API_KEY` is set, all `/v3/*` and `/v4/*` routes require `Authorization: Bearer <key>`. The `/health` endpoint is always open.
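If you skip the SDK and call the API directly, the auth contract amounts to one header. A minimal sketch (hypothetical helper, not part of the repo):

```typescript
// Build the headers the /v3 and /v4 routes expect. When no key is configured
// on the server, the Authorization header can simply be omitted.
function apiHeaders(apiKey?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`;
  return headers;
}

// Usage against a running instance (URL is a placeholder):
// await fetch("http://<API_URL>:8787/v3/search", {
//   method: "POST",
//   headers: apiHeaders(process.env.SUPERMEMORY_API_KEY),
//   body: JSON.stringify({ q: "what database do we use?", limit: 5 }),
// });
```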
| Method | Endpoint | Description |
|---|---|---|
POST |
/v3/documents |
Add a document (auto-embeds) |
POST |
/v3/documents/batch |
Batch add documents |
POST |
/v3/documents/list |
List documents (paginated) |
GET |
/v3/documents/:id |
Get a document by ID |
PATCH |
/v3/documents/:id |
Update content (re-embeds) or metadata |
DELETE |
/v3/documents/:id |
Delete a document |
DELETE |
/v3/documents/bulk |
Bulk delete by IDs |
POST |
/v3/documents/file |
Upload and embed a file |
GET |
/v3/documents/processing |
List documents still processing |
| Method | Endpoint | Description |
|---|---|---|
| POST | `/v3/search` | Semantic search (v3 response shape) |
| POST | `/v4/search` | Semantic search (v4 response shape) |
| Method | Endpoint | Description |
|---|---|---|
| DELETE | `/v4/memories` | Delete by IDs or container tag |
| PATCH | `/v4/memories` | Update content or metadata |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/v3/settings` | Get all settings |
| PATCH | `/v3/settings` | Merge new settings |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check (no auth) |
| POST | `/v4/profile` | Profile endpoint |
When you add a document, the API sends its text to an OpenAI-compatible embedding endpoint (Novita AI by default, using qwen/qwen3-embedding-8b). The model supports up to 4096 dimensions but we request 1536 via the dimensions parameter (Matryoshka representation) to balance quality and storage.
The resulting vector is stored alongside the document text in Postgres.
A search query is embedded the same way. Postgres uses the cosine distance operator (`<=>`) with an IVFFlat index to find the closest documents. Results are ranked by similarity score (0 to 1, higher = more relevant).
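The score in search results behaves like cosine similarity: pgvector's `<=>` operator returns cosine *distance*, and a similarity-style score can be recovered as `1 - distance`. A self-contained illustration of the underlying math (not code from this repo):

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// pgvector's `embedding <=> query` returns 1 minus this value (cosine distance).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Identical directions score 1 and orthogonal vectors score 0, which is why a result like `"score": 0.757` indicates a fairly close semantic match.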
Migrations run on every container start (idempotent `CREATE ... IF NOT EXISTS`):

- `documents` — `id` (UUID), `content`, `metadata` (JSONB), `embedding` (vector 1536), `container_tag`, `status`, timestamps
- `settings` — key-value JSONB store
- Indexes — IVFFlat on embeddings (cosine), B-tree on `container_tag` and `created_at`
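Expressed as DDL, that schema might look like the following sketch — illustrative only; the actual statements in `src/migrate.ts` may differ in details such as defaults and index parameters:

```typescript
// Hypothetical shape of the idempotent migration list; every statement is safe
// to re-run on startup. Column and index names follow the schema listed above.
const MIGRATIONS: string[] = [
  `CREATE EXTENSION IF NOT EXISTS vector`,
  `CREATE TABLE IF NOT EXISTS documents (
     id            UUID PRIMARY KEY DEFAULT gen_random_uuid(),
     content       TEXT NOT NULL,
     metadata      JSONB NOT NULL DEFAULT '{}',
     embedding     vector(1536),
     container_tag TEXT NOT NULL DEFAULT 'default',
     status        TEXT NOT NULL DEFAULT 'processing',
     created_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
     updated_at    TIMESTAMPTZ NOT NULL DEFAULT now()
   )`,
  `CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value JSONB NOT NULL)`,
  `CREATE INDEX IF NOT EXISTS documents_embedding_idx
     ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)`,
  `CREATE INDEX IF NOT EXISTS documents_container_tag_idx ON documents (container_tag)`,
  `CREATE INDEX IF NOT EXISTS documents_created_at_idx ON documents (created_at)`,
];
```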
If SUPERMEMORY_API_KEY is set in .env, all /v3/* and /v4/* endpoints require Authorization: Bearer <key>. The /health endpoint is always open. If the variable is empty, the API runs unauthenticated — fine when access is restricted to your Tailscale network.
Embedding logic lives in src/embeddings.ts. To use a different OpenAI-compatible provider (OpenAI, Together, Ollama, etc.), change three things:
```typescript
// src/embeddings.ts
import OpenAI from "openai";

const EMBEDDING_MODEL = "text-embedding-3-small"; // model name
const EMBEDDING_DIMENSIONS = 1536;                // must match DB column

let client: OpenAI | null = null;

function getClient(): OpenAI {
  if (!client) {
    client = new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,   // env var name
      baseURL: "https://api.openai.com/v1", // provider URL
    });
  }
  return client;
}
```

If you change the dimension count, you'll need to drop and recreate the `documents` table (or alter the `embedding` column), since pgvector dimensions are fixed per column.
Remove the `tailscale` service from `docker-compose.yml`, drop `network_mode: service:tailscale` from `supermemory-api`, and add a port mapping:
```yaml
supermemory-api:
  build: .
  depends_on:
    db:
      condition: service_healthy
  ports:
    - "8787:8787"
  environment:
    - DATABASE_URL=postgresql://supermemory:supermemory@db:5432/supermemory
    - NOVITA_API_KEY=${NOVITA_API_KEY}
    - SUPERMEMORY_API_KEY=${SUPERMEMORY_API_KEY:-}
    - PORT=8787
```

The API will be available at `http://localhost:8787`. Set `SUPERMEMORY_API_KEY` if exposing beyond localhost.
```
docker compose up -d                              # start
docker compose down                               # stop
docker compose logs -f                            # tail all logs
docker compose logs -f supermemory-api            # tail API logs only
docker compose up -d --build supermemory-api      # rebuild after code changes
docker compose exec tailscale tailscale status    # check tailnet peers
```

Data persists across restarts in Docker volumes:

- `pgdata` — Postgres data directory
- `tailscale-state` — Tailscale node identity
The `plugin/` directory contains an OpenClaw memory plugin that gives AI agents persistent memory backed by this API.
- Auto-recall — Before each agent turn, searches memory for context relevant to the user's message and injects it
- Auto-capture — After each turn, detects factual statements ("I prefer...", "we use...", "remember that...") and stores them
- Agent tools — Exposes `memory_recall` and `memory_store` as tools the agent can call directly
- CLI commands — `openclaw supermemory health|search|add` for manual interaction
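Auto-capture hinges on spotting fact-like phrasing. A toy sketch of that detection — purely illustrative; the plugin's real heuristics live in `plugin/index.ts` and may be more sophisticated:

```typescript
// Illustrative only: detect the trigger phrases quoted in the feature list.
// \b word boundaries keep substrings like "preferred" from matching "prefer".
const FACT_PATTERNS: RegExp[] = [
  /\bI prefer\b/i,
  /\bwe use\b/i,
  /\bremember that\b/i,
];

function looksLikeFact(message: string): boolean {
  return FACT_PATTERNS.some((pattern) => pattern.test(message));
}
```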
```
cp -r plugin/ ~/.openclaw/extensions/memory-supermemory/
```

Add to `~/.openclaw/openclaw.json`:
```json
{
  "plugins": {
    "memory-supermemory": {
      "apiUrl": "http://<YOUR_API_URL>:8787",
      "apiKey": "",
      "autoRecall": true,
      "autoCapture": true,
      "recallLimit": 3,
      "minScore": 0.55
    },
    "slots": {
      "memory": "memory-supermemory"
    }
  }
}
```

```
openclaw plugins enable memory-supermemory
openclaw supermemory health          # → {"status":"ok","version":"1.0.0"}
openclaw supermemory add "test"      # → {"id":"...","status":"processed",...}
openclaw supermemory search "test"   # → results with score
```

| Option | Type | Default | Description |
|---|---|---|---|
| `apiUrl` | string | required | Base URL of the supermemory API |
| `apiKey` | string | `""` | Bearer token (leave blank if API has no auth) |
| `autoRecall` | boolean | `true` | Search memory before each agent turn |
| `autoCapture` | boolean | `true` | Store detected facts after each turn |
| `recallLimit` | number | `3` | Max memories to retrieve per query |
| `minScore` | number | `0.55` | Minimum similarity score (0-1) to include |
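The `minScore` and `recallLimit` options compose in the obvious way: drop weak matches, then cap how many survive. A hypothetical sketch of that filtering (not the plugin's actual code):

```typescript
interface Memory {
  content: string;
  score: number; // cosine-similarity-style score, 0..1
}

// Keep results at or above minScore, highest-scoring first, at most recallLimit.
function selectRecalls(results: Memory[], minScore = 0.55, recallLimit = 3): Memory[] {
  return results
    .filter((m) => m.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, recallLimit);
}
```

Raising `minScore` trades recall for precision; keeping `recallLimit` small keeps the injected context short even on chatty topics.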
```
.
├── .env.example            # Template for secrets
├── docker-compose.yml      # Three services: tailscale, api, postgres
├── Dockerfile              # Multi-stage build (tsc → Node 22 slim)
├── package.json
├── tsconfig.json
├── plugin/                 # OpenClaw memory plugin
│   ├── openclaw.plugin.json
│   └── index.ts
└── src/
    ├── index.ts            # Hono server, routing, auth middleware
    ├── db.ts               # Postgres pool + pgvector type registration
    ├── migrate.ts          # Schema migrations (idempotent, runs on startup)
    ├── embeddings.ts       # Embedding client (OpenAI-compatible)
    └── routes/
        ├── documents.ts    # Document CRUD, batch, file upload
        ├── search.ts       # v3 + v4 semantic search
        ├── settings.ts     # Settings key-value store
        └── memories.ts     # Memory delete + update
```
MIT