Skip to content

orneryd/NornicDB

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

545 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NornicDB Logo

NornicDB

The Graph Database That Learns
Neo4j-compatible • GPU-accelerated • Memory that evolves

Version 1.0.0 Coveralls Report Docker Neo4j Compatible Qdrant Compatible Compatible Go Version Go Report Card License

Quick StartWhy SwitchProblemWhy DifferentBenchmarksFeaturesDockerDocsContributors


What Problem Does This Solve?

NornicDB is a high-performance graph database designed for AI agents and knowledge systems. It speaks Neo4j's language (Bolt protocol + Cypher) so you can switch with zero code changes, while adding intelligent features that traditional databases lack.

NornicDB automatically discovers and manages relationships in your data, weaving connections that let meaning emerge from your knowledge graph.

Why NornicDB Is Different

  • Neo4j-compatible by default: Bolt + Cypher support for existing drivers and applications.
  • Built for AI-native workloads: vector search, memory decay, and auto-relationships are first-class features.
  • Hardware-accelerated execution: Metal/CUDA/Vulkan pathways for high-throughput graph + semantic workloads.
  • Operational flexibility: full images (models included), BYOM images, and headless API-only deployments.
  • Canonical graph ledger support: versioned facts, temporal validity, as-of reads, queryable txlog, and receipts for audit-oriented systems.

What Recent Deep-Dives Show

  • Hybrid execution model (streaming fast paths + general engine): NornicDB uses shape-specialized streaming executors for common traversal/aggregation patterns while retaining a general Cypher path for coverage and correctness.
  • Runtime parser mode switching: the default nornic mode minimizes hot-path overhead, while antlr mode prioritizes strict parsing and diagnostics when debugging/query validation matters.
  • Measured parser-path deltas on benchmark suites: internal Northwind comparisons show large overhead differences on certain query shapes when full parse-tree paths are used, which is why default production mode is optimized for lower overhead.
  • HNSW build acceleration from insertion-order optimization: BM25-seeded insertion order reduced a 1M embedding build from ~27 minutes to ~10 minutes (~2.7x) in published tests by reducing traversal waste during construction, without changing core quality knobs.
  • Shared seed strategy across indexing stages: the same lexical seed extraction supports HNSW insertion ordering and improves k-means centroid initialization spread for vector pipeline efficiency.

Read more:

Performance Snapshot

LDBC Social Network Benchmark (M3 Max, 64GB):

Query Type NornicDB Neo4j Speedup
Message content lookup 6,389 ops/sec 518 ops/sec 12x
Recent messages (friends) 2,769 ops/sec 108 ops/sec 25x
Avg friends per city 4,713 ops/sec 91 ops/sec 52x
Tag co-occurrence 2,076 ops/sec 65 ops/sec 32x

See full benchmark results for complete methodology and additional workloads.

Quick Start

Docker (Recommended)

# Apple Silicon (includes bge-m3 embedding model)
docker run -d --name nornicdb \
  -p 7474:7474 -p 7687:7687 \
  -v nornicdb-data:/data \
  timothyswt/nornicdb-arm64-metal-bge:latest  # Apple Silicon
  # timothyswt/nornicdb-amd64-cuda-bge:latest  # NVIDIA GPU

Open http://localhost:7474 for the admin UI.

Need a different image/profile (Heimdall, BYOM, CPU-only, Vulkan, headless)?

From Source

git clone https://github.com/orneryd/NornicDB.git
cd NornicDB
go build -o nornicdb ./cmd/nornicdb
./nornicdb serve

Connect

Use any Neo4j driver — Python, JavaScript, Go, Java, .NET:

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687")
with driver.session() as session:
    session.run("CREATE (n:Memory {content: 'Hello NornicDB'})")

Why Switch from Neo4j?

  • 12x-52x faster on published LDBC workloads (same hardware comparisons).
  • Native graph + vector in one engine (no separate vector sidecar required).
  • GPU acceleration paths (Metal/CUDA/Vulkan) for semantic + graph workloads.
  • Drop-in compatibility via Bolt + Cypher for existing applications.
  • Canonical graph ledger model for temporal validity, as-of reads, and audit-oriented mutation tracking.

Why Switch from Qdrant?

  • Graph + vector in one engine: combine semantic retrieval with native graph traversal and Cypher queries.
  • Qdrant gRPC compatibility preserved: keep Qdrant-style gRPC workflows while adding graph-native capabilities.
  • Hybrid retrieval built in: vector + BM25 fusion and optional reranking in the same query pipeline.
  • Canonical truth modeling: versioned facts, temporal validity windows, and as-of reads for governance-heavy use cases.
  • Protocol flexibility: use REST, GraphQL, Bolt/Cypher, Qdrant-compatible gRPC, and additive Nornic gRPC on one platform.

Build It Yourself

Detailed local build, cross-compile, and packaging instructions:

Features

🔌 Neo4j Compatible

Drop-in replacement for Neo4j. Your existing code works unchanged.

  • Bolt Protocol — Use official Neo4j drivers
  • Cypher Queries — Full query language support
  • Schema Management — Constraints, indexes, vector indexes
  • Qdrant gRPC API Compatible — Works with Qdrant-style gRPC vector workflows

🧠 Intelligent Memory

Memory that behaves like human cognition.

Memory Tier Half-Life Use Case
Episodic 7 days Chat context, sessions
Semantic 69 days Facts, decisions
Procedural 693 days Skills, patterns
// Find memories that are still strong
MATCH (m:Memory) WHERE m.decayScore > 0.5
RETURN m.title ORDER BY m.decayScore DESC

🔗 Auto-Relationships

NornicDB weaves connections automatically:

  • Embedding Similarity — Related concepts link together
  • Co-access Patterns — Frequently queried pairs connect
  • Temporal Proximity — Same-session nodes associate
  • Transitive Inference — A→B + B→C suggests A→C

🎯 Vector Search

Native semantic search with GPU acceleration and hybrid retrieval support.

📖 Deep dive: Vector Search Guide and Qdrant gRPC Endpoint.

Cypher (Neo4j-compatible):

CALL db.index.vector.queryNodes('embeddings', 10, 'machine learning guide')
YIELD node, score
RETURN node.content, score

Hybrid search (REST):

curl -X POST http://localhost:7474/nornicdb/search \
  -H "Content-Type: application/json" \
  -d '{"query": "machine learning", "limit": 10}'

More API entry points:

  • GraphQL hybrid search: POST /graphql with search(query, options)
  • gRPC (Qdrant-compatible): Points.Search / Points.Query(Document.text)
  • Nornic native gRPC: NornicSearch/SearchText (additive client)
  • See docs/user-guides/nornic-search-grpc.md for additive proto setup without forking Qdrant drivers.

🤖 Heimdall AI Assistant

Built-in AI that understands your database.

# Enable Heimdall
NORNICDB_HEIMDALL_ENABLED=true ./nornicdb serve

Natural Language Queries:

  • "Get the database status"
  • "Show me system metrics"
  • "Run health check"

Plugin System:

  • Create custom actions the AI can execute
  • Lifecycle hooks (PrePrompt, PreExecute, PostExecute)
  • Database event monitoring for autonomous actions
  • Inline notifications with proper ordering

See Heimdall AI Assistant Guide and Plugin Development.

🧩 APOC Functions

950+ built-in functions for text, math, collections, and more. Plus a plugin system for custom extensions.

// Text processing
RETURN apoc.text.camelCase('hello world')  // "helloWorld"
RETURN apoc.text.slugify('Hello World!')   // "hello-world"

// Machine learning
RETURN apoc.ml.sigmoid(0)                  // 0.5
RETURN apoc.ml.cosineSimilarity([1,0], [0,1])  // 0.0

// Collections
RETURN apoc.coll.sum([1, 2, 3, 4, 5])      // 15

Drop custom .so plugins into /app/plugins/ for automatic loading. See the APOC Plugin Guide.

Docker Images

All images available at Docker Hub.

ARM64 (Apple Silicon)

Image Size Description
timothyswt/nornicdb-arm64-metal-bge-heimdall 1.1 GB Full - Embeddings + AI Assistant
timothyswt/nornicdb-arm64-metal-bge 586 MB Standard - With BGE-M3 embeddings
timothyswt/nornicdb-arm64-metal 148 MB Minimal - Core database, BYOM
timothyswt/nornicdb-arm64-metal-headless 148 MB Headless - API only, no UI

AMD64 (Linux/Intel)

Image Size Description
timothyswt/nornicdb-amd64-cuda-bge ~4.5 GB GPU + Embeddings - CUDA + BGE-M3
timothyswt/nornicdb-amd64-cuda ~3 GB GPU - CUDA acceleration, BYOM
timothyswt/nornicdb-amd64-cuda-headless ~2.9 GB GPU Headless - API only
timothyswt/nornicdb-amd64-cpu ~500 MB CPU - No GPU required
timothyswt/nornicdb-amd64-cpu-headless ~500 MB CPU Headless - API only

BYOM = Bring Your Own Model (mount at /app/models)

# With your own model
docker run -d -p 7474:7474 -p 7687:7687 \
  -v /path/to/models:/app/models \
  timothyswt/nornicdb-arm64-metal:latest

# Headless mode (API only, no web UI)
docker run -d -p 7474:7474 -p 7687:7687 \
  -v nornicdb-data:/data \
  timothyswt/nornicdb-arm64-metal-headless:latest

Headless Mode

For embedded deployments, microservices, or API-only use cases, NornicDB supports headless mode which disables the web UI for a smaller binary and reduced attack surface.

Runtime flag:

nornicdb serve --headless

Environment variable:

NORNICDB_HEADLESS=true nornicdb serve

Build without UI (smaller binary):

# Native build
make build-headless

# Docker build
docker build --build-arg HEADLESS=true -f docker/Dockerfile.arm64-metal .

Configuration

# nornicdb.yaml
server:
  bolt_port: 7687
  http_port: 7474
  host: localhost

database:
  data_dir: ./data
  async_writes_enabled: true
  async_flush_interval: 50ms
  async_max_node_cache_size: 50000
  async_max_edge_cache_size: 100000

embedding:
  enabled: true
  provider: local # or ollama, openai
  model: bge-m3.gguf
  url: ""
  dimensions: 1024

embedding_worker:
  chunk_size: 8192
  chunk_overlap: 50

memory:
  decay_enabled: true
  decay_interval: 1h
  auto_links_enabled: true
  auto_links_similarity_threshold: 0.82

Use Cases

  • AI Agent Memory — Persistent, queryable memory for LLM agents
  • Knowledge Graphs — Auto-organizing knowledge bases
  • RAG Systems — Vector + graph retrieval in one database
  • Graph-RAG for LLM Inference — Simplify retrieval pipelines by combining graph traversal, hybrid search, and provenance in one engine
  • Session Context — Decaying conversation history
  • Research Tools — Connect papers, notes, and insights
  • Canonical Truth Stores — Versioned facts, temporal validity, and append-only mutation history in a graph model
  • Financial Systems — Loan/risk state reconstruction with as-of reads and audit receipts
  • Compliance & RegTech — KYC/AML state changes, policy/rule versioning, and non-overlapping validity enforcement
  • Audit Platforms — Correlate graph mutations to WAL sequence ranges and receipt hashes
  • AI Governance & Lineage — Track model assertions, overrides, and fact provenance over time

Documentation

Guide Description
Getting Started Installation & quick start
Docker Image Quick Reference Full runtime image matrix
API Reference Cypher functions & procedures
User Guides Complete examples & patterns
Performance Benchmarks vs Neo4j
Neo4j Migration Compatibility & feature parity
Architecture System design & internals
Docker Guide Build & deployment
Development Contributing & development

Comparison

Platform Category Query Language Support (and protocol) Native Vector Search Canonical Graph + Temporal Ledger Pattern Queryable Mutation Log + Receipts Embedded/Self-Hosted Focus
NornicDB Graph + Vector + Canonical Ledger Cypher via Bolt; also HTTP/GraphQL and gRPC (Qdrant-compatible + NornicSearch) Yes Yes Yes Yes
Neo4j Graph DB Cypher via Bolt/HTTP Yes Partial (manual modeling) Partial (logs exist, not first-class receipts model) Server-first
Memgraph Graph DB openCypher via Bolt/HTTP Partial/varies by setup Partial (manual) Partial (manual/integration) Server-first
TigerGraph Graph analytics DB GSQL via REST++/native endpoints Partial/extension-driven Partial (manual) Partial (manual/integration) Server-first
Qdrant Vector DB Qdrant query/filter API via gRPC/REST Yes No (not graph-native) No Server-first
Weaviate Vector DB GraphQL + REST APIs Yes Partial (knowledge graph features, not Cypher property graph) No Server-first
Amazon QLDB Ledger DB PartiQL via AWS API/SDK No Partial (ledger + temporal history, not graph-native) Yes (ledger-native) Managed service

Snapshot is capability-oriented and high-level; exact behavior depends on edition/configuration and workload design.

Building

Native Binary

# Basic build
make build

# Headless (no UI)
make build-headless

# With local LLM support
make build-localllm

Docker Images

# Download models for Heimdall builds (automatic if missing)
make download-models        # BGE-M3 + qwen3-0.6b (~750MB)
make check-models          # Verify models present

# ARM64 (Apple Silicon)
make build-arm64-metal                  # Base (BYOM)
make build-arm64-metal-bge              # With BGE embeddings
make build-arm64-metal-bge-heimdall     # With BGE + Heimdall AI
make build-arm64-metal-headless         # Headless (no UI)

# AMD64 CUDA (NVIDIA GPU)
make build-amd64-cuda                   # Base (BYOM)
make build-amd64-cuda-bge               # With BGE embeddings
make build-amd64-cuda-bge-heimdall      # With BGE + Heimdall AI
make build-amd64-cuda-headless          # Headless (no UI)

# AMD64 CPU-only
make build-amd64-cpu                    # Minimal
make build-amd64-cpu-headless           # Minimal headless

# Build all variants for your architecture
make build-all

# Deploy to registry
make deploy-all             # Build + push all variants

Cross-Compilation

# Build for other platforms from macOS
make cross-linux-amd64     # Linux x86_64
make cross-linux-arm64     # Linux ARM64
make cross-rpi             # Raspberry Pi 4/5
make cross-windows         # Windows (CPU-only)
make cross-all             # All platforms

Roadmap

Completed

  • Neo4j Bolt protocol
  • Cypher query engine (52 functions)
  • Memory decay system
  • GPU acceleration (Metal, CUDA)
  • Vector & full-text search
  • Auto-relationship engine
  • HNSW vector index
  • Metadata/Property Indexing
  • SIMD Implementation
  • Clustering support

Planned (from docs/plans)

  • Hybrid retrieval Phase 1: adaptive vector/BM25 execution order, cost-aware switching, and query telemetry (docs/plans/scaling-search.md)
  • Distributed fabric Phase 1-2: QueryGateway, remote transport, shard routing, and distributed dispatch (docs/plans/sharding.md)
  • Distributed transactions and vector search across shards (Fabric Phases 3-4) (docs/plans/sharding.md)
  • Cluster admin APIs + UI/protocol integration for shard management (Fabric Phases 5-6) (docs/plans/sharding.md)
  • GDPR compliance hardening: user-data detection, relationship export/delete/anonymization, and audit-log coverage (docs/plans/gdpr-compliance-fixes.md)

Contributors

Special thanks to everyone who helps make NornicDB better. See CONTRIBUTORS.md for a list of community contributors.

License

MIT License — Originally part of the Mimir project, now maintained as a standalone repository.

See NOTICES.md for third-party license information, including bundled AI models (BGE-M3, Qwen2.5) and dependencies.


Weaving your data's destiny

About

NornicDB is a high-performance graph + vector database built for AI agents and knowledge systems. It speaks Neo4j's (Bolt + Cypher) and qdrant's (gRPC) languages so you can use Nornic with zero code changes, while adding intelligent features including a graphql endpoint, air-gapped embeddings, GPU accelerated search, and other intelligent features.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors