vector_db_status hangs indefinitely when called via MCP (asyncio + LanceDB deadlock) #44

@Brikas

Description

Summary

vector_db_status (and likely vector_search) hangs indefinitely when called as an MCP tool. The tool never returns a response and the MCP server becomes unresponsive. The same operation completes instantly when called directly from Python outside of the MCP server.

Environment

  • OS: Windows 10 Home (10.0.19045)
  • mcp-logseq version: mcp-logseq[vector]==1.6.2
  • Embedder: ollama/nomic-embed-text:latest (768 dimensions)
  • DB contents: ~10,348 chunks, 1,293 pages
  • MCP client: Claude Code (Anthropic CLI)

Steps to Reproduce

  1. Configure mcp-logseq[vector]==1.6.2 as an MCP server (stdio)
  2. Call the vector_db_status tool from an MCP client
  3. The tool call never returns — the server is permanently blocked

Direct Python Call (works fine)

Calling the handler directly from Python outside asyncio completes in under 1 second:

from mcp_logseq.config import load_vector_config
from mcp_logseq.vector.index import VectorDBStatusToolHandler

cfg = load_vector_config()
handler = VectorDBStatusToolHandler(cfg)
result = handler.run_tool({})
# Returns instantly:
# Vector DB Status
#   Embedder:     ollama/nomic-embed-text:latest
#   Dimensions:   768
#   Total chunks: 10348
#   Total pages:  1293
#   Last sync:    2026-04-30T15:07:42.683505+00:00
#   Staleness:    Out of date (3 changed, 0 deleted)
#   Watcher:      not running

Via MCP (hangs forever)

When called via the MCP protocol, the server logs show it starts executing but never completes:

mcp-logseq - INFO - Tool call: vector_db_status with arguments {}
mcp-logseq - DEBUG - Running tool vector_db_status
# ... nothing ever after this

Timing instrumentation on the MCP server shows it takes ~4.4 s to start up (npx + dotenv-cli + uv) and responds to initialize correctly, but then hangs permanently on the tools/call request for vector_db_status.

Root Cause Analysis

Two issues combine to cause this:

Issue 1: Synchronous blocking call inside async context (server.py line 148)

@app.call_tool()
async def call_tool(name: str, arguments: Any) -> ...:
    ...
    result = tool_handler.run_tool(arguments)  # ← synchronous, blocks the event loop

run_tool is synchronous and performs blocking I/O (LanceDB file access). Calling it directly inside an async function blocks the asyncio event loop thread. This prevents asyncio from processing any I/O completions, including those needed by the LanceDB Rust layer.
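The freezing effect is easy to reproduce without LanceDB at all. Here is a standalone sketch (not from the project) in which a blocking call starves a concurrently scheduled task for its entire duration:

```python
import asyncio
import time

ticks = []

async def heartbeat():
    # Appends a timestamp every 50 ms -- but only while the loop is free.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)

async def main():
    hb = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)        # let the heartbeat record its first tick
    before = len(ticks)
    time.sleep(0.5)               # blocking call: the event loop is frozen
    blocked_ticks = len(ticks) - before
    await asyncio.sleep(0.2)      # loop is free again; ticks resume
    hb.cancel()
    return blocked_ticks

# Zero ticks land while the blocking sleep holds the loop thread.
print(asyncio.run(main()))  # 0
```

A blocking scan inside run_tool behaves exactly like the time.sleep here: nothing else on the loop makes progress until it returns.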

Issue 2: Full table scan in get_stats() (vector/db.py line 241)

def get_stats(self) -> dict:
    total = self._table.count_rows()
    pages_result = self._table.search().select(["page"]).to_list()  # ← fetches ALL rows
    unique_pages = len({r["page"] for r in pages_result})

get_stats() fetches every row in the table to count unique pages. With 10,348 chunks this is a large blocking scan. LanceDB's Rust layer uses its own tokio async runtime internally; when this blocking scan is called synchronously from within Python's asyncio event loop thread, the two runtimes conflict and deadlock — the scan never completes.

Suggested Fixes

Fix 1: Run tool handlers in a thread executor

In server.py, change the synchronous call to use asyncio.to_thread:

@app.call_tool()
async def call_tool(name: str, arguments: Any) -> ...:
    ...
    try:
        logger.debug(f"Running tool {name}")
        result = await asyncio.to_thread(tool_handler.run_tool, arguments)  # ← offload to thread
        logger.debug(f"Tool result: {result}")
        return result
    except Exception as e:
        logger.error(f"Error running tool: {str(e)}", exc_info=True)
        raise RuntimeError(f"Error: {str(e)}")

Don't forget to add import asyncio at the top.
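With asyncio.to_thread, the same blocking work runs on a worker thread while the loop stays responsive. A standalone sketch of the idea (blocking_tool below is a stand-in for run_tool, not the real handler):

```python
import asyncio
import time

def blocking_tool(arguments):
    time.sleep(0.3)               # stands in for LanceDB file I/O
    return {"status": "ok", "args": arguments}

async def main():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.05)

    hb = asyncio.create_task(heartbeat())
    # The blocking call runs on a worker thread; the loop keeps ticking.
    result = await asyncio.to_thread(blocking_tool, {})
    hb.cancel()
    return result, ticks

result, ticks = asyncio.run(main())
print(result["status"], ticks > 0)  # ok True
```

The heartbeat keeps firing during the 300 ms of "I/O", which is exactly the behavior the MCP server needs to stay responsive to further requests.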

Fix 2: Use an efficient aggregation for unique page count

In vector/db.py, replace the per-row Python materialization with a columnar aggregation (this still reads the page column, but avoids building a Python dict for every row):

def get_stats(self) -> dict:
    try:
        total = self._table.count_rows()
        # Fetch only the page column and aggregate columnar-side
        result = self._table.search().select(["page"]).to_pandas()
        unique_pages = result["page"].nunique()
        return {"total_chunks": total, "total_pages": unique_pages}
    except Exception as e:
        logger.warning(f"Could not get stats: {e}")
        return {"total_chunks": 0, "total_pages": 0}

Or, better still, bypass pandas entirely and count distinct values through the underlying Lance dataset's Arrow API:

unique_pages = len(self._table.to_lance().to_table(columns=["page"]).column("page").unique())

Fix 1 is the critical one — it will likely fix the hang for all vector tools (vector_search, sync_vector_db, vector_db_status), not just this one.
