vector_db_status hangs indefinitely when called via MCP (asyncio + LanceDB deadlock) #44

@Brikas

Description

Summary

vector_db_status (and likely vector_search) hangs indefinitely when called as an MCP tool. The tool never returns a response and the MCP server becomes unresponsive. The same operation completes instantly when called directly from Python outside of the MCP server.

Environment

  • OS: Windows 10 Home (10.0.19045)
  • mcp-logseq version: mcp-logseq[vector]==1.6.2
  • Embedder: ollama/nomic-embed-text:latest (768 dimensions)
  • DB contents: ~10,348 chunks, 1,293 pages
  • MCP client: Claude Code (Anthropic CLI)

Steps to Reproduce

  1. Configure mcp-logseq[vector]==1.6.2 as an MCP server (stdio)
  2. Call the vector_db_status tool from an MCP client
  3. The tool call never returns — the server is permanently blocked

Direct Python Call (works fine)

Calling the handler directly from Python outside asyncio completes in under 1 second:

from mcp_logseq.config import load_vector_config
from mcp_logseq.vector.index import VectorDBStatusToolHandler

cfg = load_vector_config()
handler = VectorDBStatusToolHandler(cfg)
result = handler.run_tool({})
# Returns instantly:
# Vector DB Status
#   Embedder:     ollama/nomic-embed-text:latest
#   Dimensions:   768
#   Total chunks: 10348
#   Total pages:  1293
#   Last sync:    2026-04-30T15:07:42.683505+00:00
#   Staleness:    Out of date (3 changed, 0 deleted)
#   Watcher:      not running

Via MCP (hangs forever)

When called via the MCP protocol, the server logs show it starts executing but never completes:

mcp-logseq - INFO - Tool call: vector_db_status with arguments {}
mcp-logseq - DEBUG - Running tool vector_db_status
# ... nothing ever after this

Timing instrumentation on the MCP server shows it takes ~4.4 s to start up (npx + dotenv-cli + uv) and responds to initialize correctly, but then hangs permanently on the tools/call request for vector_db_status.

Root Cause Analysis

Two issues combine to cause this:

Issue 1: Synchronous blocking call inside async context (server.py line 148)

@app.call_tool()
async def call_tool(name: str, arguments: Any) -> ...:
    ...
    result = tool_handler.run_tool(arguments)  # ← synchronous, blocks the event loop

run_tool is synchronous and performs blocking I/O (LanceDB file access). Calling it directly inside an async function blocks the asyncio event loop thread. This prevents asyncio from processing any I/O completions, including those needed by the LanceDB Rust layer.
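The freezing effect is easy to reproduce without LanceDB at all. Here is a standalone sketch (not from the project) in which a blocking call starves a concurrently scheduled task for its entire duration:

```python
import asyncio
import time

ticks = []

async def heartbeat():
    # Appends a timestamp every 50 ms -- but only while the loop is free.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)

async def main():
    hb = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)        # let the heartbeat record its first tick
    before = len(ticks)
    time.sleep(0.5)               # blocking call: the event loop is frozen
    blocked_ticks = len(ticks) - before
    await asyncio.sleep(0.2)      # loop is free again; ticks resume
    hb.cancel()
    return blocked_ticks

# Zero ticks land while the blocking sleep holds the loop thread.
print(asyncio.run(main()))  # 0
```

A blocking scan inside run_tool behaves exactly like the time.sleep here: nothing else on the loop makes progress until it returns.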

Issue 2: Full table scan in get_stats() (vector/db.py line 241)

def get_stats(self) -> dict:
    total = self._table.count_rows()
    pages_result = self._table.search().select(["page"]).to_list()  # ← fetches ALL rows
    unique_pages = len({r["page"] for r in pages_result})

get_stats() fetches every row in the table to count unique pages. With 10,348 chunks this is a large blocking scan. LanceDB's Rust layer uses its own tokio async runtime internally; when this blocking scan is called synchronously from within Python's asyncio event loop thread, the two runtimes conflict and deadlock — the scan never completes.

Suggested Fixes

Fix 1: Run tool handlers in a thread executor

In server.py, change the synchronous call to use asyncio.to_thread:

@app.call_tool()
async def call_tool(name: str, arguments: Any) -> ...:
    ...
    try:
        logger.debug(f"Running tool {name}")
        result = await asyncio.to_thread(tool_handler.run_tool, arguments)  # ← offload to thread
        logger.debug(f"Tool result: {result}")
        return result
    except Exception as e:
        logger.error(f"Error running tool: {str(e)}", exc_info=True)
        raise RuntimeError(f"Error: {str(e)}")

Don't forget to add import asyncio at the top.
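With asyncio.to_thread, the same blocking work runs on a worker thread while the loop stays responsive. A standalone sketch of the idea (blocking_tool below is a stand-in for run_tool, not the real handler):

```python
import asyncio
import time

def blocking_tool(arguments):
    time.sleep(0.3)               # stands in for LanceDB file I/O
    return {"status": "ok", "args": arguments}

async def main():
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.05)

    hb = asyncio.create_task(heartbeat())
    # The blocking call runs on a worker thread; the loop keeps ticking.
    result = await asyncio.to_thread(blocking_tool, {})
    hb.cancel()
    return result, ticks

result, ticks = asyncio.run(main())
print(result["status"], ticks > 0)  # ok True
```

The heartbeat keeps firing during the 300 ms of "I/O", which is exactly the behavior the MCP server needs to stay responsive to further requests.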

Fix 2: Use an efficient aggregation for unique page count

In vector/db.py, replace the per-row Python materialization with a columnar aggregation (this still reads the page column, but avoids building a Python dict for every row):

def get_stats(self) -> dict:
    try:
        total = self._table.count_rows()
        # Fetch only the page column and aggregate columnar-side
        result = self._table.search().select(["page"]).to_pandas()
        unique_pages = result["page"].nunique()
        return {"total_chunks": total, "total_pages": unique_pages}
    except Exception as e:
        logger.warning(f"Could not get stats: {e}")
        return {"total_chunks": 0, "total_pages": 0}

Or, better still, bypass pandas entirely and count distinct values through the underlying Lance dataset's Arrow API:

unique_pages = len(self._table.to_lance().to_table(columns=["page"]).column("page").unique())

Fix 1 is the critical one — it will likely fix the hang for all vector tools (vector_search, sync_vector_db, vector_db_status), not just this one.
