diff --git a/CHANGELOG.md b/CHANGELOG.md
index 551c225..105842d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -25,10 +25,9 @@ skipped while we are still in 0.x.
   delivers cross-repo matches. An earlier draft also folded in the last
   commit subject, but a 6-token AND-FTS query is precision-extreme and
   produced zero matches across a real, sizeable cache.
-  Single-token ranking lets underscore/dash names tokenise naturally
-  (`pdf_to_quiz`, `reddit_promo_planner`) and BM25 picks the best
-  matches. Sanity check on three real projects returned 1+ relevant
-  cross-repo block each.
+  Single-token ranking lets underscore- and dash-naming tokenise
+  naturally and BM25 picks the best matches. Sanity check on three
+  real projects returned 1+ relevant cross-repo block each.
 - Scope decision: only `STEAL_HITS` is in v0.1. Other candidate fields
   (project status, last-touched timestamp, journal entries) were
   deferred — status carries a real risk of eroding trust in the whole
diff --git a/docs/mcp.md b/docs/mcp.md
index cfaf6a6..75f5e1b 100644
--- a/docs/mcp.md
+++ b/docs/mcp.md
@@ -80,6 +80,8 @@ treating the return as literal text.
 | `armillary_search` | `(query: str, max_results: int = 20) → str` | <50 ms per repo hit |
 | `armillary_projects` | `(status_filter: str \| None) → str` | <20 ms |
 | `armillary_pulse` | `() → str` | <30 ms |
+| `armillary_steal` | `(query: str, limit: int = 5, language: str \| None) → str` | <100 ms |
+| `armillary_revive` | `(project_path: str) → str` | <500 ms (revive subprocess + steal) |
 
 Schemas are introspected from the Python function signatures; the agent
 receives them in the `tools/list` response during the MCP handshake.
@@ -90,7 +92,7 @@ receives them in the `tools/list` response during the MCP handshake.
    on stdin. FastMCP replies with the server's name, version, and
    capabilities, then emits `initialized`.
 2. **Tool discovery.** The agent immediately calls `tools/list`. FastMCP
-   returns the five schemas so the agent knows what it can invoke. Any
+   returns the seven schemas so the agent knows what it can invoke. Any
    system-level prompt instructions declared on the `FastMCP()`
    constructor ride along here — armillary's say *"ALWAYS call
    `armillary_next` at the very start of every conversation,"* which is
diff --git a/src/armillary/code_index.py b/src/armillary/code_index.py
index 231a0e1..0eb18b3 100644
--- a/src/armillary/code_index.py
+++ b/src/armillary/code_index.py
@@ -294,13 +294,13 @@ def _sanitize_fts_query(query: str) -> str:
 
     Each whitespace-delimited token becomes a prefix-matched quoted
     phrase (``"token" *``). Prefix matching widens the recall: a
-    query like ``linked_flow`` now matches ``linked_flow_policy``,
-    ``linked_flow_service``, etc., which FTS5's simple tokenizer
+    query like ``user_session`` now matches ``user_session_policy``,
+    ``user_session_service``, etc., which FTS5's simple tokenizer
     would otherwise treat as distinct tokens.
 
     CamelCase vs snake_case is a separate concern — FTS5's simple
-    tokenizer does not split ``LinkedFlow`` into ``linked`` + ``flow``,
-    so users searching for ``linked_flow`` will not hit ``LinkedFlow``.
+    tokenizer does not split ``UserSession`` into ``user`` + ``session``,
+    so users searching for ``user_session`` will not hit ``UserSession``.
     Callers who care should submit both variants or use the dedicated
     matcher (future work).
 
diff --git a/src/armillary/mcp_tools.py b/src/armillary/mcp_tools.py
index 3257c7c..d9350dd 100644
--- a/src/armillary/mcp_tools.py
+++ b/src/armillary/mcp_tools.py
@@ -278,8 +278,8 @@ def armillary_context(project_name: str) -> str:
     or "what's the state of X". NOT auto-triggered on directory change.
 
     Examples:
-    - armillary_context("pdf_to_quiz") → branch, 1 dirty file, last 5 commits
-    - armillary_context("speak-faster") → dormant, last commit 3 months ago
+    - armillary_context("my-saas-app") → branch, 1 dirty file, last 5 commits
+    - armillary_context("old-prototype") → dormant, last commit 3 months ago
     """
     from armillary.context_service import get_context
 
diff --git a/src/armillary/revive_enhanced.py b/src/armillary/revive_enhanced.py
index 2c26b28..1b219b7 100644
--- a/src/armillary/revive_enhanced.py
+++ b/src/armillary/revive_enhanced.py
@@ -2,8 +2,7 @@
 
 Query strategy v0.1: the project name is the only signal we send to
 ``steal()``. Empirically this gives 5–8 cross-repo matches for typical
-underscore / dash naming (``pdf_to_quiz``, ``reddit_promo_planner``,
-``claude-code-project-boundary``) because FTS5 tokenises the separators
+underscore / dash naming because FTS5's tokeniser splits separators
 into meaningful sub-tokens. An earlier draft also folded in the last
 commit subject, but a 6-token AND query returns zero hits in practice.
 Single-token ranking is the simplest thing that delivers real value.
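
Review note: the `_sanitize_fts_query` docstring touched above describes turning each whitespace-delimited token into a prefix-matched quoted phrase (``"token" *``) and suggests that callers submit both snake_case and CamelCase variants. The body of that function is not part of this diff, so the following is only a minimal sketch of the described behaviour. The names `sanitize_fts_query` and `snake_to_camel` are hypothetical stand-ins, not the repo's code.

```python
def sanitize_fts_query(query: str) -> str:
    """Hypothetical stand-in for the behaviour the docstring describes."""
    phrases = []
    for token in query.split():
        # FTS5 escapes a double quote inside a quoted string by doubling it.
        escaped = token.replace('"', '""')
        # A quoted phrase followed by * is an FTS5 prefix query.
        phrases.append(f'"{escaped}" *')
    # Space-separated phrases are implicitly ANDed by FTS5.
    return " ".join(phrases)


def snake_to_camel(name: str) -> str:
    """Illustrative helper (an assumption, not in the diff): user_session -> UserSession."""
    return "".join(part.capitalize() for part in name.split("_") if part)


if __name__ == "__main__":
    name = "user_session"
    # Build one sanitized query per naming variant, as the docstring suggests.
    for variant in (name, snake_to_camel(name)):
        print(sanitize_fts_query(variant))
```

Issuing the two variants as separate queries, rather than joining them into one string, keeps each lookup single-token, in line with the changelog's observation that multi-token AND queries returned zero matches.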