docs: scrub real project names + sync MCP tool list #40

Merged: justi merged 1 commit into main from docs/mcp-tool-list-and-steal-docstring on May 4, 2026
@justi (Owner) commented May 4, 2026

Summary

Two cleanup chores rolled into one branch. Both are deterministic, and both surfaced during the recent ground-truth audit of armillary_revive:

  1. docs/mcp.md was advertising five MCP tools when the codebase ships seven (armillary_steal and armillary_revive were missing from the table and the lifecycle note still read "five schemas").
  2. Several shipped Python files quoted real project names in docstrings/examples, which violates the private-data rule. Replaced with the recommended generic placeholders (my-saas-app, old-prototype, user_session), or dropped the names entirely where they were illustrative rather than required.
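
The drift described in item 1 can be caught mechanically. Below is a minimal sketch, assuming the tool table in docs/mcp.md puts a backticked tool name in the first cell of each row (as the `armillary_steal` and `armillary_revive` rows in this PR do). The `registered` set, the table headers, and the helper names are hypothetical; how armillary actually enumerates its registered tools is not shown in this PR.

```python
import re


def documented_tools(markdown: str) -> set[str]:
    """Collect tool names from the first cell of each Markdown table row."""
    # Matches rows such as: | `armillary_steal` | ... | ... |
    return set(
        re.findall(r"^\|\s*`(armillary_\w+)`\s*\|", markdown, flags=re.MULTILINE)
    )


def missing_from_docs(registered: set[str], markdown: str) -> set[str]:
    """Return registered tool names that have no row in the docs table."""
    return registered - documented_tools(markdown)


# Hypothetical inputs for illustration only.
doc = """\
| Tool | Signature | Latency |
|---|---|---|
| `armillary_context` | `(...)` | ... |
"""
registered = {"armillary_context", "armillary_steal", "armillary_revive"}

missing = sorted(missing_from_docs(registered, doc))
print(missing)  # ['armillary_revive', 'armillary_steal']
```

A check like this could run as a unit test so the table cannot fall behind the code again, though it would not catch stale counts in prose.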

Side benefit

Real project names in indexed Python source meant armillary_revive's STEAL_HITS could surface armillary's own source as a "cross-repo match" for any user project sharing one of those name tokens. Scrubbing the names removes that class of false positive.

Files

| File | Change |
|---|---|
| `docs/mcp.md` | tool table gains `armillary_steal` and `armillary_revive` rows; "five schemas" → "seven schemas" |
| `src/armillary/mcp_tools.py` | `armillary_context` examples use generic placeholders |
| `src/armillary/code_index.py` | FTS-tokeniser docstring uses synthetic identifier instead of real project name |
| `src/armillary/revive_enhanced.py` | module docstring drops named examples |
| `CHANGELOG.md` | revive write-up drops named examples |

Test plan

  • .venv/bin/python -m pytest -q — 457 passed
  • .venv/bin/ruff check . + ruff format --check . — clean
  • grep audit confirms zero remaining real project names across src/, docs/mcp.md, README.md, CHANGELOG.md
  • CI green on Python 3.11 + 3.12
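
The grep audit in the third bullet could be scripted along the following lines. This is a minimal sketch, not the command that was actually run: `find_name_leaks`, the file suffixes, and the name `secret-project-x` are hypothetical stand-ins, and the real denylist of private names is deliberately not reproduced here.

```python
import re
import tempfile
from pathlib import Path


def find_name_leaks(
    root: Path, names: list[str], suffixes: tuple[str, ...] = (".py", ".md")
) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, line) for every denylisted name."""
    if not names:
        return []  # an empty pattern would otherwise match every line
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((str(path.relative_to(root)), lineno, line.strip()))
    return hits


# Demo on a throwaway directory with one clean and one leaky file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "clean.py").write_text('"""Example: my-saas-app."""\n', encoding="utf-8")
    (root / "leaky.py").write_text('"""Example: secret-project-x."""\n', encoding="utf-8")
    leaks = find_name_leaks(root, ["secret-project-x"])

print(leaks)  # [('leaky.py', 1, '"""Example: secret-project-x."""')]
```

An empty result corresponds to the "zero remaining real project names" outcome reported above.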

🤖 Generated with Claude Code

Two cleanup chores rolled into one branch:

1. `docs/mcp.md` — the tool table only listed five tools and the
   lifecycle note still said "five schemas". The codebase has shipped
   `armillary_steal` and `armillary_revive` since, so the public MCP
   reference was lying about its surface. Add the two missing rows
   and fix the count.

2. Real project names had crept into shipped docstrings/comments,
   violating the project's private-data rule (no real project names
   in public content):

   - `mcp_tools.py` `armillary_context` examples used a real
     project name and a real dormant one. Replace with the generic
     placeholders the rule explicitly recommends (`my-saas-app`,
     `old-prototype`).
   - `code_index.py` FTS-tokeniser docstring quoted a real project
     name three times to illustrate prefix matching. Substitute a
     synthetic identifier (`user_session`) that shows the same
     behaviour without leaking a private name.
   - `revive_enhanced.py` module docstring listed three real project
     names as evidence for single-token query strategy. Drop the
     names; the empirical claim survives without them.
   - `CHANGELOG.md` quoted two of the same names in the v0.1
     write-up. Same fix.
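
For readers unfamiliar with the prefix-matching behaviour that the `code_index.py` docstring illustrates, here is a minimal pure-Python sketch. It assumes a tokeniser that splits on every non-alphanumeric character (so `_` separates tokens) and a query that matches when each query token is a prefix of some indexed token. That is an assumption for illustration; it is not armillary's actual FTS configuration, and `tokenize` and `prefix_match` are hypothetical names.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split on anything that is not an ASCII letter or digit, lowercased."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def prefix_match(query: str, document: str) -> bool:
    """True if every query token is a prefix of some document token."""
    query_tokens = tokenize(query)
    doc_tokens = tokenize(document)
    return bool(query_tokens) and all(
        any(d.startswith(q) for d in doc_tokens) for q in query_tokens
    )


source_line = "def load_user_session(): ..."

print(tokenize("user_session"))             # ['user', 'session']
print(prefix_match("sess", source_line))    # True: 'sess' is a prefix of 'session'
print(prefix_match("ession", source_line))  # False: suffixes do not match
```

Under these assumptions a synthetic identifier such as `user_session` demonstrates the same splitting and prefix behaviour as any real name would.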

Side benefit of #2: removing real names from indexed Python files
also removes a class of false-positive matches in `armillary_revive`
STEAL_HITS — when a user revives a project whose name token appears
in armillary's own docstrings, the code index used to surface
armillary's source as a "cross-repo match", which was noise from a
private-data leak rather than useful cross-repo signal.
Copilot AI review requested due to automatic review settings May 4, 2026 13:52

Copilot AI left a comment

Pull request overview

This PR is a low-risk documentation/privacy cleanup for armillary: it removes real project-name examples from shipped source/docs and updates the MCP documentation to reflect the current tool surface.

Changes:

  • Replaced named project examples in Python docstrings with generic placeholders or removed them entirely.
  • Updated docs/mcp.md to add the missing armillary_steal and armillary_revive tool rows and revised the lifecycle note from five to seven schemas.
  • Scrubbed the revive-related changelog text to avoid embedding real project names.

Assessment: the intent is clear and the scope is small, but docs/mcp.md still contains one stale lead-in earlier in the same section that says “Five tools are registered …”, so the MCP docs are not fully synchronized yet.
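
A stale count of this kind can be flagged automatically. The following is a minimal sketch, assuming counts are spelled out as English number words directly before "tools" or "schemas". `stale_counts` and the second sentence of the sample text are hypothetical; only the "Five tools are registered" wording comes from the doc.

```python
import re

NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def stale_counts(
    markdown: str, expected: int, nouns: tuple[str, ...] = ("tools", "schemas")
) -> list[str]:
    """Return phrases like 'Five tools' whose number differs from `expected`."""
    pattern = re.compile(
        r"\b(" + "|".join(NUMBER_WORDS) + r")\s+(" + "|".join(nouns) + r")\b",
        re.IGNORECASE,
    )
    return [
        m.group(0)
        for m in pattern.finditer(markdown)
        if NUMBER_WORDS[m.group(1).lower()] != expected
    ]


sample = (
    "Five tools are registered at startup.\n"
    "All seven schemas are sent once per session.\n"
)

stale = stale_counts(sample, expected=7)
print(stale)  # ['Five tools']
```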

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| `src/armillary/revive_enhanced.py` | Removes concrete project-name examples from the module docstring. |
| `src/armillary/mcp_tools.py` | Replaces `armillary_context()` docstring examples with generic placeholder names. |
| `src/armillary/code_index.py` | Swaps FTS docstring examples to use synthetic identifiers. |
| `docs/mcp.md` | Expands the MCP tool table and updates the schema-count wording. |
| `CHANGELOG.md` | Rewords revive changelog text to avoid named project examples. |

Comment thread docs/mcp.md
Comment on lines +83 to +84
| `armillary_steal` | `(query: str, limit: int = 5, language: str \| None) → str` | <100 ms |
| `armillary_revive` | `(project_path: str) → str` | <500 ms (revive subprocess + steal) |
@justi merged commit a77918b into main on May 4, 2026
7 checks passed
@justi deleted the docs/mcp-tool-list-and-steal-docstring branch on May 4, 2026 13:57