# Skill Auditor

Audit Claude Code skills for duplicates and similarity.

## Features

- Skill Discovery: Automatically finds all `SKILL.md` files in Claude Code plugin marketplaces and user skills directories
- Metadata Parsing: Extracts frontmatter metadata (name, description, triggers) from skill files
- Embedding Generation: Uses sentence-transformers to create semantic embeddings of skill content
- Similarity Detection: Finds potentially duplicate skills using cosine similarity
- LLM Evaluation: Uses Ollama to evaluate candidate duplicates for functional equivalence
- Markdown Reports: Generates human-readable reports grouped by purpose
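The metadata-parsing step can be sketched as follows. This is a minimal, dependency-free illustration (the function name and the simple `key: value` handling are assumptions); the real scanner presumably uses a full YAML parser for the frontmatter:

```python
from pathlib import Path

def parse_skill(path: Path) -> tuple[dict, str]:
    """Split a SKILL.md file into frontmatter metadata and body content.

    Sketch only: handles flat `key: value` frontmatter, not full YAML.
    """
    text = path.read_text(encoding="utf-8")
    meta: dict[str, str] = {}
    body = text
    if text.startswith("---"):
        # Frontmatter is delimited by the first two "---" markers.
        _, frontmatter, body = text.split("---", 2)
        for line in frontmatter.strip().splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                meta[key.strip()] = value.strip()
    return meta, body.strip()
```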
## Requirements

- Python 3.14+
- Ollama running locally with a model available
## Installation

```bash
# Clone the repository
git clone https://github.com/katmandoo212/skill_auditor.git
cd skill_auditor

# Install dependencies with uv
uv sync

# Or with pip
pip install -e .
```

## Usage

```bash
# Scan default plugin cache
skill-auditor

# Scan custom paths
skill-auditor -p /path/to/plugins -p /another/path

# Specify output file
skill-auditor -o my_report.md

# Adjust similarity threshold (default: 0.8)
skill-auditor -t 0.75

# Use a different Ollama model
skill-auditor -m llama3.2

# Verbose output
skill-auditor -v
```

## Options

| Option | Short | Default | Description |
|---|---|---|---|
| `--path` | `-p` | `~/.claude/plugins/marketplaces`, `~/.claude/skills` | Paths to scan for skills (can be specified multiple times) |
| `--output` | `-o` | `skill_audit_report.md` | Output markdown file path |
| `--threshold` | `-t` | `0.8` | Embedding similarity threshold (0.0–1.0) |
| `--model` | `-m` | `glm-5:cloud` | Ollama model for LLM evaluation |
| `--max-candidates` | | `10` | Maximum candidates per skill to send to LLM |
| `--verbose` | `-v` | `False` | Enable verbose logging |
| `--version` | | | Show version and exit |
## Report

The tool generates a markdown report with:
- Total skills scanned
- Number of duplicate groups found
- Skills with potential duplicates
- Similarity threshold and model used
Each duplicate group contains:
- Purpose tag (e.g., "code-review", "brainstorming")
- Confidence level (high/medium/low)
- Notes explaining the similarity
- Table of duplicate skills with plugin and path
The report ends with a complete table of all discovered skills with their plugins and descriptions.
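A hypothetical excerpt of a duplicate group (skill names, plugins, and notes are invented purely for illustration) might look like:

```markdown
## code-review (confidence: high)

Both skills instruct the model to review diffs for style and correctness.

| Skill | Plugin | Path |
|---|---|---|
| pr-reviewer | dev-tools | skills/pr-reviewer/SKILL.md |
| code-critic | review-pack | skills/code-critic/SKILL.md |
```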
## How It Works

- Discovery: Scans directories for the `skills/*/SKILL.md` pattern
- Parsing: Extracts YAML frontmatter and content from each skill file
- Embedding: Generates semantic embeddings using `all-MiniLM-L6-v2`
- Similarity: Computes cosine similarity between all skill pairs
- Candidate Selection: Skills with similarity >= threshold become candidates
- LLM Evaluation: Sends each skill and its candidates to Ollama for functional analysis
- Report Generation: Creates a markdown report grouped by purpose
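The similarity and candidate-selection steps can be sketched in plain Python. In the real tool the vectors come from sentence-transformers' `all-MiniLM-L6-v2`; here toy vectors stand in for those embeddings, and `find_candidates` is a hypothetical name, not the tool's actual API:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot product over norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def find_candidates(embeddings: list[list[float]], threshold: float = 0.8):
    """Return index pairs whose similarity meets the threshold."""
    pairs = []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            if cosine_similarity(embeddings[i], embeddings[j]) >= threshold:
                pairs.append((i, j))
    return pairs
```

Raising the threshold (`-t`) shrinks the candidate list sent to the LLM; lowering it catches looser paraphrases at the cost of more LLM calls.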
## Error Handling

The tool includes robust error handling for:
- File Read Errors: Handles missing files, permission errors, and encoding issues
- YAML Parsing: Logs warnings for malformed frontmatter, continues with empty metadata
- LLM Connection: Retries failed Ollama requests with exponential backoff (3 attempts)
- JSON Parsing: Multi-pass extraction from LLM responses with brace-matching
- Thread Safety: Thread-safe model initialization using double-checked locking
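Two of these strategies can be sketched as follows. This is an illustration only (function names and the exact backoff schedule are assumptions, not the tool's actual code):

```python
import json
import time

def extract_json(text: str) -> dict:
    """Brace-matching pass: pull the first balanced {...} object
    out of a noisy LLM response, then parse it. Note this naive
    matcher ignores braces inside JSON strings."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i + 1])
    raise ValueError("unbalanced braces in response")

def with_retries(call, attempts: int = 3, base_delay: float = 1.0):
    """Retry a flaky call with exponential backoff: waits of
    base_delay, 2*base_delay, 4*base_delay, ... between attempts."""
    for attempt in range(attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * 2 ** attempt)
```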
## Development

```bash
# Install dev dependencies
uv sync

# Run tests
uv run pytest

# Run tests with verbose output
uv run pytest -v

# Run a specific test file
uv run pytest tests/test_integration.py
```

## Project Structure

```
skill_auditor/
├── __init__.py          # CLI entry point and main orchestration
├── __main__.py          # Package runner (python -m skill_auditor)
├── config.py            # Configuration defaults and constants
├── models.py            # Data models (Skill, DuplicateGroup, SimilarityCandidate)
├── scanner.py           # Skill discovery and parsing with error handling
├── embeddings.py        # Thread-safe embedding generation with DI support
├── evaluator.py         # LLM evaluation with retry logic and JSON parsing
└── reporter.py          # Markdown report generation
tests/
├── test_cli.py          # CLI tests
├── test_config.py       # Configuration tests
├── test_embeddings.py   # Embedding and similarity tests
├── test_evaluator.py    # LLM evaluation and JSON parsing tests
├── test_integration.py  # End-to-end integration tests
├── test_models.py       # Data model tests
├── test_reporter.py     # Report generation tests
└── test_scanner.py      # Discovery and parsing tests
```
## License

MIT License