Download and cache full page content from URLs for complete information retrieval without summarization loss.
| Feature | Built-in WebFetch | Fetch-Full-Content |
|---|---|---|
| Information Retrieved | 30-80% (summarized by subagent) | 100% (full content) |
| Caching | No (refetch with 5min request caching) | Yes (filesystem cached) |
| Format | Markdown | Markdown |
| Best For | Quick lookups, general questions | Building agent skills, comprehensive analysis |
When to use Fetch-Full-Content:
- Writing agent skills based on documentation (need 100% accuracy)
- Analyzing complete API references or specifications
- Caching official docs for repeated analysis
- Building training data from trusted sources
When to use Built-in WebFetch:
- Quick information lookup
- General web browsing
- Untrusted or unknown sources
Fetch-Full-Content includes basic filtering to remove common hidden content injection vectors (comments, invisible text, display:none elements), but this is NOT comprehensive protection.
Malicious websites can still embed instructions to manipulate Claude's behavior through other means not caught by basic filtering.
ONLY use on:
- ✅ Official documentation sites (docs.anthropic.com, angular.dev, etc.)
- ✅ Trusted third-party sources you control
- ✅ Content under your organization's domain
NEVER use on:
- ❌ Untrusted websites or user-generated content
- ❌ Public forums, comment sections, or social media
- ❌ Potentially malicious sources
Security filtering includes:
- Removes HTML comments
- Filters elements with
display: noneorvisibility: hidden - Removes text with font-size < 6px
- Filters elements with opacity < 10%
- Removes elements with very low alpha channel colors (< 10%)
If unsure about a source, use the built-in WebFetch tool instead - it includes comprehensive safeguards for untrusted content.
/plugin install fetch-full-content@claude-code-toolkit# Single URL
/fetch-full-content --folder docs https://angular.dev/essentials/signals
# Multiple URLs
/fetch-full-content --folder docs https://angular.dev/essentials/signals https://angular.dev/guide/directives
# Batch from file
/fetch-full-content --folder docs $(cat urls.txt)Build comprehensive agent skills:
# Download complete Angular documentation for skill development
/fetch-full-content --folder angular-docs \
https://angular.dev/guide/signals \
https://angular.dev/guide/directives \
https://angular.dev/guide/dependency-injectionAnalyze complex topics:
# Get all pricing and feature documentation
/fetch-full-content --folder product-info \
https://service.com/pricing \
https://service.com/features \
https://service.com/billingCache official documentation:
# Cache official docs for repeated analysis
/fetch-full-content --folder claude-docs \
https://docs.anthropic.com/en/docs/about-claude/models-overview \
https://docs.anthropic.com/en/docs/build-with-claude/tool-use/fetch-full-content
- Downloads full page content from URLs
- Converts to clean markdown
- Caches to filesystem for reuse
- Removes navigation, ads, and scripts
Returns clean markdown files with:
- All page content preserved
- Navigation and ads removed
- HTML attribute noise stripped
- Code blocks preserved with syntax highlighting
- Links maintained as reference
Example output:
docs/angular-dev_signals.md
docs/angular-dev_directives.md
docs/angular-dev_dependency-injection.md
- No summarization = no information loss
- Every section, example, and detail preserved
- Better foundation for building accurate skills
- Downloaded files persist in specified folder
- Reuse across multiple sessions
- Avoid redundant network requests
- Analyze same documentation with different approaches
- Removes navigation and headers
- Strips ads and popups
- Eliminates JavaScript UI code
- Preserves actual content
- Filters hidden content (comments, invisible text, display:none elements)
- Detects dynamic content (< 500 chars detected as JS-rendered)
- Automatically retries with Playwright
- Route blocking removes images, styles, fonts to speed up rendering
requests
beautifulsoup4
markdownify
playwright (optional, for JS rendering)
- Bash: Execute download script
- Read: Load existing files
- Glob: Find cached files
- ✅ Cache official documentation for reuse
- ✅ Download complete topics before building skills
- ✅ Use cached files for repeated analysis
- ✅ Download from trusted sources only
- ✅ Organize by topic in separate folders
- ❌ Use on untrusted websites (use built-in WebFetch instead)
- ❌ Ignore security warnings
- ❌ Assume partial downloads are complete
- ❌ Share downloaded content from restricted sources
- ❌ Rely on summarization when accuracy matters
URL → Subagent summarization → 30-80% of content → Claude
Problem: Content loss, incomplete information, not cacheable
URL → Download + clean HTML → Markdown → Filesystem cache → Claude
Benefit: 100% content, reusable, better for skill development
See CHANGELOG.md for complete version history.
See root LICENSE for details.
- Issues: Report bugs or request features
- Repository: claude-code-toolkit
Author: Thore Höltig