A Python tool to archive your Feedbin starred articles and Pages feed entries with automatic AI-powered summaries using Kagi's Universal Summarizer.
- Fetches entries from your Feedbin "Pages" feed and starred articles
- Dual summary system:
  - 📄 Feedbin's original summaries from the RSS feed
  - ✨ AI-generated TL;DR using Kagi's Universal Summarizer API
- Dual archiving system:
  - 📖 Reader View archives from Feedbin's extracted content (clean, reader-friendly HTML)
  - 🌐 Web archives using monolith (complete page with all resources, no video/audio/JS)
- Beautiful web interface - Browse and search your entries with a Feedbin-inspired dark theme
- Merges new entries with existing data while preserving history
- Automatically backs up data with timestamps before updates
- Outputs structured JSON with entry metadata and summaries
- Simple TOML configuration (just output directory and log level)
This project uses uv for dependency management:

```bash
uv sync
```

You'll also need monolith installed for archiving web pages:

```bash
# macOS
brew install monolith

# Linux (cargo)
cargo install monolith

# Or download pre-built binaries from GitHub releases
```

Set your Feedbin and Kagi credentials as environment variables:

```bash
export FEEDBIN_EMAIL='[email protected]'
export FEEDBIN_PASSWORD='your-password'
export KAGI_API_KEY='your-kagi-api-key'
```

Note: If `KAGI_API_KEY` is not set, the script will still run but won't generate AI TL;DR summaries (Feedbin summaries will still be available).
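For reference, Feedbin's API v2 authenticates with HTTP Basic auth using these credentials: starred entry IDs come from `starred_entries.json`, and full records from `entries.json` in batches of up to 100 IDs. A minimal sketch (the helper names are illustrative, not the script's actual internals):

```python
# Hedged sketch of Feedbin API v2 access. Endpoints and the 100-ID batch
# limit come from Feedbin's public API docs; function names are illustrative.
import os

import requests

FEEDBIN_API = "https://api.feedbin.com/v2"
AUTH = (os.environ["FEEDBIN_EMAIL"], os.environ["FEEDBIN_PASSWORD"])


def fetch_starred_ids() -> list[int]:
    """Return the IDs of all starred entries."""
    resp = requests.get(f"{FEEDBIN_API}/starred_entries.json", auth=AUTH, timeout=30)
    resp.raise_for_status()
    return resp.json()


def fetch_entries(ids: list[int]) -> list[dict]:
    """Fetch full entry records in batches of 100 (the API's per-request limit)."""
    entries = []
    for i in range(0, len(ids), 100):
        batch = ",".join(str(id_) for id_ in ids[i : i + 100])
        resp = requests.get(
            f"{FEEDBIN_API}/entries.json",
            params={"ids": batch},
            auth=AUTH,
            timeout=30,
        )
        resp.raise_for_status()
        entries.extend(resp.json())
    return entries
```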
On first run, a `config.toml` file will be created automatically with defaults:

```toml
output_dir = "./dist"
log_level = "INFO"
```

You can customize:

- `output_dir`: Where to store data (default: `./dist`)
- `log_level`: Logging verbosity - `"DEBUG"`, `"INFO"` (default), `"WARNING"`, `"ERROR"`, or `"CRITICAL"`
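A sketch of what that first-run behavior amounts to, using the stdlib `tomllib` parser (in the standard library since Python 3.11, so covered by the 3.14+ requirement); the write-out logic is illustrative:

```python
# Sketch of first-run config handling. The defaults match the README;
# the create-if-missing logic is an assumption about the script.
import tomllib
from pathlib import Path

DEFAULT_CONFIG = 'output_dir = "./dist"\nlog_level = "INFO"\n'


def load_config(path: Path = Path("config.toml")) -> dict:
    if not path.exists():
        path.write_text(DEFAULT_CONFIG)  # create with defaults on first run
    with path.open("rb") as f:           # tomllib requires a binary file handle
        return tomllib.load(f)
```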
Run the script:

```bash
uv run python breadcrumbs.py
```

```mermaid
graph TD
A[Start] --> B[Load config.toml]
B --> C[Fetch Feedbin Pages Feed]
B --> D[Fetch Starred Entries]
C --> E[Merge & Deduplicate]
D --> E
E --> F[Load existing data.json]
F --> G{Any new entries?}
G -->|No| M[Generate index.html]
G -->|Yes| H[Backup data.json]
H --> I[Process Each New Entry]
I --> J[Generate AI Summary<br/>Kagi API]
J --> K[Create Web Archive<br/>monolith]
K --> L[Create Reader View<br/>Feedbin content]
L --> I
I -->|All processed| M[Generate index.html]
M --> N[Done!]
style J fill:#a855f7,stroke:#7c3aed,color:#fff
style K fill:#0867e2,stroke:#0651b5,color:#fff
style L fill:#0867e2,stroke:#0651b5,color:#fff
```
The script will:

- Load configuration from `config.toml`
- Fetch all entries from your Feedbin "Pages" feed
- Fetch all your starred articles
- Merge entries (marking duplicates appropriately)
- For each new entry:
  - Generate an AI summary via the Kagi API (if an API key is provided; see the sketch after this list)
  - Archive the full web page using monolith
  - Create a content archive from Feedbin's extracted content
- Save to `dist/data/data.json` with a backup of the previous version
- Generate a beautiful HTML interface at `dist/index.html`
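The AI summary step calls Kagi's Universal Summarizer, which takes a URL and a `Bot` authorization header. A hedged sketch; the `summary_type` choice and error handling are assumptions, not necessarily what `breadcrumbs.py` does:

```python
# Sketch of a Kagi Universal Summarizer call. The endpoint and the
# "Authorization: Bot" header come from Kagi's public API docs.
import os

import requests


def kagi_tldr(url: str) -> str | None:
    api_key = os.environ.get("KAGI_API_KEY")
    if not api_key:
        return None  # script still runs, just without AI TL;DRs
    resp = requests.get(
        "https://kagi.com/api/v0/summarize",
        params={"url": url, "summary_type": "summary"},
        headers={"Authorization": f"Bot {api_key}"},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["data"]["output"]
```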
Open `dist/index.html` in your browser to access the web interface:

```bash
open dist/index.html
```

The interface includes:
- Real-time search - Filter entries by title, URL, or summary
- Type filters - View all entries, just pages (📄), or just starred (⭐)
- Dual summaries - 📄 Feedbin summaries and ✨ AI TL;DR with distinct gradients
- Expandable summaries - Long summaries collapse to 3 lines with a "Show more +" button
- Quick access - Links to original URLs, reader view (📖), and web archives (🌐)
- Responsive design - Works great on desktop and mobile
- Keyboard shortcuts - Press Cmd/Ctrl + K to focus search
Data is saved in `dist/data/data.json`:

```json
{
"generated_at": "2024-01-15T10:30:00.123456",
"entries": [
{
"id": 12345,
"title": "Article Title",
"url": "https://example.com/article",
"published": "2024-01-15T08:00:00.000000Z",
"created_at": "2024-01-15T08:05:00.000000Z",
"entry_type": "page",
"content": "Feedbin-extracted article content (HTML)...",
"summary": "Original summary from the RSS feed...",
"tldr": "AI-generated TL;DR from Kagi...",
"archive_file": "archive/12345_example.com_article.html",
"content_archive_file": "archive/content-12345_example.com_article.html"
}
]
}
```

The `entry_type` field takes one of two values:

- `page`: Entry from your "Pages" feed
- `star`: Starred entry (or a starred entry that's also in the Pages feed)
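A sketch of the merge rule those two values imply: entries are keyed by Feedbin entry ID, and the starred copy wins when an entry appears in both sources (the helper name is illustrative):

```python
# Sketch of the merge-and-deduplicate step. Keying on the Feedbin entry
# ID drops duplicates; starred entries overwrite Pages entries, matching
# the "star" labeling described above.
def merge_entries(pages: list[dict], starred: list[dict]) -> list[dict]:
    merged: dict[int, dict] = {}
    for entry in pages:
        entry["entry_type"] = "page"
        merged[entry["id"]] = entry
    for entry in starred:
        entry["entry_type"] = "star"   # stars win for duplicates
        merged[entry["id"]] = entry
    return list(merged.values())
```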
Breadcrumbs creates two types of archives for each entry.

Reader View archives are generated from Feedbin's extracted content field:

- Stored in the `dist/archive/` directory
- Filename format: `content-{entry_id}_{url_slug}.html`
- Clean, reader-friendly HTML with the article content
- Styled with the same Feedbin-inspired dark theme as the main interface
- Fast to load and easy to read
- The `content_archive_file` field contains the relative path (e.g., `archive/content-12345_example.com_article.html`)
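A minimal sketch of how such a Reader View file could be rendered with the `templates/entry.html` template from the project layout; the render context and the `url_slug` field are assumptions:

```python
# Sketch of rendering a Reader View archive with Jinja2. The template
# name matches the project tree; everything else is illustrative.
from pathlib import Path

from jinja2 import Environment, FileSystemLoader

env = Environment(loader=FileSystemLoader("templates"))  # entry content is already HTML


def write_reader_view(entry: dict, out_dir: Path) -> Path:
    html = env.get_template("entry.html").render(entry=entry)
    archive_dir = out_dir / "archive"
    archive_dir.mkdir(parents=True, exist_ok=True)
    # url_slug is a hypothetical precomputed field, not a documented key
    out_path = archive_dir / f"content-{entry['id']}_{entry['url_slug']}.html"
    out_path.write_text(html, encoding="utf-8")
    return out_path
```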
Web archives are generated using monolith:

- Stored in the `dist/archive/` directory
- Filename format: `{entry_id}_{url_slug}.html`
- Complete web page with all CSS, images, and resources embedded inline
- Preserves the exact look of the original page
- Larger file sizes due to embedded resources
- The `archive_file` field contains the relative path (e.g., `archive/12345_example.com_article.html`)
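A sketch of the monolith invocation the feature list implies: monolith's `-a`, `-v`, and `-j` flags strip audio, video, and JavaScript, and `-o` names the output file. The timeout and error handling here are illustrative:

```python
# Sketch of archiving a page with monolith via subprocess. The flags
# match the "no video/audio/JS" feature; the timeout is an assumption.
import subprocess
from pathlib import Path


def archive_page(url: str, out_path: Path) -> bool:
    result = subprocess.run(
        ["monolith", url, "-a", "-v", "-j", "-o", str(out_path)],
        capture_output=True,
        text=True,
        timeout=120,
    )
    return result.returncode == 0  # non-zero means the archive failed
```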
```
.
├── config.toml                  # Configuration file
├── breadcrumbs.py               # Main script
├── templates/
│   ├── index.html               # Jinja2 template for web interface
│   └── entry.html               # Jinja2 template for content archives
└── dist/
    ├── index.html               # Generated web interface
    ├── data/
    │   ├── data.json                    # Current data
    │   └── data-YYYYMMDD-HHMMSS.json    # Timestamped backups
    ├── archive/
    │   ├── content-{entry_id}_{url_slug}.html   # Content archives (Feedbin extracted)
    │   └── {entry_id}_{url_slug}.html           # Full page archives (monolith)
    └── logs/
        └── breadcrumbs-YYYYMMDD-HHMMSS.log      # Execution logs
```
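The `data-YYYYMMDD-HHMMSS.json` backups in the tree suggest a copy-before-write scheme along these lines (the helper name is illustrative):

```python
# Sketch of the timestamped backup step that runs before data.json is
# rewritten. shutil.copy2 preserves file metadata on the backup copy.
import shutil
from datetime import datetime
from pathlib import Path


def backup_data(data_file: Path) -> Path | None:
    if not data_file.exists():
        return None  # nothing to back up on the first run
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = data_file.with_name(f"data-{stamp}.json")
    return Path(shutil.copy2(data_file, backup))
```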
- Feedbin API v2 - RSS feed management
- Kagi Universal Summarizer API - AI-powered summarization
- monolith - Web page archiving with embedded resources
- Python 3.14+
- monolith - Command-line tool for archiving web pages
- Dependencies managed via `pyproject.toml` (installed with `uv sync`)
Created by Justin Pecott with significant contributions from Claude Code (Anthropic). The beautiful web interface, archiving functionality, and overall architecture were developed collaboratively through an iterative design process.
Special thanks to:
