feat: Mark enhancement — AI analysis, share, export (#12)#41
Closed
jessie-coco wants to merge 20 commits into kevinho:main from
Detailed 4-phase roadmap for ClawFeed v0.8 → v2.0 upgrade:
- Phase 1 (v0.9–v1.0): Data pipeline + personalization
- Phase 2 (v1.0–v1.5): Multi-channel push + AI interaction
- Phase 3 (v1.5–v2.0): Platform API + Source Market
- Phase 4 (v2.0+): Monetization + enterprise features
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat: raw_items collection pipeline (Phase 1)
Cherry-pick from kevinho/clawfeed PR #15. Decouples source collection from digest generation:
- Add raw_items table (migration 010) with dedup via UNIQUE constraint
- Add collector.mjs: standalone fetcher for RSS, HN, Reddit, GitHub Trending, Website sources with SSRF protection and concurrency pool
- Add db.mjs CRUD: insertRawItemsBatch, listRawItems, listRawItemsForDigest, getRawItemStats, cleanOldRawItems, touchSourceFetch, recordSourceError, getSourcesDueForFetch
- Add API endpoints: GET /api/raw-items, /api/raw-items/stats, /api/raw-items/for-digest
- Auto-pause sources after 5 consecutive fetch failures
- 30-day TTL cleanup for old raw_items
Closes #2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: address security review findings from Boot + Lucy
High:
- Fix SSRF DNS rebinding (TOCTOU): pin resolved IP via custom lookup callback so http.get uses the same IP that was validated
- Fix IPv6-mapped IPv4 bypass: extract and validate the embedded IPv4 from ::ffff:x.x.x.x addresses
- Add source-level permission check: /api/raw-items and /api/raw-items/stats now scoped to user's subscribed sources only
Medium:
- Replace DJB2 32-bit hash with sha256 for dedup_key (lower collision risk)
- Add content:encoded support in RSS parser
- Read COLLECTOR_INTERVAL/CONCURRENCY from process.env (consistency)
Other:
- Add graceful shutdown (SIGTERM/SIGINT) for --loop mode
- Add resp.setEncoding('utf8') to prevent implicit Buffer→string conversion
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Jessie <jessie@coco.site>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
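The IPv6-mapped IPv4 fix in the security-review commit above could look roughly like the sketch below. It is an illustration only: the function names (`extractMappedIPv4`, `isBlockedIPv4`, `isBlockedAddress`) are hypothetical and are not taken from collector.mjs, the blocked ranges are a common minimal set, and the hex form of mapped addresses (for example `::ffff:7f00:1`) is deliberately not covered.

```javascript
// Hypothetical helpers; names and blocked ranges are assumptions, not the PR's code.

// Returns the embedded IPv4 string for "::ffff:a.b.c.d", or null for anything else.
function extractMappedIPv4(address) {
  const match = /^::ffff:(\d{1,3}(?:\.\d{1,3}){3})$/i.exec(address);
  return match ? match[1] : null;
}

// True for "this network", private (RFC 1918), loopback and link-local IPv4 ranges.
function isBlockedIPv4(ip) {
  const parts = ip.split('.').map(Number);
  const valid =
    parts.length === 4 &&
    parts.every((n) => Number.isInteger(n) && n >= 0 && n <= 255);
  if (!valid) return true; // malformed input is rejected rather than trusted

  const [a, b] = parts;
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Validates the embedded IPv4 of a mapped address instead of treating it as opaque IPv6.
function isBlockedAddress(address) {
  const mapped = extractMappedIPv4(address);
  if (mapped !== null) return isBlockedIPv4(mapped);
  if (address.includes(':')) {
    // Simplified IPv6 handling for this sketch: loopback and unspecified only.
    const lower = address.toLowerCase();
    return lower === '::1' || lower === '::';
  }
  return isBlockedIPv4(address);
}
```

A check like this is only useful together with the IP pinning described in the same commit: the address that was validated has to be the one the request actually connects to.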
Add admin API for collector management:
- GET /api/collect/status: returns sources_due, sources_active, sources_paused, last_fetch_at, raw_items_total, raw_items_24h
- POST /api/collect/trigger: spawns one-shot collection process
(API key auth required for both endpoints)
Add getCollectorStatus() DB helper for aggregated collector metrics.
Add 7 new E2E test assertions for collector endpoints (73 total).
The collector.mjs --loop mode (from PR #6) provides the core scheduling engine. This PR adds the API layer for monitoring and manual triggers, completing the cron integration needed for PM2-managed deployment.
Closes #4
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add LLM-powered per-user digest generation engine. Each user gets digests based on their subscribed sources instead of seeing a single global digest.
Changes:
- New src/generator.mjs: digest generation engine with OpenAI-compatible LLM API, per-user and system digest modes, CLI interface
- db.mjs: createDigest now accepts user_id; new functions for getUsersDueForDigest, getActiveSubscriptionSourceIds, getLastDigestTime
- server.mjs: GET /api/digests returns personalized digests for logged-in users; new GET /api/digest-status endpoint
- Migration 011: index on digests(user_id, type, created_at)
- .env.example: LLM config vars (LLM_API_URL/KEY/MODEL/TIMEOUT)
- package.json: generate and generate:daily scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…assignment
- callLlm: clearTimeout in resp.on('end') handler (timer leak)
- GET /api/digests: new users with no personal digests fall back to system digests
- generator.mjs: remove dead db2 = db assignment
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Generator engine (src/generator.mjs): LLM-powered per-user digest from subscriptions
- System digest fallback for new users (user_id=NULL, public sources)
- DB: createDigest with user_id, getUsersDueForDigest, getActiveSubscriptionSourceIds
- API: GET /api/digests (personal + system fallback), GET /api/digest-status
- Migration 011: index on digests(user_id, type, created_at)
- Config: LLM_API_URL/KEY/MODEL/TIMEOUT, GENERATOR_MAX_ITEMS
Reviewed-by: jessie-coco, boot-coco
Closes #3
- POST /api/chat endpoint with LLM integration using digest content as context
- Chat bubble UI (bottom-right) with expandable chat box
- Conversation history persisted in sessionStorage
- Dark/light theme support, mobile responsive
- E2E tests for chat API (sections 17.1-17.4)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sts (#10)
- Add email_preferences + email_log DB tables (migration 012)
- Add responsive HTML email template (dark header, clean layout)
- Add emailer.mjs: Resend-based sender with retry logic, dry-run mode
- Add server endpoints: GET/PUT /api/email/preferences, GET/POST /api/email/unsubscribe
- Add 11 E2E test cases (all passing)
- Add npm scripts: email, email:daily, email:weekly
- Update .env.example with RESEND_API_KEY, EMAIL_FROM, BASE_URL
Pending: Resend account setup + coco.xyz domain DNS verification.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add migration 012_telegram.sql (telegram_links, link_codes, push_log)
- Add DB functions for Telegram linking, preferences, push logging
- Create src/telegram.mjs: long-polling bot with /start, /digest, /stop, /settings
- Add server endpoints: GET/POST/PUT/DELETE /api/settings/telegram
- Hook generator.mjs to dispatch push after digest creation (fire-and-forget)
- Add TELEGRAM_BOT_TOKEN to .env.example
- Add npm telegram script
Link flow: user /start in bot → gets 6-digit code → enters in web UI → account linked.
Push: after digest generated → fork telegram.mjs --push → sends to subscribers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
H1: Escape HTML entities before markdown→HTML conversion, sanitize
link hrefs to http/https only (prevents XSS from RSS content)
H2: GET /api/email/unsubscribe now shows confirmation page only,
POST executes the actual unsubscribe (prevents email client
link prefetchers from auto-unsubscribing users)
M1: Remove dead variable in sendDigestToUser
M2: upsertEmailPreference now returns fresh data after update
M4: .env.example BASE_URL defaults to localhost
Added 3 new E2E tests (18.12-18.14) for prefetcher safety verification.
Updated EMAIL_FROM to hxa.net per Kevin's domain decision.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
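Fix H1 in the commit above (escape first, then convert markdown, and allow only http/https hrefs) can be sketched as follows. This is a minimal illustration under stated assumptions: `escapeHtml`, `safeHref`, and `renderLinks` are hypothetical names, and only the `[text](url)` link syntax is handled rather than the full markdown subset the email template may support.

```javascript
// Hypothetical helpers illustrating "escape before convert"; not the PR's actual code.

// Escape the five HTML-significant characters so feed content cannot inject markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Allow only absolute http/https URLs; anything else (javascript:, data:, ...) becomes "#".
function safeHref(url) {
  return /^https?:\/\//i.test(url) ? url : '#';
}

// Escape the whole input first, then turn [label](url) into anchors.
// Because escaping runs first, label and url are already attribute-safe here.
function renderLinks(markdown) {
  const escaped = escapeHtml(markdown);
  return escaped.replace(
    /\[([^\]]+)\]\(([^)\s]+)\)/g,
    (_, label, url) => `<a href="${safeHref(url)}">${label}</a>`
  );
}
```

Escaping before conversion matters for ordering: if the markdown were converted first, the escaping pass would also destroy the anchors the converter had just produced.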
- H1: Require auth + add rate limiting (20 req/min per user) on /api/chat
- H2: Add 64KB response size limit + 8K content cap on LLM responses
- H3: digest_id input validation (numeric only)
- H4: Escape error messages in chat UI (XSS fix)
- H5: Sanitize role from sessionStorage history (DOM injection fix)
- M1: Remove failed user message from history on connection error
- M3: Clear chatDigestId when navigating away from digest viewer
- Updated E2E tests: auth requirement + 5 test cases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
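The per-user limit from H1 above (20 requests per minute on /api/chat) could be implemented in several ways; the commit does not say which one it uses. The sketch below assumes an in-memory sliding-window log keyed by user ID. `createRateLimiter` and its options are hypothetical, and the injectable `now` clock exists only to make the behavior testable.

```javascript
// Hypothetical in-memory sliding-window limiter; the PR's actual strategy is not shown in the source.
function createRateLimiter({ limit = 20, windowMs = 60000, now = Date.now } = {}) {
  const hits = new Map(); // userId -> timestamps (ms) of accepted requests

  // Returns true if the request is allowed, false if the user is over the limit.
  return function allow(userId) {
    const t = now();
    // Keep only the requests that still fall inside the window.
    const recent = (hits.get(userId) || []).filter((ts) => t - ts < windowMs);
    if (recent.length >= limit) {
      hits.set(userId, recent);
      return false;
    }
    recent.push(t);
    hits.set(userId, recent);
    return true;
  };
}
```

A handler would call `allow(userId)` after authentication and respond with HTTP 429 when it returns false. State held in a `Map` is lost on restart and is not shared across processes, which may or may not be acceptable for the deployment described here.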
…ombies
C-1 (Critical): pushDigestToTelegram now checks digest.user_id matches subscriber. This prevents sending one user's personalized digest to all Telegram subscribers. System digests (user_id=null) still go to everyone.
S-1 (High): Link code brute-force mitigation. Tracks failed attempts per code, invalidates after 5 wrong guesses within TTL window.
C-2 (Medium): Generator fork uses stdio:'ignore' instead of 'pipe' to prevent pipe buffer exhaustion causing zombie child processes.
C-3 (Medium): Digest messages sent without parse_mode to avoid Telegram API rejecting LLM-generated content with unescaped markdown chars.
S-3 (Medium): saveTelegramLink uses ON CONFLICT to preserve existing enabled/digest_types preferences when re-linking.
T-1: Added 19 unit tests covering all fixes (telegram.test.mjs).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
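The recipient rule from C-1 above is small enough to state as a pure function. The sketch assumes each row of telegram_links exposes `user_id`, `chat_id`, and `enabled`; those field names and the function name `selectRecipients` are illustrative, not copied from telegram.mjs.

```javascript
// Hypothetical recipient filter for a digest push; field names are assumptions.
function selectRecipients(digest, links) {
  // A digest with no owner (user_id null/undefined) is a system digest.
  const isSystemDigest = digest.user_id == null;
  return links.filter(
    (link) => link.enabled && (isSystemDigest || link.user_id === digest.user_id)
  );
}
```

For example, a digest owned by user 1 is delivered only to user 1's linked chat, while a system digest reaches every enabled link:

```javascript
const links = [
  { user_id: 1, chat_id: 'a', enabled: true },
  { user_id: 2, chat_id: 'b', enabled: true },
];
selectRecipients({ id: 10, user_id: 1 }, links);    // only the link for user 1
selectRecipients({ id: 11, user_id: null }, links); // both links
```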
H-1 (High): Block all bot commands in group chats and only respond in private DMs. Prevents link code exposure and digest data leaks in groups.
H-2 (High): Store chat_username in telegram_links table. Previously the username parameter was accepted but silently discarded.
H-3 (High): Remove internal chat_id from POST /link API response.
M-3 (Medium): Add 30s timeout to Telegram API calls (getUpdates gets its own longer timeout matching POLL_TIMEOUT + 10s buffer).
M-4 (Medium): /digest command now filters out system digests (user_id IS NULL) to match push notification behavior.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
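The private-chat guard from H-1 above can be expressed as a check on the incoming message's chat type. In the Telegram Bot API, `message.chat.type` is one of `private`, `group`, `supergroup`, or `channel`. The function name `shouldHandleCommand` is hypothetical, and how telegram.mjs actually structures its dispatch is not shown in the source.

```javascript
// Hypothetical guard: accept bot commands only from private (one-to-one) chats.
function shouldHandleCommand(message) {
  if (!message || !message.chat) return false;
  if (message.chat.type !== 'private') return false; // group, supergroup, channel are ignored
  return typeof message.text === 'string' && message.text.startsWith('/');
}
```

Running this check before any command handler means a /start sent in a group never produces a link code that other members could read.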
feat: Email Digest push (#10)
Merge main (post-PR #15) into feat/telegram-push.
Resolved conflicts:
- db.mjs: keep both email (012) and telegram (013) migrations + functions
- server.mjs: keep both email and telegram API endpoints
- Migration renamed: 012_telegram.sql → 013_telegram.sql
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: Telegram Bot push notifications (#9)
feat: AI Chat Widget (#11)
- Migration 014: adds analysis, share_token, tags, digest_id columns to marks
- AI analysis endpoint (POST /api/marks/:id/analyze): LLM generates summary, topic tags, significance. Tags stored for digest preference tuning.
- Share links (POST/DELETE /api/marks/:id/share): generate/revoke public share tokens. Public endpoint returns safe fields only (no user_id).
- Export (GET /api/marks/export): markdown and JSON formats with date filtering
- Digest preference integration: generator.mjs injects user's bookmark topic tags into LLM prompt to prioritize relevant content
- Shared callLlmApi helper extracted for reuse across endpoints
- 15 new e2e tests covering all new endpoints + auth + isolation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
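Two behaviors from the commit above lend themselves to a short sketch: returning only safe fields for a shared mark, and exporting marks as Markdown with date filtering. Everything below is an assumption made for illustration. The mark fields `title`, `url`, and `created_at`, the whitelist contents, the `from`/`to` option names, and the output layout are not confirmed by the source; only `analysis`, `tags`, and the exclusion of `user_id` come from the PR text.

```javascript
// Hypothetical whitelist projection for the public share endpoint.
// Copying named fields (instead of deleting private ones) keeps new columns private by default.
const PUBLIC_MARK_FIELDS = ['title', 'url', 'analysis', 'tags', 'created_at'];

function toPublicMark(mark) {
  const result = {};
  for (const field of PUBLIC_MARK_FIELDS) {
    if (field in mark) result[field] = mark[field];
  }
  return result;
}

// Hypothetical Markdown export with an optional inclusive date range.
// `from` and `to` are ISO date strings; created_at is assumed to be an ISO timestamp.
function exportMarksAsMarkdown(marks, { from, to } = {}) {
  const fromMs = from ? Date.parse(from) : -Infinity;
  const toMs = to ? Date.parse(to) : Infinity;
  const lines = ['# Marks export', ''];
  for (const mark of marks) {
    const createdMs = Date.parse(mark.created_at);
    if (createdMs < fromMs || createdMs > toMs) continue;
    lines.push(`- [${mark.title}](${mark.url}) (${mark.created_at.slice(0, 10)})`);
  }
  return lines.join('\n');
}
```

A whitelist is the more conservative choice for the share endpoint: a column added by a later migration stays hidden until someone deliberately adds it to the list.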
Summary
Implements issue #12 — Mark (bookmark) enhancement with AI-powered features.
New capabilities:
- POST /api/marks/:id/analyze: LLM generates structured analysis (summary, key topics, significance, related areas) + extracts topic tags
- POST /api/marks/:id/share creates a public share token; DELETE revokes it. Public endpoint (GET /api/marks/shared/:token) returns safe fields only (no user_id or internal IDs)
- GET /api/marks/export: Markdown (default) or JSON format with optional date range and status filters
Changes:
- analysis, analyzed_at, share_token, digest_id, tags columns to marks table
- src/db.mjs: 8 new DB functions (getMark, updateMarkAnalysis, setMarkShareToken, getMarkByShareToken, revokeMarkShare, listMarksForExport, getUserMarkTopics)
- src/server.mjs: 6 new API endpoints + shared callLlmApi helper
- src/generator.mjs: bookmark topic injection into digest generation prompt
- test/e2e.sh: 15 new tests (all passing)
Test plan
🤖 Generated with Claude Code