
Fix InsightEngine URL-less result deduplication #689

Open
3em0 wants to merge 1 commit into 666ghj:main from 3em0:fix/issue-688-insight-dedup-key


@3em0 3em0 commented Jun 3, 2026


Summary

  • replaces the URL-less result dedupe key that previously used title_or_content[:100]
  • preserves URL-based deduplication for results that have a URL
  • adds regression coverage for URL-less results that share the same first 100 characters but differ in full content
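The summary does not state what the new URL-less key is, so the following is only a minimal sketch of one way to avoid the prefix collision, assuming the key becomes a hash of the full text. The names `dedup_key` and `deduplicate`, the dict-based result shape, and the `url` field are hypothetical illustrations, not the InsightEngine API; only `title_or_content` comes from the description above.

```python
import hashlib
from typing import Any, Dict, List


def dedup_key(result: Dict[str, Any]) -> str:
    """Hypothetical key function; the PR's actual implementation may differ."""
    url = (result.get("url") or "").strip()
    if url:
        # Results with a URL keep URL-based deduplication.
        return f"url:{url}"
    # Assumption: hash the full text rather than its first 100 characters,
    # so results that share a prefix but differ later get distinct keys.
    text = result.get("title_or_content") or ""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return f"content:{digest}"


def deduplicate(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first result seen for each key, preserving input order."""
    seen = set()
    unique = []
    for result in results:
        key = dedup_key(result)
        if key not in seen:
            seen.add(key)
            unique.append(result)
    return unique


# Two URL-less results with identical first 100 characters but different tails.
prefix = "x" * 100
first = {"url": "", "title_or_content": prefix + " first"}
second = {"title_or_content": prefix + " second"}
# Two results that share a URL; only the first should survive.
with_url = {"url": "https://example.com/a", "title_or_content": "t1"}
same_url = {"url": "https://example.com/a", "title_or_content": "t2"}

kept = deduplicate([first, second, first, with_url, same_url])
```

Under the old `title_or_content[:100]` key, `first` and `second` would collapse into one entry; with a full-content key both are kept, while the exact repeat of `first` and the second result for the same URL are still dropped.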

Fixes #688

Tests

  • python -m pytest tests/test_insight_deduplication.py



Development

Successfully merging this pull request may close these issues.

InsightEngine deduplication can drop distinct URL-less results
