How to Use MinHash and MinHashLSH to Identify Comprehensive Documents and Partial Matches? #216

@aplmikex

Suppose I have one document, and a second document contains a portion of it plus a small amount of extra text, such as advertisements. Can MinHash be used to keep only the most comprehensive document? And if such partial documents are scattered among a large number of files, can MinHashLSH identify them as accurately as possible?
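To make the scenario concrete, here is a minimal sketch of the two cases, written with only the Python standard library rather than the `MinHash` and `MinHashLSH` classes themselves. The helper names (`shingles`, `minhash_signature`, `estimate_jaccard`, `containment`), the 3-word shingle size, the 128-value signature, and the synthetic word lists are all assumptions made for illustration. It suggests that a full document plus a few ad words keeps a high Jaccard similarity, while a short fragment has low Jaccard similarity even though it is fully contained in the original.

```python
import hashlib

NUM_PERM = 128  # assumed signature length; larger values reduce estimation noise


def shingles(words, k=3):
    """Return the set of k-word shingles for a list of words."""
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_perm=NUM_PERM):
    """For each seeded hash function, keep the minimum hash value over all items."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()[:8], "big"
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree (the MinHash Jaccard estimate)."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    return len(a & b) / len(a | b)


def containment(part, whole):
    """Share of `part` that also appears in `whole`."""
    return len(part & whole) / len(part)


# Synthetic documents (hypothetical data): 40 distinct words, a copy with
# five ad words appended, and a fragment holding only the first 10 words.
full_doc = [f"word{i}" for i in range(40)]
doc_with_ads = full_doc + ["buy", "now", "cheap", "deal", "today"]
fragment = full_doc[:10]

full_sh = shingles(full_doc)
ads_sh = shingles(doc_with_ads)
frag_sh = shingles(fragment)

full_sig = minhash_signature(full_sh)
est_ads = estimate_jaccard(full_sig, minhash_signature(ads_sh))
est_frag = estimate_jaccard(full_sig, minhash_signature(frag_sh))

print("full vs. full+ads: exact", exact_jaccard(full_sh, ads_sh), "estimate", est_ads)
print("full vs. fragment: exact", exact_jaccard(full_sh, frag_sh), "estimate", est_frag)
print("fragment contained in full:", containment(frag_sh, full_sh))
```

On this toy data the exact Jaccard values are 38/43 (about 0.88) for the copy with ads and 8/38 (about 0.21) for the fragment, while the fragment's containment in the full document is 1.0. A Jaccard threshold alone would therefore likely catch the first case and miss the second, which is the gap I am asking about.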
