Finding near-duplicates with Jaccard similarity and MinHash - Made of Bugs
https://blog.nelhage.com/post/fuzzy-dedup/
46800306