z3r0

Reputation: 3

How can I get the similarity matrix from minhash LSH?

I have read many tutorials and tried several MinHash LSH implementations, but none of them produces a similarity matrix; they only return the items whose similarity exceeds the threshold. How can I generate the full matrix? I intend to use the LSH results for clustering.

Upvotes: 0

Views: 872

Answers (1)

Has QUIT--Anony-Mousse

Reputation: 77495

The whole point of LSH is to avoid computing all pairwise distances, because that is quadratic in the number of items and does not scale.

If you then put the data into a distance matrix, you get all the scalability problems again!
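If some similarity values are still wanted, one option is to keep them sparse and estimate them only for the pairs that LSH puts in a common bucket. The following is a minimal pure-Python sketch of that idea, not the API of any particular library: the function names, the `blake2b`-based hashing, and the parameters `num_perm=64` and `bands=16` are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(tokens, num_perm=64):
    """Return one min-hash value per seeded hash function (tokens must be non-empty)."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{t}".encode(), digest_size=8).digest(),
                "little",
            )
            for t in tokens
        ))
    return signature


def candidate_pairs(signatures, bands=16):
    """Banding step of LSH: items that agree on a whole band share a bucket."""
    rows = len(next(iter(signatures.values()))) // bands
    pairs = set()
    for b in range(bands):
        buckets = defaultdict(list)
        for key, sig in signatures.items():
            buckets[tuple(sig[b * rows:(b + 1) * rows])].append(key)
        for members in buckets.values():
            pairs.update(combinations(sorted(members), 2))
    return pairs


def sparse_similarity(docs, num_perm=64, bands=16):
    """Estimate Jaccard similarity for LSH candidate pairs only.

    Returns a dict {(a, b): estimate}; pairs that never share a bucket
    are simply absent, which is what keeps the result sparse.
    """
    sigs = {key: minhash_signature(tokens, num_perm) for key, tokens in docs.items()}
    sim = {}
    for a, b in candidate_pairs(sigs, bands):
        matches = sum(x == y for x, y in zip(sigs[a], sigs[b]))
        sim[(a, b)] = matches / num_perm
    return sim


if __name__ == "__main__":
    docs = {
        "a": {"the", "quick", "brown", "fox"},
        "b": {"the", "quick", "brown", "dog"},
        "c": {"completely", "unrelated", "words", "here"},
    }
    print(sparse_similarity(docs))
```

The dict holds entries only for pairs that collided in at least one band, so memory grows with the number of candidate pairs rather than with the square of the number of items.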

Instead, consider a clustering algorithm like DBSCAN. It doesn't need a distance matrix, only each point's neighbors within distance epsilon, which is exactly the kind of query LSH answers.
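To illustrate why neighbor lists are enough, here is a minimal DBSCAN sketch that never touches a distance matrix. It assumes a hypothetical `neighbors` dict that maps each point to the set of points an LSH query returned for it; the function name and the `min_pts` default are illustrative choices, not taken from any library.

```python
from collections import deque

NOISE = -1


def dbscan_from_neighbors(neighbors, min_pts=2):
    """Cluster points given only their epsilon-neighborhoods.

    neighbors: dict mapping each point to the set of points within
    epsilon of it (assumed here to come from an LSH query).
    Returns a dict mapping each point to a cluster id, or NOISE.
    """
    def is_core(p):
        # A point counts toward its own neighborhood size.
        return len(set(neighbors.get(p, ())) | {p}) >= min_pts

    labels = {}
    cluster = 0
    for p in neighbors:
        if p in labels:
            continue
        if not is_core(p):
            labels[p] = NOISE  # may later be relabeled as a border point
            continue
        labels[p] = cluster
        queue = deque(neighbors[p])
        while queue:
            q = queue.popleft()
            if labels.get(q) == NOISE:
                labels[q] = cluster  # border point reached from a core point
            if q in labels:
                continue
            labels[q] = cluster
            if is_core(q):
                queue.extend(neighbors.get(q, ()))  # expand only through core points
        cluster += 1
    return labels


if __name__ == "__main__":
    neighbors = {0: {1}, 1: {0, 2}, 2: {1}, 3: set(), 4: {5}, 5: {4}}
    print(dbscan_from_neighbors(neighbors))
```

Because LSH is approximate, some true neighbors may be missing from the lists, so the clusters can differ slightly from those of an exact DBSCAN run.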

Upvotes: 1
