StackOverflow Questions for Tag: minhash

RMurphy

Reputation: 303

Which optimizations are being done in spark's MinHashLSH.. banding?

Score: 0

Views: 19

Answers: 0

Ramji

Reputation: 75

Near Similarity and duplication detection

Score: 0

Views: 28

Answers: 0

king Carrey

Reputation: 3

How to cascade AND-OR with OR-AND structure when using Minhash-LSH

Score: 0

Views: 38

Answers: 0

Tom Lous

Reputation: 2909

Strange performance issue Spark LSH MinHash approxSimilarityJoin

Score: 7

Views: 2191

Answers: 1

Atish Kathpal

Reputation: 701

Can you suggest a good minhash implementation?

Score: 21

Views: 28760

Answers: 6

Nithin Varghese

Reputation: 922

ufunc 'bitwise_and' not supported for the input types Minhash

Score: 2

Views: 476

Answers: 1

Faizan Ul Haq

Reputation: 1

Using DataSketch to find similarity between 3 audios using mfccs

Score: 0

Views: 110

Answers: 0

thijsvdp

Reputation: 482

All executors dead MinHash LSH PySpark approxSimilarityJoin self-join on EMR cluster

Score: 2

Views: 2557

Answers: 4

Kipras Bielinskas

Reputation: 147

How to use Solr MinHashQParser

Score: 0

Views: 131

Answers: 1

C. John

Reputation: 154

One-hot encoding minHashed genomes

Score: 2

Views: 99

Answers: 0

Marsellus Wallace

Reputation: 18611

What value to use for numHashTable in Spark LSH by Uber?

Score: 4

Views: 1880

Answers: 1

Tanmay Sinha

Reputation: 1

Generate sparse vector for all the column values in spark dataframe

Score: 0

Views: 512

Answers: 1

Charmander_

Reputation: 90

Optimal way for calculating Weighted Jaccard index in Python

Score: 1

Views: 1900

Answers: 1

Sheel

Reputation: 1030

UDF to check is non zero vector, not working after CountVectorizer through spark-submit

Score: 1

Views: 1027

Answers: 1

pratik

Reputation: 1

How to choose Elastiknn LSH Jaccard similarity index parameters L and k ? In my case I have minhash size = 100, and jaccard Similarity = 0.8

Score: 0

Views: 717

Answers: 1

ianux22

Reputation: 405

Questions about LSH (Locality-sensitive hashing) and minhashing implementation

Score: -1

Views: 379

Answers: 1

coderboi

Reputation: 351

Compare list to every element in a pyspark column

Score: 1

Views: 987

Answers: 1

secretive

Reputation: 2104

Number of pairs in calculating Jaccard distance using PySpark are less than they should be

Score: 1

Views: 1262

Answers: 2

Galuoises

Reputation: 3283

Transform a dataframe for the minHashLSH in spark

Score: 0

Views: 227

Answers: 1

zyxue

Reputation: 8908

Is the number of rows always 1 in each band in the Spark implementation of MinHashLSH

Score: 1

Views: 934

Answers: 1
