marlon

Reputation: 7633

How to speed up word2vec similarity calculation?

I trained a Word2Vec model using Gensim, and I have two sets of words:

S1 = {'','','' ...}
S2 = {'','','' ...}

For each word w1 in S1, I want to find the top 5 words in S2 that are most similar to w1. I am currently doing it this way:

model = w2v_model
word_similarities = {}
for w1 in S1:
    similarities = {}
    for w2 in S2:
        if w1 in model.wv and w2 in model.wv:
            # one pairwise cosine similarity per (w1, w2) pair
            similarities[w2] = model.wv.similarity(w1, w2)
    word_similarities[w1] = similarities

Then for each word in word_similarities, I can take the top N entries by similarity value. When S1 and S2 are large, this becomes very slow.

Is there a quicker way to compute similarities over large sets of word pairs in Word2Vec, either in gensim or TensorFlow?

Upvotes: 0

Views: 584

Answers (1)

gojomo

Reputation: 54153

Depending on the relative sizes of your model, S1, & S2, you may want to use the most_similar() method of gensim's various word-vector classes – which uses bulk, optimized vector-comparison operations to check against all vectors in your model – then filter the results down to just the words in S2.
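A minimal sketch of that first approach, assuming gensim 4.x and the w2v_model, S1, and S2 names from the question:

s2 = set(S2)
word_similarities = {}
for w1 in S1:
    if w1 not in w2v_model.wv:
        continue
    # one bulk, vectorized comparison of w1 against every word in the
    # model, returning (word, similarity) pairs sorted best-first
    all_sims = w2v_model.wv.most_similar(w1, topn=len(w2v_model.wv))
    # keep only candidates that are in S2, then take the 5 best
    in_s2 = [(w2, sim) for w2, sim in all_sims if w2 in s2]
    word_similarities[w1] = in_s2[:5]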

Alternatively, if S2 is much smaller than the full vocabulary of model.wv, and especially if you'll be re-using the same S2 set of word-vectors many times, you could create your own KeyedVectors instance containing just the S2 words: first create an empty KeyedVectors, add all the S2 words' vectors to it, then use s2.most_similar(positive=[target_word_vector], topn=5).
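A sketch of that second approach, again assuming the gensim 4.x API (KeyedVectors(vector_size) plus add_vectors()); the s2_kv name is just illustrative:

from gensim.models import KeyedVectors

# build a small KeyedVectors holding only the S2 words the model knows
s2_words = [w for w in S2 if w in w2v_model.wv]
s2_kv = KeyedVectors(vector_size=w2v_model.wv.vector_size)
s2_kv.add_vectors(s2_words, [w2v_model.wv[w] for w in s2_words])

# each query now compares against only len(S2) vectors, not the full vocab
word_similarities = {}
for w1 in S1:
    if w1 in w2v_model.wv:
        word_similarities[w1] = s2_kv.most_similar(
            positive=[w2v_model.wv[w1]], topn=5)

Note that if w1 is itself in S2, it will show up as its own top hit (similarity 1.0), so you may want to request topn=6 and drop it from the results.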

Upvotes: 1
