Reputation: 11
I'm using Gensim 4.0 to store vectors in Doc2VecKeyedVectors to perform similarity lookups, but I'm getting an error.
Here is some sample code:
model = <load a Doc2Vec model>
corpus = <load an object which returns key/words pairs>
kv = Doc2VecKeyedVectors(vector_size=50)
for key, words in corpus:
    vector = model.infer_vector(words)
    kv.add_vector(key, vector)
test_words = ['word1', 'word2', ...]
vector = model.infer_vector(test_words)
sims = kv.similar_by_vector(vector, topn=200)
The call to similar_by_vector() raises "ValueError: too many values to unpack (expected 2)" at line 758 of keyedvectors.py, inside the most_similar() method.
I walked through the source code, and it looks like it expects the key to be passed in together with the vector, which seems odd given the method signature.
Any ideas as to what I'm doing wrong?
Upvotes: 0
Views: 35
Reputation: 11
I figured out the problem. The code sample I gave left out the fact that the call to infer_vector() was actually a call to a remote server, which returned a list of floats, not an ndarray. I never converted that list back to an ndarray before calling similar_by_vector(), and that was the cause of the error.
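For anyone hitting the same error, here is a minimal sketch of the conversion step, assuming the remote call returns a plain Python list of floats. The helper name as_query_vector and the remote_result stand-in are hypothetical, not part of Gensim. The explanation is my reading of the traceback: similar_by_vector() seems to pass a plain list through as if it were a (key, weight) pair, which would account for the unpacking error, whereas an ndarray is treated as a vector.

```python
import numpy as np


def as_query_vector(values, vector_size):
    """Convert a sequence of floats into a 1-D float32 ndarray.

    Hypothetical helper: it also checks the length so that a malformed
    response fails here rather than inside the similarity lookup.
    """
    vector = np.asarray(values, dtype=np.float32)
    if vector.shape != (vector_size,):
        raise ValueError(
            f"expected shape ({vector_size},), got {vector.shape}"
        )
    return vector


# Stand-in for the list of floats returned by the remote server (assumption).
remote_result = [0.1] * 50

query = as_query_vector(remote_result, vector_size=50)

# With kv built as in the question, the lookup would then be:
# sims = kv.similar_by_vector(query, topn=200)
```

The shape check is optional, but it surfaces a truncated or nested response with a clearer message than the one raised inside most_similar().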
Upvotes: 1