Reputation: 601
I tried to load GoogleNews-vectors-negative300.bin and call the predict_output_word method.
I tested three ways, but every one failed; the code and error for each are shown below.
import gensim
from gensim.models import Word2Vec
I first used this line:
model = Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print(model.wv.predict_output_word(['king', 'man'], topn=10))
error:
DeprecationWarning: Deprecated. Use gensim.models.KeyedVectors.load_word2vec_format instead.
Then I tried:
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print(model.wv.predict_output_word(['king', 'man'], topn=10))
error:
AttributeError: 'Word2VecKeyedVectors' object has no attribute 'predict_output_word'
The third way I tried failed with:
error: _pickle.UnpicklingError: invalid load key, '3'.
I read the documentation at
https://radimrehurek.com/gensim/models/word2vec.html
but still have no idea which namespace predict_output_word lives in.
Can anybody help?
Thanks.
Upvotes: 2
Views: 3433
Reputation: 54173
The GoogleNews set of vectors is just the raw vectors – without a full trained model (including internal weights). So, in gensim, it can't be loaded as a full Word2Vec model; it can only be loaded as a lookup-only KeyedVectors object. That object alone doesn't have the data or protocols necessary for further model training or other functionality, and Google hasn't released the full model that was used to create the GoogleNews vector set.
Note also that the predict_output_word() function in gensim should be considered an experimental curiosity. It doesn't work in hierarchical-softmax models (because it's not as simple to generate ranked predictions there), and it doesn't quite match the same context-window weighting as is used during training.
Predicting words isn't really the point of the word2vec algorithm – and many implementations don't offer any interface for making individual word-predictions outside of the sparse bulk training process. Rather, word2vec uses the exercise of (sloppily) trying to make predictions to train word-vectors that turn out to be useful for other, non-word-prediction, purposes.
Upvotes: 1