Reputation: 1603
I'm attempting to load some pre-trained vectors into a gensim Word2Vec model so that they can be retrained with new data. My understanding is that I can do the retraining with gensim.Word2Vec.train(). However, the only way I can find to load the vectors is gensim.models.KeyedVectors.load_word2vec_format('path/to/file.bin', binary=True), which creates an object of what is usually the wv attribute of a gensim.Word2Vec model. But this object, on its own, does not have a train() method, which is what I need to retrain the vectors.
So how do I get these vectors into an actual gensim.Word2Vec model?
Upvotes: 6
Views: 3575
Reputation: 53758
Word2Vec.load is not deprecated, so you can use it, assuming that your pre-trained model has been saved with Word2Vec.save.
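from gensim.models import Word2Vec

# Train and save the model.
# `sentences` is an iterable of tokenized sentences (lists of str).
# gensim >= 4.0 calls this parameter vector_size; older releases call it size.
model = Word2Vec(vector_size=100, window=4, min_count=5, workers=4)
model.build_vocab(sentences)
model.train(sentences, total_examples=model.corpus_count, epochs=50)
# save() stores the full model state, not just the vectors
model.save('word-vectors.bin')
...
# Later in another script: load and continue training
from gensim.models import Word2Vec
model = Word2Vec.load('word-vectors.bin')
model.train(sentences, total_examples=model.corpus_count, epochs=50)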
Upvotes: 1