Reputation: 141
I am trying to continue training an existing model:
import gensim

model = gensim.models.Word2Vec.load('model/corpus.zhwiki.word.model')
more_sentences = [['Advanced', 'users', 'can', 'load', 'a', 'model', 'and', 'continue', 'training', 'it', 'with', 'more', 'sentences']]
model.build_vocab(more_sentences, update=True)
model.train(more_sentences, total_examples=model.corpus_count, epochs=model.iter)
but I got an error with the last line:
AttributeError: 'Word2Vec' object has no attribute 'compute_loss'
Some posts say this is caused by using an earlier version of gensim, so I tried adding the following after loading the existing model and before calling train():
model.compute_loss = False
After that it no longer raised the AttributeError, but model.train() returned 0 and the model was not trained on the new sentences.
How can I solve this problem?
Upvotes: 6
Views: 5470
Reputation: 5546
Here is how I continued training my model:
from gensim.models import Word2Vec

# training_data: initial training data, a list of tokenized sentences
model = Word2Vec(training_data, size=50, window=5, min_count=10, workers=4)

# datasmall: additional tokenized sentences
# total_examples: the number of additional sentences
# epochs: your current epoch count; model.epochs is fine
model.train(datasmall, total_examples=len(datasmall), epochs=model.epochs)
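If the extra sentences may contain words that are not already in the model's vocabulary, you also need a vocabulary update before the extra train() call. A minimal sketch of that step (new_sentences is a placeholder name):
# new_sentences: extra tokenized sentences, possibly with previously unseen words
new_sentences = [['some', 'new', 'tokenized', 'sentence']]

# grow the existing vocabulary instead of replacing it
model.build_vocab(new_sentences, update=True)

# train only on the new sentences
model.train(new_sentences, total_examples=len(new_sentences), epochs=model.epochs)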
Upvotes: 7
Reputation: 54153
The total_examples (and epochs) arguments to train() should match what you're currently providing in your more_sentences, not leftover values from prior training. So for example, given your code showing just a single additional sentence, you'd specify total_examples=1.
If this isn't the source of the problem, double-check that more_sentences is what you expect it to be at the time of the train() call.
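Concretely, for the single extra sentence in the question, the call would look roughly like this (a sketch; model.epochs is the newer name for model.iter in recent gensim releases):
model.build_vocab(more_sentences, update=True)
# total_examples should count the sentences passed to this call, not the original corpus
model.train(more_sentences, total_examples=len(more_sentences), epochs=model.epochs)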
Upvotes: 1