dididaisy

Reputation: 141

gensim - Word2vec continue training on existing model - AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

I am trying to continue training an existing model:

model = gensim.models.Word2Vec.load('model/corpus.zhwiki.word.model')
more_sentences = [['Advanced', 'users', 'can', 'load', 'a', 'model', 'and', 'continue', 'training', 'it', 'with', 'more', 'sentences']]    
model.build_vocab(more_sentences, update=True)
model.train(more_sentences, total_examples=model.corpus_count, epochs=model.iter)

but I got an error with the last line:

AttributeError: 'Word2Vec' object has no attribute 'compute_loss'

Some posts said this is caused by the model having been saved with an earlier version of gensim, so I tried adding the following line after loading the existing model and before calling train():

model.compute_loss = False

After that, the AttributeError went away, but model.train() returns 0 and the model was not trained on the new sentences.

How can I solve this problem?

Upvotes: 6

Views: 5470

Answers (2)

Haha TTpro

Reputation: 5546

Here is how I continue training my model:

from gensim.models import Word2Vec

# training_data: the initial training data, a list of tokenized sentences
model = Word2Vec(training_data, size=50, window=5, min_count=10, workers=4)

# datasmall: the additional sentences, in the same tokenized format
# total_examples: the number of additional sentences
# epochs: the number of passes over datasmall; model.epochs reuses the model's setting
model.train(datasmall, total_examples=len(datasmall), epochs=model.epochs)

Upvotes: 7

gojomo

Reputation: 54153

The total_examples (and epochs) arguments to train() should match what you're currently providing, in your more_sentences – not leftover values from prior training.

So for example, given your code showing just a single additional sentence, you'd specify total_examples=1.
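As a rough sketch of that advice, the helper below derives total_examples from the new data itself instead of reusing model.corpus_count. The name continue_training is hypothetical and not part of gensim; it assumes model is a loaded Word2Vec instance that exposes build_vocab(), train(), and an epochs attribute.

```python
def continue_training(model, new_sentences, epochs=None):
    """Update the vocabulary and train on the new sentences only.

    Hypothetical helper, not a gensim function. `model` is assumed to be
    a loaded gensim Word2Vec instance.
    """
    # Materialize the corpus so len() works and it can be iterated twice.
    new_sentences = list(new_sentences)
    if not new_sentences:
        raise ValueError("new_sentences is empty; nothing to train on")

    if epochs is None:
        # Older gensim releases exposed this as `model.iter`, as in the question.
        epochs = model.epochs

    # Add any new words to the existing vocabulary before training.
    model.build_vocab(new_sentences, update=True)

    # total_examples describes the data passed in now, not the original corpus.
    return model.train(
        new_sentences,
        total_examples=len(new_sentences),
        epochs=epochs,
    )
```

With the code from the question, the call would be continue_training(model, more_sentences), which passes total_examples=1 for the single extra sentence.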

If this isn't the source of the problem, double check that more_sentences is what you expect it to be at the time of the train() call.
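One way to do that check is sketched below. The function summarize_sentences is a hypothetical helper, not a gensim API; it assumes each sentence is a list of string tokens and that known_words is any container supporting the in operator, such as model.wv.

```python
def summarize_sentences(sentences, known_words):
    """Return (n_sentences, n_tokens, n_known_tokens) for a corpus.

    Hypothetical diagnostic helper. `known_words` is any container that
    supports `in`, for example the `wv` attribute of a Word2Vec model.
    """
    n_sentences = n_tokens = n_known = 0
    for sentence in sentences:
        # A bare string would be iterated character by character,
        # which is a common mistake when building the corpus.
        if isinstance(sentence, str):
            raise TypeError("each sentence must be a list of tokens, not a string")
        n_sentences += 1
        for token in sentence:
            n_tokens += 1
            if token in known_words:
                n_known += 1
    return n_sentences, n_tokens, n_known
```

Calling summarize_sentences(more_sentences, model.wv) just before train() shows how many sentences and tokens are about to be passed in, and how many of those tokens the model already knows. If the known-token count is 0, train() has no in-vocabulary words to learn from, which may explain a result of 0.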

Upvotes: 1
