Reputation: 11
I use IntelliJ IDEA to build a Maven project; the code is as follows:
System.out.println("Load data....");
SentenceIterator iter = new LineSentenceIterator(new File("/home/zs/programs/deeplearning4j-master/dl4j-test-resources/src/main/resources/raw_sentences.txt"));
iter.setPreProcessor(new SentencePreProcessor() {
    @Override
    public String preProcess(String sentence) {
        return sentence.toLowerCase();
    }
});
System.out.println("Build model....");
int batchSize = 1000;
int iterations = 30;
int layerSize = 300;
com.sari.Word2Vec vec = new com.sari.Word2Vec.Builder()
        .batchSize(batchSize)       // # words per minibatch
        .sampling(1e-5)             // subsampling threshold: randomly drops very frequent words
        .minWordFrequency(5)        // ignore words seen fewer than 5 times
        .useAdaGrad(false)
        .layerSize(layerSize)       // word feature vector size
        .iterations(iterations)     // # iterations to train
        .learningRate(0.025)        // initial learning rate
        .minLearningRate(1e-2)      // floor for the decaying learning rate
        .negativeSample(10)         // negative sampling: 10 noise words per sample
        .iterate(iter)
        .tokenizerFactory(tokenizer)
        .build();
vec.fit();
System.out.println("Evaluate model....");
double cosSim = vec.similarity("day" , "night");
System.out.println("Similarity between day and night: "+cosSim);
This code is based on the word2vec example in deeplearning4j, but the results are unstable: each run of the experiment gives very different numbers. For example, the cosine similarity between 'day' and 'night' is sometimes as high as 0.98 and sometimes as low as 0.4.
Here are the results of two experiments:
Evaluate model....
Similarity between day and night: 0.706292986869812
Evaluate model....
Similarity between day and night: 0.5550910234451294
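For reference, the number printed above is the cosine similarity of the two learned word vectors, which can be computed directly. A minimal standalone sketch (plain Java, no dl4j dependency; the 3-dimensional vectors are hypothetical, real models use `layerSize` dimensions):

```java
public class CosineSimilarity {
    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction, 0.0 means orthogonal
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical low-dimensional "word vectors" for illustration only
        double[] day = {0.2, 0.8, 0.1};
        double[] night = {0.3, 0.7, 0.2};
        System.out.println(cosine(day, night));
    }
}
```

Because the vectors themselves come out different on every training run, the cosine between them changes too.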
Why do the results vary like this? I have just started learning word2vec and there is a lot I don't understand yet, so I would appreciate any help. Thanks!
Upvotes: 1
Views: 2168
Reputation: 123
You have set the following line:
.minLearningRate(1e-2) // learning rate decays wrt # words. floor learning
But that is an extremely high floor for the learning rate. Such a high floor prevents the model from 'settling' into any state; instead, even a few late updates can significantly change the learned representation. That is not a problem during the first few updates, but it is bad for convergence.
Solution: allow the learning rate to decay. You can leave this line out completely, or, if you must set it, use a much smaller value such as 1e-15.
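To see the effect, here is a small standalone sketch (plain Java; the linear decay schedule and the numbers are illustrative, not dl4j internals) of a learning rate that decays per word seen but is clipped at a floor:

```java
public class LearningRateDecay {
    // Linearly decay the learning rate over the corpus, but never let it drop below the floor
    static double decayedRate(double start, double floor, long wordsSeen, long totalWords) {
        double lr = start * (1.0 - (double) wordsSeen / totalWords);
        return Math.max(lr, floor);
    }

    public static void main(String[] args) {
        double start = 0.025;
        long total = 1_000_000;
        // With floor 1e-2, decay stops well before the end of training:
        // late updates stay large, so the representation keeps jumping around
        System.out.println(decayedRate(start, 1e-2, 900_000, total));
        // With a tiny floor, late updates become small and the model can settle
        System.out.println(decayedRate(start, 1e-15, 900_000, total));
    }
}
```

With the high floor, 90% of the way through training the rate is still pinned at 0.01; with a tiny floor it has decayed to about a tenth of that, so the final state is much more stable from run to run.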
Upvotes: 1