Reputation: 785
I am training a deep network (GoogLeNet) on an image classification problem. I have a dataset of around 7,300 images labeled with only 2 classes.
I split the dataset into a training set and a validation set in a 0.66 / 0.33 ratio.
During training I compute the mean error on the training set and on the validation set to see how they evolve.
The thing is that these two values are always equal (or really close).
So maybe it is not an issue, but I did not expect it. Since I am training on my training set, I expected the mean error on the training set to always be less than the mean error on the validation set (even if I hoped the two values would converge towards roughly the same value).
Could someone here tell me whether this is normal or not? If it is expected, why? And if it is not, any idea what is going on?
Further info that might be useful: I use mini-batches of 50 and the Adam optimizer; my loss is computed with tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_predict); and I use a dropout of 0.4 (but when I compute the mean error I make sure it is set to 1).
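For reference, here is a minimal sketch of that setup in TF1-style graph code. The placeholder names, shapes, and the single dense layer standing in for the actual network are assumptions for illustration only:

```python
# Minimal sketch of the setup described above (TF1-style graph code).
# Names (x, y_, keep_prob) and shapes are illustrative assumptions,
# and a single dense layer stands in for the real network.
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 224, 224, 3])   # input images
y_ = tf.placeholder(tf.float32, [None, 2])            # one-hot labels, 2 classes
keep_prob = tf.placeholder(tf.float32)                # dropout keep probability

features = tf.layers.flatten(x)                       # stand-in for the real network
dropped = tf.nn.dropout(features, keep_prob)          # dropout before the head
y_predict = tf.layers.dense(dropped, 2)               # logits

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y_predict))
train_step = tf.train.AdamOptimizer().minimize(loss)

# Training: dropout enabled, as described in the question.
# sess.run(train_step, feed_dict={x: batch_x, y_: batch_y, keep_prob: 0.4})
# Evaluating the mean error: dropout disabled by setting it to 1.
# sess.run(loss, feed_dict={x: val_x, y_: val_y, keep_prob: 1.0})
```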
Thank you.
Upvotes: 0
Views: 71
Reputation: 77857
This is quite reasonable. You partitioned your data into two random samples from the same population, so yes, they should have nearly identical mean errors, given the sizes of the samples. This is a simple consequence of the law of large numbers: large samples drawn from the same population will tend to have the same mean.
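As a quick illustration (plain NumPy; the "per-image errors" here are a made-up stand-in, not your model's actual values):

```python
# Two random splits of the same population have nearly identical means
# once the samples are reasonably large.
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-image errors for a dataset of ~7,300 images.
population = rng.normal(loc=0.3, scale=0.1, size=7300)

idx = rng.permutation(len(population))
split = int(0.66 * len(population))
train, val = population[idx[:split]], population[idx[split:]]

print(train.mean(), val.mean())  # e.g. ~0.300 vs ~0.301 -- almost identical
```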
Upvotes: 1