Reputation: 65
The following is a snippet of code used in training a simple neural network.
for epoch in range(hm_epochs):
    epoch_loss = 0
    # one pass over the training set, one batch at a time
    for _ in range(int(mnist.train.num_examples / batch_size)):
        epoch_x, epoch_y = mnist.train.next_batch(batch_size)
        # run one optimization step and fetch the batch's cost
        _, c = sess.run([optimizer, cost], feed_dict={x: epoch_x, y: epoch_y})
        epoch_loss += c
    print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)
This isn't the full code, but from what I can see, the inner loop trains on all the training data (split into batches) and updates the weights with an optimizer. With one epoch of training (hm_epochs=1) the accuracy is 90%, but with ten epochs (hm_epochs=10) it's correct 95% of the time. That doesn't make sense to me: how does training on the same data multiple times (which is what happens each time the outer loop runs) make the model any more accurate?
I am new to TensorFlow.
This is not my code, it comes from this series: https://pythonprogramming.net/tensorflow-neural-network-session-machine-learning-tutorial/
Upvotes: 0
Views: 1193
Reputation: 167
On each step the neural network's weights are updated by only a fraction of the information available in the sample; the size of that update is controlled by, for example, the learning rate.
Imagine turning a radio knob to tune in a station. First you turn it quickly to get close to the right frequency (90%), then you turn it slowly back and forth to fine-tune the signal (95%).
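To make that concrete, here is a minimal sketch in plain NumPy (not the asker's TensorFlow model; the single weight, the target value 3.0 and the quadratic loss are made up for illustration). Each gradient step moves the weight only part of the way toward the optimum, so repeating steps over the same objective keeps improving the fit:
import numpy as np

# Toy example: one weight w, loss (w - 3)^2, minimized by gradient descent.
# Each step moves w only part of the way, scaled by the learning rate,
# like turning the knob a bit closer to the station on every pass.
w = 0.0
learning_rate = 0.1
target = 3.0

for step in range(10):
    grad = 2 * (w - target)      # derivative of (w - target)^2 with respect to w
    w -= learning_rate * grad    # small update per step
    print('step', step, 'w =', round(w, 4), 'loss =', round((w - target) ** 2, 4))
After a few steps w is still well short of 3 (the roughly-tuned "90%" stage); letting the loop run longer keeps shrinking the loss (the fine-tuned "95%" stage). Epochs work the same way: each pass over the data is another batch of small steps.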
Upvotes: 1
Reputation: 3974
So you're asking why training for more epochs yields better results? The more epochs you train, the better the optimizer adapts the net's weights to the training set, and so the better the model fits the training data. But if you overdo it, the model fails to generalize and therefore performs poorly on new data. This is known as overfitting.
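Here is a minimal sketch of that trade-off in plain NumPy (the knob being overdone here is model complexity rather than number of epochs, and the data set is invented for illustration, but the failure mode is the same): a very flexible model fits the noisy training points almost perfectly yet predicts held-out points worse than a simple line.
import numpy as np

# Invented data: noisy samples of the line y = x.
rng = np.random.default_rng(0)
x_train = np.linspace(-1, 1, 10)
y_train = x_train + 0.3 * rng.standard_normal(10)
x_val = np.linspace(-0.95, 0.95, 50)   # held-out points
y_val = x_val                          # clean targets

# A simple model (degree 1) versus a very flexible one (degree 9).
for degree in (1, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    print('degree', degree, 'train mse', round(train_mse, 4),
          'val mse', round(val_mse, 4))
The degree-9 polynomial typically drives the training error to nearly zero while its validation error ends up worse than the straight line's: it has fit the noise, which is exactly the trains-well-but-generalizes-poorly gap described above.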
Upvotes: 1