tim_xyz

Reputation: 13481

Accuracy decreases during training of a single batch with Keras?

Normally, when training a deep neural network with Keras, training accuracy increases as it trains over a single batch.

Like so:

 10/189 [>.............................] - ETA: 9s - loss: 0.6919 - acc: 0.8000
 20/189 [==>...........................] - ETA: 4s - loss: 0.6905 - acc: 0.9000
 40/189 [=====>........................] - ETA: 2s - loss: 0.6879 - acc: 0.9500
 60/189 [========>.....................] - ETA: 1s - loss: 0.6852 - acc: 0.9667
 80/189 [===========>..................] - ETA: 1s - loss: 0.6821 - acc: 0.9750
 90/189 [=============>................] - ETA: 1s - loss: 0.6806 - acc: 0.9778
100/189 [==============>...............] - ETA: 0s - loss: 0.6785 - acc: 0.9800
120/189 [==================>...........] - ETA: 0s - loss: 0.6764 - acc: 0.9667
130/189 [===================>..........] - ETA: 0s - loss: 0.6743 - acc: 0.9692
140/189 [=====================>........] - ETA: 0s - loss: 0.6721 - acc: 0.9714
160/189 [========================>.....] - ETA: 0s - loss: 0.6691 - acc: 0.9688
180/189 [===========================>..] - ETA: 0s - loss: 0.6650 - acc: 0.9722
189/189 [==============================] - 1s 8ms/step - loss: 0.6630 - acc: 0.9735
Epoch 1/1

But sometimes (typically on later batches) accuracy decreases:

 10/190 [>.............................] - ETA: 1s - loss: 0.0114 - acc: 1.0000
 20/190 [==>...........................] - ETA: 0s - loss: 0.0073 - acc: 1.0000
 30/190 [===>..........................] - ETA: 0s - loss: 0.0067 - acc: 1.0000
 40/190 [=====>........................] - ETA: 0s - loss: 0.0105 - acc: 1.0000
 50/190 [======>.......................] - ETA: 0s - loss: 0.0785 - acc: 0.9800
 60/190 [========>.....................] - ETA: 0s - loss: 0.0729 - acc: 0.9833
 70/190 [==========>...................] - ETA: 0s - loss: 0.0632 - acc: 0.9857
 80/190 [===========>..................] - ETA: 0s - loss: 0.1083 - acc: 0.9750
 90/190 [=============>................] - ETA: 0s - loss: 0.1396 - acc: 0.9667
100/190 [==============>...............] - ETA: 0s - loss: 0.1291 - acc: 0.9700
110/190 [================>.............] - ETA: 0s - loss: 0.1180 - acc: 0.9727
120/190 [=================>............] - ETA: 0s - loss: 0.1133 - acc: 0.9750
130/190 [===================>..........] - ETA: 0s - loss: 0.1050 - acc: 0.9769
140/190 [=====================>........] - ETA: 0s - loss: 0.0980 - acc: 0.9786
150/190 [======================>.......] - ETA: 0s - loss: 0.0923 - acc: 0.9800
160/190 [========================>.....] - ETA: 0s - loss: 0.0866 - acc: 0.9812
170/190 [=========================>....] - ETA: 0s - loss: 0.0848 - acc: 0.9824
180/190 [===========================>..] - ETA: 0s - loss: 0.0802 - acc: 0.9833
190/190 [==============================] - 1s 5ms/step - loss: 0.0762 - acc: 0.9842

I do keep a test set aside, so I'm not terribly worried about this. I'm just wondering how this is even possible, given that the model is optimizing itself on that single batch.

The model for reference:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(12, input_dim=191226, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

...

model.fit(X_train_a, y_train_a, epochs=1, batch_size=10)

Upvotes: 0

Views: 934

Answers (1)

Dr. Snoopy

Reputation: 56357

The accuracy you see in the progress bar is actually a running mean of the per-batch accuracy, so it can go up or down; the model does not have to achieve increasing accuracy on every batch.
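To see why the displayed number can dip, here is a minimal sketch of the running mean. The per-batch accuracies are made-up values, not from the question's run:

```python
# Hypothetical per-batch accuracies within one epoch
batch_acc = [1.0, 1.0, 0.8, 1.0]

# What the progress bar displays: the running mean of all batches so far
running = [sum(batch_acc[:i + 1]) / (i + 1) for i in range(len(batch_acc))]
print(running)  # [1.0, 1.0, 0.9333..., 0.95]
```

The displayed accuracy drops after the third batch even though that batch is followed by a perfect one, because one weak batch pulls the cumulative mean down.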

Also note that it is the loss that is being minimized, not accuracy directly, so it is perfectly possible for the loss and the accuracy to go down at the same time; generally, though, accuracy goes up as the loss goes down.
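A toy illustration of loss and accuracy falling together, using made-up probabilities for four samples whose true label is all 1:

```python
import math

# Predicted probabilities before and after an update (true label = 1 for all)
before = [0.90, 0.90, 0.90, 0.51]   # all four correct (p > 0.5)
after  = [0.99, 0.99, 0.99, 0.49]   # three far more confident, one flips to wrong

def bce(probs):
    # Binary cross-entropy when every true label is 1: mean of -log(p)
    return sum(-math.log(p) for p in probs) / len(probs)

def accuracy(probs):
    # Fraction of predictions above the 0.5 threshold
    return sum(p > 0.5 for p in probs) / len(probs)

print(bce(before), accuracy(before))  # ~0.247, 1.00
print(bce(after), accuracy(after))    # ~0.186, 0.75
```

The large confidence gain on three samples lowers the average loss more than the one flipped prediction raises it, so the loss improves while the accuracy drops.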

Upvotes: 5
