HereItIs

Reputation: 154

Keras validation progress output incorrectly shows training steps

When I call model.fit_generator() on my model, training progress is displayed as expected. However, the training bar stops one step short of the total and then validation begins. The validation phase shows the same progress bar total as training, even though the number of validation steps is completely different (~70k training steps vs. ~8k validation steps). The validation progress bar stops when it reaches the 8k steps, e.g.:

75999/76000 [===========================>..] - ETA: 0s - loss: 0.4556 - acc: 0.840Epoch 1/500
8200/76000 [====>........................] - ETA: 0s - loss: 0.9822 - acc: 0.7564

The first line is the training, second is validation.

When I set the steps manually so that there are fewer training steps than validation steps, I get the following output:

19/20 [===========================>..] - ETA: 0s - loss: 0.4558 - acc: 0.8980Epoch 1/500
19/20 [===========================>..] - ETA: 0s - loss: 0.8200 - acc: 0.7730

The output pauses here while the remaining validation steps run. Their progress is not shown in the progress bar.

This happens both when val_steps and train_steps are derived from my generator and when I set them manually as above, so I don't think the issue is with my generator. Here is my call to fit_generator() (the behaviour is the same when I use .fit()):

model.fit_generator(
    train_generator,
    steps_per_epoch=train_steps,
    epochs=epochs,
    validation_data=val_generator,
    validation_steps=val_steps,
    verbose=1,
    callbacks=[weight_saving_callback, early_stopping],
    max_queue_size=40,
    workers=1,
    use_multiprocessing=False,
    # train_class_weight=None,  # because we are not using target classes
    # val_class_weight=None,  # because we are not using target classes
    validation_freq=1)

Can anyone see where this bug is? I don't think it's affecting the training process, only the output, but I can't figure out where the problem is. I'm using TF 2.1 and Keras 2.3.1.

Simply put: why does the validation progress bar not show the correct number of validation steps?

Upvotes: 1

Views: 361

Answers (1)

Gerry P

Reputation: 8092

In my experience, printing your own information at the end of an epoch messes up the TensorFlow printout. What I ended up doing was to store the items I want to print in class variables, pass them to an on_epoch_begin function, and print the information there. That does not seem to interfere with TensorFlow's own printout at the end of an epoch.
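For illustration, here is a minimal sketch of that idea as a custom Keras callback. It is one reading of the approach described above, not code from the answer: the class name DeferredEpochReport and the attribute pending are hypothetical, and the fallback base class exists only so the sketch can be run without TensorFlow installed. It assumes the values in logs are numeric scalars. Whether it changes the progress bar behaviour in the question has not been verified.

```python
try:
    from tensorflow.keras.callbacks import Callback
except ImportError:
    # Stand-in base class (assumption) so the sketch runs without TensorFlow.
    class Callback:
        pass


class DeferredEpochReport(Callback):
    """Hypothetical callback: saves end-of-epoch values and prints them
    at the start of the next epoch instead of at the end of the current one."""

    def __init__(self):
        super().__init__()
        self.pending = None  # message saved by on_epoch_end, if any

    def on_epoch_begin(self, epoch, logs=None):
        # Print whatever the previous epoch left behind, then clear it.
        if self.pending is not None:
            print(self.pending)
            self.pending = None

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        # Store the message rather than printing it here.
        metrics = ", ".join(f"{k}={v:.4f}" for k, v in sorted(logs.items()))
        self.pending = f"epoch {epoch + 1}: {metrics}"

    def on_train_end(self, logs=None):
        # Flush the last epoch's message, which has no following epoch.
        if self.pending is not None:
            print(self.pending)
            self.pending = None
```

It would be passed alongside the existing callbacks, e.g. callbacks=[weight_saving_callback, early_stopping, DeferredEpochReport()].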

Upvotes: 1
