Reputation: 26067
My models train quickly, but training slows down because I'm saving the best model (to load in another process), and the saving itself adds latency. In the early stages of fitting almost every epoch improves on the last, so the model is saved almost every epoch and the delay keeps adding up.
Is there a way to save the best model only after X epochs, or to save it in the background, so that training isn't delayed by saving too often?
For clarity, this is how I'm running ModelCheckpoint in Keras/TF2:
from tensorflow.keras.callbacks import ModelCheckpoint

# save the model whenever the training loss improves
filepath = "BestModel.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')
callbacks_list = [checkpoint]
# fit the model
model.fit(x, y, epochs=40, batch_size=50, callbacks=callbacks_list)
Upvotes: 0
Views: 1603
Reputation: 33460
You can use the save_freq argument of the ModelCheckpoint callback to control how often the model is saved. By default it is set to 'epoch', which means the model is saved at the end of each epoch; it can also be set to an integer, in which case the model is saved after that many batches. Here is the relevant part of the documentation for reference:
save_freq: 'epoch' or integer. When using 'epoch', the callback saves the model after each epoch. When using integer, the callback saves the model at end of this many batches. If the Model is compiled with experimental_steps_per_execution=N, then the saving criteria will be checked every Nth batch. Note that if the saving isn't aligned to epochs, the monitored metric may potentially be less reliable (it could reflect as little as 1 batch, since the metrics get reset every epoch). Defaults to 'epoch'.
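To keep the save check aligned with epoch boundaries (so the monitored loss covers a full epoch), one option is to set save_freq to a multiple of the number of batches per epoch. Here is a minimal sketch of that arithmetic. The helper names save_freq_for_epochs and build_checkpoint are hypothetical, not part of Keras, and the sketch assumes model.fit goes through the whole dataset each epoch (no steps_per_epoch override):

```python
import math


def save_freq_for_epochs(num_samples, batch_size, every_n_epochs):
    """Return the number of batches that make up `every_n_epochs` full epochs."""
    # The last batch of an epoch may be partial, hence the ceiling.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * every_n_epochs


def build_checkpoint(filepath, num_samples, batch_size, every_n_epochs):
    """Hypothetical helper: a ModelCheckpoint that checks only every N epochs."""
    # Imported lazily so the arithmetic helper works without TensorFlow installed.
    from tensorflow.keras.callbacks import ModelCheckpoint

    return ModelCheckpoint(
        filepath,
        monitor='loss',
        verbose=1,
        save_best_only=True,
        mode='min',
        save_freq=save_freq_for_epochs(num_samples, batch_size, every_n_epochs),
    )


# Example: 1000 samples with batch_size=50 gives 20 batches per epoch,
# so checking every 5 epochs means save_freq=100.
print(save_freq_for_epochs(1000, 50, 5))  # prints 100
```

With the setup from the question, this would be used as checkpoint = build_checkpoint("BestModel.hdf5", len(x), 50, 5). Note that this only reduces how often the save check happens; the save itself still runs in the training process, not in the background.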
Upvotes: 1