Antonio Albanese

Reputation: 68

TensorFlow stuck for seconds at the end of every epoch

I'm training a neural network over a TFRecordDataset. However, at the end of every epoch, i.e. when the progress bar shows ETA: 0s, training gets stuck for tens of seconds. For reference, one epoch takes around a minute to complete over a dataset of around 25GB (before parsing a subset of the features).

I'm running TensorFlow 2.3.1 with an NVIDIA Titan RTX GPU. Is this the intended behavior? Could it be due to the preprocessing in the input pipeline? Is that preprocessing performed by the CPU only, or is it offloaded to the GPU? Thanks!

Upvotes: 0

Views: 518

Answers (1)

Nicolas Gervais

Reputation: 36604

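If you have a validation set and you're using model.fit(), it's probably the time it takes to calculate the validation loss and metrics, which happens at the end of every epoch. With an 80/20 split, the validation set is 25% the size of the training set, so in most cases you should expect up to roughly 25% extra time per epoch for that step.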
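To see where the 25% figure comes from, here is a rough back-of-the-envelope sketch in plain Python. The function name validation_overhead and the relative_cost parameter are made up for illustration and are not part of TensorFlow. It assumes each validation sample costs about as much as a training sample; in practice validation is often cheaper because it skips backpropagation, so treat the result as an upper estimate.

```python
def validation_overhead(train_samples, val_samples, relative_cost=1.0):
    """Estimate validation time as a fraction of training time per epoch.

    relative_cost is a hypothetical knob: the cost of one validation sample
    relative to one training sample (1.0 means equal cost).
    """
    if train_samples <= 0:
        raise ValueError("train_samples must be positive")
    return (val_samples / train_samples) * relative_cost


# 80/20 split: the validation set is a quarter the size of the training set.
overhead = validation_overhead(train_samples=80, val_samples=20)
print(f"extra time per epoch: {overhead:.0%}")

# Applied to the roughly one-minute epoch from the question.
epoch_seconds = 60
print(f"estimated pause: {overhead * epoch_seconds:.0f} s")
```

Under these assumptions, a one-minute epoch would pause for about 15 seconds, which is in line with the "tens of seconds" described in the question.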

Upvotes: 2
