In tensorflow estimator, what does it mean for num_epochs to be None?

Question

I am confused by the documentation for TensorFlow's tf.estimator.inputs.numpy_input_fn, specifically the line on num_epochs:

num_epochs: Integer, number of epochs to iterate over data. If None will run forever.

If I set num_epochs to None, will the training run forever?
What does it even mean for it to run forever?

That doesn't make sense to me, since I can't imagine anyone designing a program in such a way that it might run forever.

Could someone explain?

Answering my own question: I think I've found the answer here: https://www.tensorflow.org/versions/r1.3/get_started/input_fn#evaluating_the_model

Specifically, in the section "Building the input_fn":

Two additional arguments are provided: num_epochs: controls the number of epochs to iterate over data. For training, set this to None, so the input_fn keeps returning data until the required number of train steps is reached. For evaluate and predict, set this to 1, so the input_fn will iterate over the data once and then raise OutOfRangeError. That error will signal the Estimator to stop evaluate or predict.
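To make the quoted behavior concrete, here is a minimal pure-Python sketch. It is not TensorFlow's implementation: make_input_fn and run are hypothetical stand-ins for an input_fn and for a train/evaluate loop, and an exhausted generator plays the role of OutOfRangeError.

```python
import itertools

def make_input_fn(data, batch_size, num_epochs=None):
    """Hypothetical stand-in for an input_fn that yields batches of `data`.

    num_epochs=None repeats the data indefinitely; an integer stops after
    that many passes (the generator is then exhausted, which plays the
    role of OutOfRangeError here).
    """
    def input_fn():
        epochs = itertools.count() if num_epochs is None else range(num_epochs)
        for _ in epochs:
            for start in range(0, len(data), batch_size):
                yield data[start:start + batch_size]
    return input_fn

def run(input_fn, steps=None):
    """Consume batches until `steps` is reached or the input runs out."""
    batches = input_fn()
    if steps is not None:
        batches = itertools.islice(batches, steps)
    return sum(1 for _ in batches)

data = list(range(10))  # 10 examples, so 3 batches of size 4 per epoch

# Training: endless input, bounded by the requested number of steps.
train_steps = run(make_input_fn(data, batch_size=4, num_epochs=None), steps=7)

# Evaluation: one pass over the data, bounded by the input running out.
eval_steps = run(make_input_fn(data, batch_size=4, num_epochs=1))

print(train_steps, eval_steps)  # 7 3
```

In this sketch, calling run with num_epochs=None and steps=None would never return, which is the "run forever" case the documentation warns about.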

KRish · Accepted Answer

If num_epochs is None, the input_fn iterates over the dataset indefinitely and never signals the end of the data on its own. Training therefore continues until something else stops it: either the required number of train steps is reached, or you stop it manually. You could, for example, monitor your training and test losses (and/or any other metrics) and stop training when your model converges or starts to overfit.
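As a rough illustration of stopping on a monitored loss, the toy loop below reads from an endless stream and halts once the per-epoch loss stops improving. Everything in it is an assumption made for the example (the function name train_until_plateau, the one-weight model, the tolerance); it is not how Estimator itself decides when to stop.

```python
import itertools

def train_until_plateau(data, lr=0.1, tol=1e-4, max_epochs=1000):
    """Toy loop: fit one weight w to an endless stream, stop on a plateau.

    Hypothetical illustration only; not part of any TensorFlow API.
    """
    stream = itertools.cycle(data)  # endless input, like num_epochs=None
    w, steps, prev_loss = 0.0, 0, float("inf")
    for _ in range(max_epochs):     # safety cap so the loop cannot hang
        epoch_loss = 0.0
        for x in itertools.islice(stream, len(data)):
            epoch_loss += (x - w) ** 2
            w += lr * (x - w)       # gradient step on 0.5 * (x - w)**2
            steps += 1
        epoch_loss /= len(data)
        if prev_loss - epoch_loss < tol:  # loss no longer improving
            break
        prev_loss = epoch_loss
    return w, steps

w, steps = train_until_plateau([1.0, 2.0, 3.0])
print(f"stopped after {steps} steps with w = {w:.3f}")
```

The input never ends on its own; the loop ends only because the stopping rule (or the max_epochs cap) says so.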
