nastaran

Reputation: 132

What does steps mean in the train method of tf.estimator.Estimator?

I'm completely confused about the meaning of epochs and steps. I also read the question "What is the difference between steps and epochs in TensorFlow?", but I'm not sure about the answer. Consider this piece of code:

EVAL_EVERY_N_STEPS = 100
MAX_STEPS = 10000

nn = tf.estimator.Estimator(
        model_fn=model_fn,
        model_dir=args.model_path,
        params={"learning_rate": 0.001},
        config=tf.estimator.RunConfig())

for _ in range(MAX_STEPS // EVAL_EVERY_N_STEPS):
    print(_)

    nn.train(input_fn=train_input_fn,
             hooks=[train_qinit_hook, step_cnt_hook],
             steps=EVAL_EVERY_N_STEPS)

    if args.run_validation:
        results_val = nn.evaluate(input_fn=val_input_fn,
                                  hooks=[val_qinit_hook, val_summary_hook],
                                  steps=EVAL_STEPS)

        print('Step = {}; val loss = {:.5f};'.format(
            results_val['global_step'],
            results_val['loss']))

Also, the number of training samples is 400. I take MAX_STEPS // EVAL_EVERY_N_STEPS to be the number of epochs (or iterations), so the number of epochs is 100. What does steps mean in nn.train?

Upvotes: 0

Views: 1081

Answers (1)

Olivier Dehaene

Reputation: 1680

In Deep Learning:

  • an epoch means one pass over the entire training set.
  • a step or iteration corresponds to one forward pass and one backward pass over a single batch.

If your dataset is not split and is passed to the algorithm as a whole, each step corresponds to one epoch. Usually, though, a training set is divided into N mini-batches; then each step processes one batch, and you need N steps to complete a full epoch.
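To make the relationship concrete, here is a minimal sketch in plain Python (no TensorFlow involved). The helper name steps_per_epoch is an assumption for illustration, not part of the Estimator API:

```python
import math

def steps_per_epoch(n_training_samples, batch_size):
    """Number of steps (mini-batches) needed to see every sample once."""
    # The last batch may be smaller than the others, so round up.
    return math.ceil(n_training_samples / batch_size)

# Whole dataset passed at once: one step is one epoch.
print(steps_per_epoch(400, 400))  # 1

# Mini-batches of 4 samples: 100 steps make one epoch.
print(steps_per_epoch(400, 4))    # 100
```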

Here, with 400 training samples, if batch_size == 4 then 400 / 4 = 100 steps make up one epoch, so each call to nn.train with steps=100 trains for one epoch.

epochs = batch_size * steps // n_training_samples
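Applied to the numbers in the question, the formula can be checked with a short sketch. Two things are assumptions here: batch_size = 4 (the question does not state it) and that train_input_fn keeps supplying batches for the requested number of steps. The helper epochs_completed is a hypothetical name. Each nn.train(steps=...) call runs that many additional steps, so the loop accumulates 100 calls of 100 steps each:

```python
EVAL_EVERY_N_STEPS = 100
MAX_STEPS = 10000
n_training_samples = 400
batch_size = 4  # assumption: not given in the question

def epochs_completed(steps, batch_size, n_training_samples):
    """Whole epochs covered after `steps` training steps."""
    return batch_size * steps // n_training_samples

# One nn.train call in the loop runs EVAL_EVERY_N_STEPS steps.
print(epochs_completed(EVAL_EVERY_N_STEPS, batch_size, n_training_samples))  # 1

# The loop makes MAX_STEPS // EVAL_EVERY_N_STEPS such calls.
n_train_calls = MAX_STEPS // EVAL_EVERY_N_STEPS
total_steps = n_train_calls * EVAL_EVERY_N_STEPS
print(epochs_completed(total_steps, batch_size, n_training_samples))  # 100
```

With a different batch size the loop count would no longer match the epoch count; for example, batch_size = 8 would give two epochs per call.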

Upvotes: 3
