BenJacob

Reputation: 987

TensorFlow evaluation frequency

I am using the train_and_evaluate function in TensorFlow and want the eval step to happen more frequently (triggered either by the global step or by time elapsed). This is my code (the model function is not shown).

def get_classifier(batch_size):
    config = tf.estimator.RunConfig(
        model_dir="models/shape_model_cnn_3",
        save_checkpoints_secs=300,
        save_summary_steps=100)

    params = tf.contrib.training.HParams(
        batch_size=batch_size,
        num_conv=[48,64,96], # Sizes of each convolutional layer
        conv_len=[2,3,4], # Kernel size of each convolutional layer
        num_nodes=128, # Number of LSTM nodes for each LSTM layer
        num_layers=3, # Number of LSTM layers
        num_classes=7, # Number of classes in final layer
        learning_rate=0.0001,
        gradient_clipping_norm=9.0,
        dropout=0.3)

    classifier = tf.estimator.Estimator(
        model_fn=my_model,
        config=config,
        params=params
    )

    return classifier

classifier = get_classifier(8)

train_spec = tf.estimator.TrainSpec(
    input_fn=lambda:input.batch_dataset("dataset/shape-train-???.tfrecords", tf.estimator.ModeKeys.TRAIN, 8),
    max_steps=100000
)

eval_spec = tf.estimator.EvalSpec(
    input_fn=lambda:input.batch_dataset("dataset/shape-eval-???.tfrecords", tf.estimator.ModeKeys.EVAL, 8)
)

tf.estimator.train_and_evaluate(classifier, train_spec, eval_spec)

I have tried using the start_delay_secs parameter in my EvalSpec. I'm not sure if this is what it is for, but it doesn't seem to have any effect anyway.

Upvotes: 3

Views: 2745

Answers (4)

ch9lb

Reputation: 51

When I set save_checkpoints_steps, evaluation runs after the specified number of steps. This configuration:

tf.estimator.RunConfig(save_summary_steps=5, log_step_count_steps=3, save_checkpoints_steps=40)

gives an evaluation every 40 steps.

Upvotes: 1

LeckieNi

Reputation: 466

Use tf.contrib.learn.Experiment instead.

For example:

from tensorflow.contrib.learn.python.learn import learn_runner

def experiment_fn(run_config, hparams):
    # learn_runner.run calls this with the run_config and hparams passed below
    return tf.contrib.learn.Experiment(
        estimator=estimator,  # Estimator
        train_input_fn=train_input_fn,  # First-class function
        eval_input_fn=eval_input_fn,  # First-class function
        train_steps=hparams.train_steps,  # Minibatch steps
        min_eval_frequency=hparams.min_eval_frequency,  # Eval frequency
        train_monitors=[train_input_hook],  # Hooks for training
        eval_hooks=[eval_input_hook],  # Hooks for evaluation
        eval_steps=None  # Use evaluation feeder until it's empty
    )

learn_runner.run(
    experiment_fn=experiment_fn,  # Function that builds the Experiment
    run_config=run_config,  # RunConfig
    schedule="train_and_evaluate",  # What to run
    hparams=params  # HParams
)

Upvotes: 0

BenJacob

Reputation: 987

I have found that EvalSpec has a parameter, `throttle_secs`, which sets the minimum number of seconds between evaluations (an evaluation still only runs when a new checkpoint is available). Alternatively, if you want to evaluate based on a number of steps, you can use a for loop and incrementally increase max_steps, as suggested by @Kathy Wu.
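
To make the throttle_secs part concrete, here is a toy model in plain Python (it is not TensorFlow code). It assumes an evaluation is attempted each time a checkpoint is saved and is skipped if fewer than throttle_secs seconds have passed since the last evaluation started, and that the clock starts when training starts. The name simulate_eval_times is made up for this illustration, and the real scheduling differs between TensorFlow versions.

```python
def simulate_eval_times(checkpoint_times, throttle_secs):
    """Toy model: return the checkpoint times at which an evaluation would start."""
    eval_times = []
    last_eval_start = 0.0  # assumption: the throttle clock starts with training
    for t in sorted(checkpoint_times):
        # Evaluate only if enough time has passed since the last evaluation began.
        if t - last_eval_start >= throttle_secs:
            eval_times.append(t)
            last_eval_start = t
    return eval_times


# Checkpoints every 300 s for 30 minutes, as with save_checkpoints_secs=300.
checkpoints = list(range(300, 1801, 300))
print(simulate_eval_times(checkpoints, throttle_secs=600))
print(simulate_eval_times(checkpoints, throttle_secs=60))
```

Under these assumptions, with checkpoints every 300 seconds, a throttle_secs of 600 (the EvalSpec default) lets only every second checkpoint trigger an evaluation, while a small throttle_secs lets every checkpoint do so. Lowering throttle_secs alone cannot make evaluation more frequent than checkpointing.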

Upvotes: 0

kww

Reputation: 549

You can set max_steps to a lower number in order to evaluate sooner.

This will reset the input function. Currently, there is no way to pause the input function and resume at the same state using estimator. We are looking into adding this feature.
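
A rough sketch of that approach, assuming an Estimator-like object whose train(input_fn, max_steps) and evaluate(input_fn) methods behave as in tf.estimator.Estimator. The helper name train_with_periodic_eval and its eval_every argument are hypothetical, and as noted above each train call restarts the input function.

```python
def train_with_periodic_eval(estimator, train_input_fn, eval_input_fn,
                             total_steps, eval_every):
    """Alternate training and evaluation roughly every `eval_every` global steps."""
    results = []
    for target in range(eval_every, total_steps + eval_every, eval_every):
        max_steps = min(target, total_steps)  # never train past total_steps
        # train() returns once the global step reaches max_steps;
        # the next call resumes from the latest checkpoint.
        estimator.train(input_fn=train_input_fn, max_steps=max_steps)
        # evaluate() runs on the most recent checkpoint and returns a metrics dict.
        results.append(estimator.evaluate(input_fn=eval_input_fn))
    return results
```

With the objects from the question this might be called as train_with_periodic_eval(classifier, train_spec.input_fn, eval_spec.input_fn, total_steps=100000, eval_every=1000).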

Upvotes: 0
