cerebrou

Reputation: 5540

How are weights saved in the CIFAR-10 tutorial for TensorFlow?

In the TensorFlow tutorial to train a network on CIFAR-10, where and how do they save the weights/parameters between running training and evaluation? I cannot see any files saved to my project directory.

Here are the links to the tutorial and the code:
https://www.tensorflow.org/versions/r0.11/tutorials/deep_cnn/index.html
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/models/image/cifar10

Upvotes: 1

Views: 743

Answers (2)

BernardoGO

Reputation: 1856

It saves the logs and checkpoints to the /tmp/ folder by default. The weights are included in the checkpoint files.

As you can see in both the train and eval scripts, the checkpoint directory is passed in as a flag.

cifar10_train.py:

tf.app.flags.DEFINE_string('train_dir', '/tmp/cifar10_train',
                           """Directory where to write event logs """
                           """and checkpoint.""")

cifar10_eval.py:

tf.app.flags.DEFINE_string('eval_dir', '/tmp/cifar10_eval',
                           """Directory where to write event logs.""")
tf.app.flags.DEFINE_string('eval_data', 'test',
                           """Either 'test' or 'train_eval'.""")
tf.app.flags.DEFINE_string('checkpoint_dir', '/tmp/cifar10_train',
                           """Directory where to read model checkpoints.""")

You can call those scripts with custom values for those flags. For my project using Inception I had to change them, because the main hard drive did not have enough space for the bottlenecks that Inception creates.
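To illustrate how a flag with a default value gets overridden from the command line, here is a minimal sketch using Python's standard argparse module. It is only an analogue of the tf.app.flags definitions quoted above, not the tutorial's code; the build_parser helper is hypothetical, and only the flag name and default are taken from cifar10_train.py.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the tutorial's flag definition.
    # The flag name and default mirror the quoted cifar10_train.py snippet.
    parser = argparse.ArgumentParser()
    parser.add_argument('--train_dir', default='/tmp/cifar10_train',
                        help='Directory where to write event logs and checkpoint.')
    return parser


# No arguments: the default under /tmp/ is used.
default_args = build_parser().parse_args([])

# Explicit argument: the default is replaced by the custom folder.
custom_args = build_parser().parse_args(
    ['--train_dir=/home/username/train_folder'])

print(default_args.train_dir)
print(custom_args.train_dir)
```

The point is simply that the /tmp/ path is a default, so nothing is written to the project directory unless a different path is passed in.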

It is good practice to set those values explicitly, since the /tmp/ folder is not persistent and you might otherwise lose your checkpoints and logs.

The following command writes the checkpoints and event logs to a custom folder:

python cifar10_train.py --train_dir="/home/username/train_folder"

and then, to evaluate:

python cifar10_eval.py --checkpoint_dir="/home/username/train_folder"

The same applies to the other TensorFlow examples.
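If you want to confirm that checkpoints are actually being written, a small standard-library sketch like the one below can inspect the training directory. It assumes the checkpoint files are named with a model.ckpt-<step> prefix, which is how I remember the tutorial naming them but is not shown in the snippets above; latest_checkpoint_step is a hypothetical helper, not part of TensorFlow.

```python
import os
import re


def latest_checkpoint_step(train_dir):
    """Return the highest step among files named model.ckpt-<step>, or None.

    Assumption: checkpoint files start with 'model.ckpt-<step>', optionally
    followed by a suffix such as '.meta'.
    """
    if not os.path.isdir(train_dir):
        return None
    pattern = re.compile(r'^model\.ckpt-(\d+)(\.|$)')
    steps = []
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


if __name__ == '__main__':
    # Default location from cifar10_train.py; pass your own --train_dir value here.
    print(latest_checkpoint_step('/tmp/cifar10_train'))
```

A result of None means the directory is missing or holds no matching files yet, for example because training has not reached its first save.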

Upvotes: 0

Daniel De Freitas

Reputation: 2653

Assuming you're running cifar10_train, saving happens on this line:

https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/models/image/cifar10/cifar10_train.py#L122

And the default location is defined in this line (it's "/tmp/cifar10_train"):

https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/models/image/cifar10/cifar10_train.py#L51

In cifar10_eval, restoring the weights happens on this line:

https://github.com/tensorflow/tensorflow/blob/r0.11/tensorflow/models/image/cifar10/cifar10_eval.py#L75
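To see which checkpoint the eval script would pick up, you can look at the small text file named checkpoint that the saver keeps in the training directory. The sketch below reads it with the standard library only. It assumes the file contains a line of the form model_checkpoint_path: "..." and is meant for inspection, not as a replacement for the restore code linked above; read_latest_checkpoint_path is a hypothetical helper.

```python
import os
import re


def read_latest_checkpoint_path(checkpoint_dir):
    """Return the path recorded as the most recent checkpoint, or None.

    Assumption: the directory holds a text file named 'checkpoint' with a
    line like: model_checkpoint_path: "model.ckpt-1000"
    """
    index_path = os.path.join(checkpoint_dir, 'checkpoint')
    if not os.path.isfile(index_path):
        return None
    with open(index_path) as index_file:
        for line in index_file:
            match = re.match(r'\s*model_checkpoint_path:\s*"(.*)"\s*$', line)
            if match:
                path = match.group(1)
                # Relative entries are resolved against the checkpoint directory.
                if not os.path.isabs(path):
                    path = os.path.join(checkpoint_dir, path)
                return path
    return None


if __name__ == '__main__':
    # Default checkpoint_dir from cifar10_eval.py.
    print(read_latest_checkpoint_path('/tmp/cifar10_train'))
```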

Upvotes: 0
