Reputation: 897
I am currently in the process of planning my first Conv. NN implementation in Tensorflow, and have been reading many of the tutorials available on Tensorflow's website for insight.
It seems that there are essentially two ways to create a custom CNN:
1) Use Tensorflow layers module tf.layers
, which is the "high-level API". Using this method, you define a model definition function consisting of tf.layers
objects, and in the main function, instantiate a tf.learn.Estimator
, passing the model definition function to it. From here, the fit()
and evaluate()
methods can be called on the Estimator
object, which train and validate, respectively. Link: https://www.tensorflow.org/tutorials/layers. Main function below:
def main(unused_argv):
# Load training and eval data
mnist = learn.datasets.load_dataset("mnist")
train_data = mnist.train.images # Returns np.array
train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
eval_data = mnist.test.images # Returns np.array
eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
# Create the Estimator
mnist_classifier = learn.Estimator(
model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
# Set up logging for predictions
# Log the values in the "Softmax" tensor with label "probabilities"
tensors_to_log = {"probabilities": "softmax_tensor"}
logging_hook = tf.train.LoggingTensorHook(
tensors=tensors_to_log, every_n_iter=50)
# Train the model
mnist_classifier.fit(
x=train_data,
y=train_labels,
batch_size=100,
steps=20000,
monitors=[logging_hook])
# Configure the accuracy metric for evaluation
metrics = {
"accuracy":
learn.MetricSpec(
metric_fn=tf.metrics.accuracy, prediction_key="classes"),
}
# Evaluate the model and print results
eval_results = mnist_classifier.evaluate(
x=eval_data, y=eval_labels, metrics=metrics)
print(eval_results)
Full code here
2) Use Tensorflow's "low-level API" in which layers are defined in a definition function. Here, layers are manually defined, and the user must perform many calculations manually. In the main function, the user starts a tf.Session()
, and manually configures training/validation using for loop(s). Link: https://www.tensorflow.org/get_started/mnist/pros. Main function below:
def main(_):
# Import data
mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
# Create the model
x = tf.placeholder(tf.float32, [None, 784])
# Define loss and optimizer
y_ = tf.placeholder(tf.float32, [None, 10])
# Build the graph for the deep net
y_conv, keep_prob = deepnn(x)
with tf.name_scope('loss'):
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(labels=y_,
logits=y_conv)
cross_entropy = tf.reduce_mean(cross_entropy)
with tf.name_scope('adam_optimizer'):
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
with tf.name_scope('accuracy'):
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
correct_prediction = tf.cast(correct_prediction, tf.float32)
accuracy = tf.reduce_mean(correct_prediction)
graph_location = tempfile.mkdtemp()
print('Saving graph to: %s' % graph_location)
train_writer = tf.summary.FileWriter(graph_location)
train_writer.add_graph(tf.get_default_graph())
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for i in range(20000):
batch = mnist.train.next_batch(50)
if i % 100 == 0:
train_accuracy = accuracy.eval(feed_dict={
x: batch[0], y_: batch[1], keep_prob: 1.0})
print('step %d, training accuracy %g' % (i, train_accuracy))
train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
print('test accuracy %g' % accuracy.eval(feed_dict={
x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
Full code here
My dilemma is, I like the simplicity of defining the neural network using tf.layers
(option 1), but I want the customizability of the training that the "low-level API" (option 2) provides. Specifically, when using the tf.layers
implementation, is there a way to report validation accuracy every n iterations of training? Or more generally, can I train/validate using the tf.Session()
, or am I confined to using the tf.learn.Estimator
's fit()
and evaluate()
methods?
It seems odd that one would want a final evaluation score after all training is complete, as I thought the whole point of validation is to track network progression during training. Otherwise, what would be the difference between validation and testing?
Any help would be appreciated.
Upvotes: 7
Views: 4213
Reputation: 970
You're nearly right however tf.layers
is separate from the Estimator
class of functions etc. If you wanted to you could use tf.Layers to define your layers but then build your own training loops or whatever else you like. You can think of tf.Layers
just being those functions that you could create in your second option above.
If you are interested in being able to build up a basic model quickly but being able to extend it with other functions, your own training loops etc. then there's no reason you can't use layers to build your model and interact with it however you wish.
tf.Layers
- https://www.tensorflow.org/api_docs/python/tf/layers
tf.Estimator
- https://www.tensorflow.org/api_docs/python/tf/estimator
Upvotes: 4