Reputation: 1026
I wish to run the training phase of my tensorflow code on my GPU while after I finish and store the results to load the model I created and run its test phase on CPU.
I have created this code (I have put a part of it, just for reference because it's huge otherwise, I know that the rules are to include a fully functional code and I apologise about that).
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.contrib.rnn.python.ops import rnn_cell, rnn
# Import MNIST data http://yann.lecun.com/exdb/mnist/
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
x_train = mnist.train.images
# Check that the dataset contains 55,000 rows and 784 columns
N,D = x_train.shape
tf.reset_default_graph()
sess = tf.InteractiveSession()
x = tf.placeholder("float", [None, n_steps,n_input])
y_true = tf.placeholder("float", [None, n_classes])
keep_prob = tf.placeholder(tf.float32,shape=[])
learning_rate = tf.placeholder(tf.float32,shape=[])
#[............Build the RNN graph model.............]
sess.run(tf.global_variables_initializer())
# Because I am using my GPU for the training, I avoid allocating the whole
# mnist.validation set because of memory error, so I gragment it to
# small batches (100)
x_validation_bin, y_validation_bin = mnist.validation.next_batch(batch_size)
x_validation_bin = binarize(x_validation_bin, threshold=0.1)
x_validation_bin = x_validation_bin.reshape((-1,n_steps,n_input))
for k in range(epochs):
steps = 0
for i in range(training_iters):
#Stochastic descent
batch_x, batch_y = mnist.train.next_batch(batch_size)
batch_x = binarize(batch_x, threshold=0.1)
batch_x = batch_x.reshape((-1,n_steps,n_input))
sess.run(train_step, feed_dict={x: batch_x, y_true: batch_y,keep_prob: keep_prob,eta:learning_rate})
if do_report_err == 1:
if steps % display_step == 0:
# Calculate batch accuracy
acc = sess.run(accuracy, feed_dict={x: batch_x, y_true: batch_y,keep_prob: 1.0})
# Calculate batch loss
loss = sess.run(total_loss, feed_dict={x: batch_x, y_true: batch_y,keep_prob: 1.0})
print("Iter " + str(i) + ", Minibatch Loss= " + "{:.6f}".format(loss) + ", Training Accuracy = " + "{:.5f}".format(acc))
steps += 1
# Validation Accuracy and Cost
validation_accuracy = sess.run(accuracy,feed_dict={x:x_validation_bin, y_true:y_validation_bin, keep_prob:1.0})
validation_cost = sess.run(total_loss,feed_dict={x:x_validation_bin, y_true:y_validation_bin, keep_prob:1.0})
validation_loss_array.append(final_validation_cost)
validation_accuracy_array.append(final_validation_accuracy)
saver.save(sess, savefilename)
total_epochs = total_epochs + 1
np.savez(datasavefilename,epochs_saved = total_epochs,learning_rate_saved = learning_rate,keep_prob_saved = best_keep_prob, validation_loss_array_saved = validation_loss_array,validation_accuracy_array_saved = validation_accuracy_array,modelsavefilename = savefilename)
After that, my model has been trained successfully and saved the relevant data, so I wish to load the file and do a final train and test part in the model but using my CPU this time. The reason is the GPU can't handle the whole dataset of mnist.train.images and mnist.train.labels.
So, manually I select this part and I run it:
with tf.device('/cpu:0'):
# Initialise variables
sess.run(tf.global_variables_initializer())
# Accuracy and Cost
saver.restore(sess, savefilename)
x_train_bin = binarize(mnist.train.images, threshold=0.1)
x_train_bin = x_train_bin.reshape((-1,n_steps,n_input))
final_train_accuracy = sess.run(accuracy,feed_dict={x:x_train_bin, y_true:mnist.train.labels, keep_prob:1.0})
final_train_cost = sess.run(total_loss,feed_dict={x:x_train_bin, y_true:mnist.train.labels, keep_prob:1.0})
x_test_bin = binarize(mnist.test.images, threshold=0.1)
x_test_bin = x_test_bin.reshape((-1,n_steps,n_input))
final_test_accuracy = sess.run(accuracy,feed_dict={x:x_test_bin, y_true:mnist.test.labels, keep_prob:1.0})
final_test_cost = sess.run(total_loss,feed_dict={x:x_test_bin, y_true:mnist.test.labels, keep_prob:1.0})
But I get an OMM GPU memory error, which it doesn't make sense to me since I think I have forced the program to rely on CPU. I did not put a command sess.close() in the first (training with batches) code, but I am not sure if this really the reason behind it. I followed this post actually for the CPU Any suggestions how to run the last part on CPU only?
Upvotes: 3
Views: 2384
Reputation: 5206
with tf.device()
statements only apply to graph building, not to execution, so doing sess.run
inside a device block is equivalent to not having the device at all.
To do what you want to do you need to build separate training and test graphs, which share variables.
Upvotes: 4