Kendall Weihe

Reputation: 2075

Tensorflow save model: GraphDef cannot be larger than 2GB

I'm getting the following error, apparently at the time of saving my model:

Step = 1799  |  Tensorflow Accuracy = 1.0
Step = 1799  |  My Accuracy = 0.0363355780022
Step = 1800  |  Tensorflow Accuracy = 1.0
Step = 1800  |  My Accuracy = 0.0364694929089
Traceback (most recent call last):
  File "CNN-LSTM-seg-reg-sigmoid.py", line 290, in <module>
    saver.save(sess, save_path)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1085, in save
    self.export_meta_graph(meta_graph_filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1103, in export_meta_graph
    add_shapes=True),
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2175, in as_graph_def
    result, _ = self._as_graph_def(from_version, add_shapes)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2138, in _as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

Here it was suggested to look out for tf.constants, but I have zero constants in my program. However, my weights and biases look like the following: tf.Variable(tf.random_normal([32]), name="bc1"). Could this be an issue?

If not that, then this tells me that somewhere I am adding to the graph on every loop iteration, but I'm unsure where it is occurring.

My first guess is when I make predictions. I make predictions via the following code...

# Make prediction
im = Image.open('/home/volcart/Documents/Data/input_crops/temp data0001.tif')
batch_x = np.array(im)
batch_x = batch_x.reshape((1, n_input_x, n_input_y))
batch_x = batch_x.astype(float)
prediction = sess.run(pred, feed_dict={x: batch_x})
prediction = tf.sigmoid(prediction.reshape((n_input_x * n_input_y, n_classes)))
prediction = prediction.eval().reshape((n_input_x, n_input_y, n_classes))

My second guess is when I calculate loss and accuracy via the following:

loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x, y: batch_y})
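
One way I could test both guesses is to count the nodes in the default graph at the top of each iteration; if the count climbs every step, something in the loop is adding ops. A quick diagnostic sketch (the op-count printout is an addition for debugging, not part of my code above):

# Count the ops in the default graph each iteration; if this number
# grows every step, the loop is adding nodes to the graph.
n_ops = len(tf.get_default_graph().get_operations())
print "Step = " + str(step) + "  |  Graph ops = " + str(n_ops)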

My entire session code looks like the following:

# Initializing the variables
init = tf.initialize_all_variables()
saver = tf.train.Saver()

gpu_options = tf.GPUOptions()
config = tf.ConfigProto(gpu_options=gpu_options)
config.gpu_options.allow_growth = True

# Launch the graph
with tf.Session(config=config) as sess:
    sess.run(init)
    summary = tf.train.SummaryWriter('/tmp/logdir/', sess.graph) #initialize graph for tensorboard
    step = 1
    # Import data
    data = scroll_data.read_data('/home/volcart/Documents/Data/')
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        batch_x, batch_y = data.train.next_batch(batch_size)
        # Run optimization op (backprop)
        batch_x = batch_x.reshape((batch_size, n_input_x, n_input_y))
        batch_y = batch_y.reshape((batch_size, n_input_x, n_input_y))
        batch_y = convert_to_2_channel(batch_y, batch_size)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        step = step + 1

        loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                          y: batch_y})


        # Make prediction
        im = Image.open('/home/volcart/Documents/Data/input_crops/temp data0001.tif')
        batch_x = np.array(im)
        batch_x = batch_x.reshape((1, n_input_x, n_input_y))
        batch_x = batch_x.astype(float)
        prediction = sess.run(pred, feed_dict={x: batch_x})
        prediction = tf.sigmoid(prediction.reshape((n_input_x * n_input_y, n_classes)))
        prediction = prediction.eval().reshape((n_input_x, n_input_y, n_classes))

        # Temp arrays are to splice the prediction (n_input_x x n_input_y x 2)
        # into 2 matrices of shape n_input_x x n_input_y
        temp_arr1 = np.empty((n_input_x, n_input_y))
        for i in xrange(n_input_x):
            for j in xrange(n_input_y):
                for k in xrange(n_classes):
                    if k == 0:
                        temp_arr1[i][j] = 1 - prediction[i][j][k]

        my_acc = accuracy_custom(temp_arr1,batch_y[0,:,:,0])

        print "Step = " + str(step) + "  |  Tensorflow Accuracy = " + str(acc)
        print "Step = " + str(step) + "  |  My Accuracy = " + str(my_acc)

        if step % 100 == 0:
            save_path = "/home/volcart/Documents/CNN-LSTM-reg-model/CNN-LSTM-seg-step-" + str(step) + "-model.ckpt"
            saver.save(sess, save_path)
            csv_file = "/home/volcart/Documents/CNN-LSTM-reg/CNNLSTMreg-step-" + str(step) + "-accuracy-" + str(my_acc) + ".csv"
            np.savetxt(csv_file, temp_arr1, delimiter=",")

Upvotes: 1

Views: 1996

Answers (2)

Soroosh Shalileh

Reputation: 11

You can rewrite the following line of your code using a tf.placeholder:

prediction = tf.sigmoid(prediction.reshape((n_input_x * n_input_y, n_classes)))

This will solve the issue.
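
For example, something along these lines (a sketch, reusing n_input_x, n_input_y, and n_classes from your code; pred_placeholder and sigmoid_op are names I made up):

# Build the sigmoid op once, before the training loop:
pred_placeholder = tf.placeholder(tf.float32, [n_input_x * n_input_y, n_classes])
sigmoid_op = tf.sigmoid(pred_placeholder)

# Inside the loop, feed the numpy prediction through the existing op
# instead of creating a new tf.sigmoid node on every iteration:
prediction = sess.run(pred, feed_dict={x: batch_x})
prediction = prediction.reshape((n_input_x * n_input_y, n_classes))
prediction = sess.run(sigmoid_op, feed_dict={pred_placeholder: prediction})
prediction = prediction.reshape((n_input_x, n_input_y, n_classes))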

Upvotes: 1

Yaroslav Bulatov

Reputation: 57893

You are growing your graph at this line:

prediction = tf.sigmoid(prediction.reshape((n_input_x * n_input_y, n_classes)))

This converts your prediction numpy array to a TensorFlow constant node, inlines it into the graph, and adds a Sigmoid node on top of that. Doing this on every loop iteration is what makes the graph grow past the 2GB limit.

You can catch problems like this by adding tf.get_default_graph().finalize() before starting your training loop.
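
For example (a sketch based on the session code above; once the graph is finalized, any op created inside the loop raises an error immediately, pointing at the offending line):

with tf.Session(config=config) as sess:
    sess.run(init)
    tf.get_default_graph().finalize()  # freeze the graph here

    # From this point on, any call that tries to add a node, such as
    # the tf.sigmoid(...) inside the loop, fails fast with
    # "Graph is finalized and cannot be modified." instead of
    # silently growing the graph until the 2GB limit is hit.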

Upvotes: 2
