luke035

Reputation: 342

TensorFlow load model error

I'm trying to load a previously saved TensorFlow model (both graph and variables).

Here's how I exported the model during the training session:

import tensorflow as tf
from tensorflow.python.saved_model import builder as saved_model_builder

# x, W, b, y_, sess, train_df, batch_size and EXPORT_DIR are assumed to be
# defined earlier (the model definition is omitted here).
tf.global_variables_initializer().run()
y = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

for batch_index in range(batch_size):
    batch_xs, batch_ys = sample_dataframe(train_df, N=batch_size)
    #print(batch_xs.shape)
    #print(batch_ys.shape)
    sess.run(train_step, feed_dict = {x: batch_xs, y_:batch_ys})

    if batch_index % 100 == 0:
        print("Batch "+str(batch_index))
        correct_predictions = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
        accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
        print("Accuracy: "+str(sess.run(accuracy,
                                                  feed_dict = {x: batch_xs, y_: batch_ys})))
        #print("Predictions "+str(y))
        #print("Training accuracy: %.1f%%" %accuracy())
    if batch_index + 1 == batch_size:
        #Save the trained model
        print("Exporting trained model")
        builder = saved_model_builder.SavedModelBuilder(EXPORT_DIR)
        builder.add_meta_graph_and_variables(sess, ['simple-MNIST'])
        builder.save(as_text=True)

Please ignore how the model is defined (it's just a toy example) and look only at the last lines, where the save method is invoked. Everything went fine and the model was correctly saved to the filesystem.

When I try to load the exported model I always get the following error:

TypeError: Can not convert a MetaGraphDef into a Tensor or Operation.

Here's how I load the model:

with tf.Session() as sess:
  print(tf.saved_model.loader.maybe_saved_model_directory(export_dir))
  saved_model = tf.saved_model.loader.load(sess, ['simple-MNIST'], export_dir)

  sess.run(saved_model)

Any idea how to solve this? It seems that the model has been exported in the wrong format, but I can't figure out how to change it.

Here's a simple script for loading the model and scoring it:

with tf.device("/cpu:0"):
  x = tf.placeholder(tf.float32, shape =(batch_size, 784))
  W = tf.Variable(tf.truncated_normal(shape=(784, 10), stddev=0.1))
  b = tf.Variable(tf.zeros([10]))
  y_ = tf.placeholder(tf.float32, shape=(batch_size, 10))

with tf.Session() as sess:
  tf.global_variables_initializer().run()

  print(tf.saved_model.loader.maybe_saved_model_directory(export_dir))
  saved_model = tf.saved_model.loader.load(sess, ['simple-MNIST'], export_dir)

  batch_xs, batch_ys = sample_dataframe(train_df, N=batch_size)
  y = tf.matmul(x, W) + b
  correct_predictions = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
  accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))

  print("Test Accuracy: " + str(sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys})))

Running this script in a brand-new Python context scores the model with a very low accuracy (it seems the load method hasn't correctly restored the graph variables).

Thank you!

Upvotes: 2

Views: 1149

Answers (1)

kafman

Reputation: 2860

I think the problem is that you cannot pass saved_model into sess.run. From the documentation of saved_model.loader.load:

Returns: The MetaGraphDef protocol buffer loaded in the provided session. This can be used to further extract signature-defs, collection-defs, etc.

So, what exactly would you expect from sess.run(saved_model) when saved_model is a MetaGraphDef? If I've understood the mechanics of load correctly, the graph as well as the associated variables are restored in the session you pass to load(..), and hence your model is ready to use once load(..) has finished. So you should be able to access variables, ops and tensors through the (default) graph as usual, and there is no need to further deal with the returned MetaGraphDef object.
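
To illustrate the intended usage, here is a minimal sketch (the export path and the tensor names are hypothetical; fetching by name only works if the corresponding ops were given explicit names before exporting):

import numpy as np
import tensorflow as tf

export_dir = '/tmp/simple-mnist-export'  # hypothetical path

with tf.Session() as sess:
    # load(..) restores the graph and the variable values into this session.
    # The returned MetaGraphDef is metadata only and is not meant for sess.run.
    tf.saved_model.loader.load(sess, ['simple-MNIST'], export_dir)

    graph = tf.get_default_graph()
    # Hypothetical names: assumes the input placeholder and the output op were
    # created with name='x' and name='y' respectively before exporting.
    x = graph.get_tensor_by_name('x:0')
    y = graph.get_tensor_by_name('y:0')

    dummy_batch = np.zeros((100, 784), dtype=np.float32)
    print(sess.run(y, feed_dict={x: dummy_batch}))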

Here is more information about what a MetaGraphDef is: What is the TensorFlow checkpoint meta file? From this it should be clear that it does not make sense to use it with sess.run().

Edit

Following up on your edit: the function tf.saved_model.loader.load internally calls tf.import_meta_graph followed by saver.restore, i.e. it restores both the graph and the values of the variables present in the graph. Thus, it should not be necessary for you to redefine the variables yourself at the beginning of the code snippet you added. In fact, doing so may cause undefined behavior, since some nodes might then exist twice in the default graph. Check this Stack Overflow post for more info: Restoring Tensorflow model and viewing variable value. So my guess as to what is happening here: the inference step uses the untrained variable W that you created manually instead of the pretrained one you load through saved_model.loader, which is why you see the low accuracy.

So, my guess is that if you omit the definitions of x, W, b and y_ at the beginning and instead retrieve them from the restored graph, e.g. by calling tf.get_default_graph().get_tensor_by_name('variable_name'), it should work fine.
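
A rough version of your scoring script along those lines could look like this (again, the ':0' names are hypothetical and only resolve if the ops were built with explicit name= arguments before exporting):

with tf.Session() as sess:
    # No manual variable definitions and no initializer:
    # load(..) restores both the graph and the trained variable values.
    tf.saved_model.loader.load(sess, ['simple-MNIST'], export_dir)

    graph = tf.get_default_graph()
    x = graph.get_tensor_by_name('x:0')    # hypothetical name
    y_ = graph.get_tensor_by_name('y_:0')  # hypothetical name
    y = graph.get_tensor_by_name('y:0')    # hypothetical name

    correct_predictions = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))

    batch_xs, batch_ys = sample_dataframe(train_df, N=batch_size)
    print("Test Accuracy: " + str(sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys})))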

PS: If you are restoring a model, there is no need to run the initializer (though I assume it doesn't hurt either).

PPS: In your script you are computing the accuracy 'by hand', but I would assume that this operation is already present in the model, as it was most likely needed during training as well, no? So instead of computing the accuracy by hand again, you could just fetch the respective node from the graph and use that.
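
For example, assuming the accuracy op had been given an explicit name at build time (a hypothetical sketch):

# Only works if the training code created the op as, e.g.,
# accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32), name='accuracy')
accuracy = tf.get_default_graph().get_tensor_by_name('accuracy:0')
print(sess.run(accuracy, feed_dict={x: batch_xs, y_: batch_ys}))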

Upvotes: 2
