Reputation: 2415
I'm trying to create a single model out of two almost identical models trained under different conditions, and to average their outputs inside TensorFlow. We want the final model to have the same interface for inference.
We have saved a checkpoint of each of the two models, and here is how we are trying to solve the problem:
merged_graph = tf.Graph()
with merged_graph.as_default():
    saver1 = tf.train.import_meta_graph('path_to_checkpoint1_model1.meta', import_scope='g1')
    saver2 = tf.train.import_meta_graph('path_to_checkpoint1_model2.meta', import_scope='g2')

with tf.Session(graph=merged_graph) as sess:
    saver1.restore(sess, 'path_to_checkpoint1_model1')
    saver2.restore(sess, 'path_to_checkpoint1_model2')
    sess.run(tf.global_variables_initializer())

    # export as a saved_model
    builder = tf.saved_model.builder.SavedModelBuilder(kPathToExportDir)
    builder.add_meta_graph_and_variables(sess,
                                         [tf.saved_model.tag_constants.SERVING],
                                         strip_default_attrs=True)
    builder.save()
There are at least 3 flaws with the above approach, and we have tried many routes but can't get this to work:

1. Loading the exported model back fails with:

   Expected exactly one main op in : model
   Expected exactly one SavedModel main op. Found: [u'g1/group_deps', u'g2/group_deps']

2. The two models have their own Placeholder nodes for input (i.e. g1/Placeholder and g2/Placeholder after merging). We couldn't find a way to remove the Placeholder nodes and create a new one that feeds input to both models (we don't want a new interface where we need to feed data into two different placeholders).

3. The two graphs have their own init_all and restore_all nodes. We couldn't figure out how to combine these NoOp operations into single nodes. This is the same kind of problem as #1.

We also couldn't find a sample implementation of such model ensembling inside TensorFlow. Sample code might answer all the above questions.

Note: My two models were trained using tf.estimator.Estimator and exported as saved_models. As a result, they contain the main_op.
Upvotes: 1
Views: 3283
Reputation: 2415
I did not solve the problem directly, but found a workaround for it.
The main problem is that a main_op node is added whenever a model is exported with the saved_model API. Since both of my models were exported with this API, both had a main_op node, which would be imported into the new graph. The new graph would then contain two main_ops, which later fail to load because exactly one main op is expected.
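You can verify this by parsing the exported SavedModel's MetaGraphDef directly. Here is a minimal diagnostic sketch (assuming TF 1.x and the kPathToExportDir used in the question); the duplicated ops show up under the 'saved_model_main_op' collection key:

import os
import tensorflow as tf
from tensorflow.core.protobuf import saved_model_pb2

# Parse the saved_model.pb of the merged export and print the main op
# collection of each MetaGraphDef. kPathToExportDir is the export path
# used by the SavedModelBuilder above.
sm = saved_model_pb2.SavedModel()
with open(os.path.join(kPathToExportDir, 'saved_model.pb'), 'rb') as f:
    sm.ParseFromString(f.read())

for mg in sm.meta_graphs:
    main_ops = mg.collection_def.get(tf.saved_model.constants.MAIN_OP_KEY)
    if main_ops is not None:
        # prints two entries here, e.g. [u'g1/group_deps', u'g2/group_deps']
        print(list(main_ops.node_list.value))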
The workaround I chose was not to export my final model with the saved_model API, but to export it with the old handy freeze_graph into a single .pb file.
Here is my working code snippet:
import os
import tensorflow as tf
from tensorflow.python.tools import freeze_graph

# set some constants:
# INPUT_SHAPE, OUTPUT_NODE_NAME, OUTPUT_FILE_NAME,
# TEMP_DIR, TEMP_NAME, SCOPE_PREPEND_NAME, EXPORT_DIR

# Set paths for trained models which were exported with the saved_model API
input_model_paths = [PATH_TO_MODEL1,
                     PATH_TO_MODEL2,
                     PATH_TO_MODEL3, ...]
num_model = len(input_model_paths)

def load_model(sess, path, scope, input_node):
    # Load one saved_model into the current graph under its own scope,
    # remapping its original Placeholder to the shared input node
    tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING],
                               path,
                               import_scope=scope,
                               input_map={"Placeholder": input_node})
    output_tensor = tf.get_default_graph().get_tensor_by_name(
        scope + "/" + OUTPUT_NODE_NAME + ":0")
    return output_tensor

with tf.Session(graph=tf.Graph()) as sess:
    new_input = tf.placeholder(dtype=tf.float32,
                               shape=INPUT_SHAPE, name="Placeholder")

    output_tensors = []
    for k, path in enumerate(input_model_paths):
        output_tensors.append(load_model(sess,
                                         path,
                                         SCOPE_PREPEND_NAME + str(k),
                                         new_input))

    # Mix together the outputs (e.g. sum, weighted sum, etc.)
    sum_outputs = output_tensors[0] + output_tensors[1]
    for i in range(2, num_model):
        sum_outputs = sum_outputs + output_tensors[i]
    final_output = tf.divide(sum_outputs, float(num_model), name=OUTPUT_NODE_NAME)

    # Save a checkpoint to be loaded later by freeze_graph!
    saver_checkpoint = tf.train.Saver()
    saver_checkpoint.save(sess, os.path.join(TEMP_DIR, TEMP_NAME))
    tf.train.write_graph(sess.graph_def, TEMP_DIR, TEMP_NAME + ".pbtxt")

freeze_graph.freeze_graph(
    os.path.join(TEMP_DIR, TEMP_NAME + ".pbtxt"),
    "",     # input_saver
    False,  # input_binary (the graph was written as text above)
    os.path.join(TEMP_DIR, TEMP_NAME),
    OUTPUT_NODE_NAME,
    "",     # deprecated
    "",     # deprecated
    os.path.join(EXPORT_DIR, OUTPUT_FILE_NAME),
    False,  # clear_devices
    "")     # initializer_nodes
Upvotes: 0
Reputation: 1318
For question 1, saved_model is not a must.
For question 2, the input_map argument of tf.train.import_meta_graph can be used.
For question 3, you really do not need the restore_all or init_all ops any more.
This code snippet shows how you can combine two graphs and average their outputs in TensorFlow:
import tensorflow as tf

merged_graph = tf.Graph()
with merged_graph.as_default():
    input = tf.placeholder(dtype=tf.float32, shape=WhatEverYourShape)
    # Remap each model's original input to the new shared placeholder
    saver1 = tf.train.import_meta_graph('path_to_checkpoint1_model1.meta', import_scope='g1',
                                        input_map={"YOUR/INPUT/NAME": input})
    saver2 = tf.train.import_meta_graph('path_to_checkpoint1_model2.meta', import_scope='g2',
                                        input_map={"YOUR/INPUT/NAME": input})

    # tensor names need an output index suffix such as ":0"
    output1 = merged_graph.get_tensor_by_name("g1/YOUR/OUTPUT/TENSOR/NAME:0")
    output2 = merged_graph.get_tensor_by_name("g2/YOUR/OUTPUT/TENSOR/NAME:0")
    final_output = (output1 + output2) / 2

with tf.Session(graph=merged_graph) as sess:
    saver1.restore(sess, 'path_to_checkpoint1_model1')
    saver2.restore(sess, 'path_to_checkpoint1_model2')
    # this line should NOT run because it would initialize all variables,
    # so your restore ops would have no effect
    # sess.run(tf.global_variables_initializer())

    final_output_numpy = sess.run(final_output, feed_dict={input: YOUR_NUMPY_INPUT})
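If you want a weighted combination instead of a plain mean, only the mixing line changes. A sketch with hypothetical weights w1 and w2:

# hypothetical weights; any convex combination works the same way
w1, w2 = 0.7, 0.3
final_output = w1 * output1 + w2 * output2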
Upvotes: 0