Reputation: 5093
I have TensorFlow r1.13 training code that saves a SavedModel periodically during a long training run (I am following this excellent article on the topic). I have noticed that the file size increases each time the model is saved. In fact it grows linearly, and each saved_model.pb appears to be a multiple of the initial file size. I wonder if TF is keeping a reference to all previously saved files and accumulating them into each later save. Below are the file sizes for several SavedModel files written in sequence as training progresses.
-rw-rw-r-- 1 ubuntu ubuntu 576962 Apr 15 23:56 ./model_accuracy_0.361/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 1116716 Apr 15 23:58 ./model_accuracy_0.539/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 1656470 Apr 16 00:11 ./model_accuracy_0.811/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 2196440 Apr 16 00:15 ./model_accuracy_0.819/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 2736794 Apr 16 00:17 ./model_accuracy_0.886/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 3277150 Apr 16 00:19 ./model_accuracy_0.908/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 3817530 Apr 16 00:21 ./model_accuracy_0.919/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 4357950 Apr 16 00:25 ./model_accuracy_0.930/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 4898492 Apr 16 00:27 ./model_accuracy_0.937/saved_model.pb
Is there a way to cull out the previous saved versions? Or at least prevent them from being accumulated in the first place? I will certainly only keep the last file, but it seems to be 10x larger than it should be.
Below is my code (largely copied from Silva):
# Create the TensorInfo protobuf objects that encapsulate the input/output tensors
tensor_info_input_data_1 = tf.saved_model.utils.build_tensor_info(gd.data_1)
tensor_info_input_data_2 = tf.saved_model.utils.build_tensor_info(gd.data_2)
tensor_info_input_keep   = tf.saved_model.utils.build_tensor_info(gd.keep)

# Output tensor info
tensor_info_output_pred = tf.saved_model.utils.build_tensor_info(gd.targ_pred_oneh)
tensor_info_output_soft = tf.saved_model.utils.build_tensor_info(gd.targ_pred_soft)

# Define the SignatureDef for this export
prediction_signature = \
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs={
            'data_1': tensor_info_input_data_1,
            'data_2': tensor_info_input_data_2,
            'keep':   tensor_info_input_keep
        },
        outputs={
            'pred_orig': tensor_info_output_pred,
            'pred_soft': tensor_info_output_soft
        },
        method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME)

graph_entry_point_name = "my_model"  # the logical name for the model in TF Serving

try:
    builder = tf.saved_model.builder.SavedModelBuilder(saved_model_path)
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={graph_entry_point_name: prediction_signature}
    )
    builder.save(as_text=False)
    if verbose:
        print("  SavedModel graph written successfully.")
    success = True
except Exception as e:
    print("  WARNING::SavedModel write FAILED.")
    traceback.print_tb(e.__traceback__)
    success = False

return success
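To check whether the graph itself is growing between exports, one diagnostic (a rough sketch using only the default graph's op list, not something from my original code) would be to count the ops right before each save:

# Diagnostic sketch: if these totals climb by roughly the same amount on every
# export, the saves themselves are adding nodes (e.g. new SaveV2/RestoreV2 ops).
graph = tf.get_default_graph()
all_ops = graph.get_operations()
save_ops = [op for op in all_ops if op.type in ("SaveV2", "RestoreV2")]
print("ops in graph: %d (save/restore ops: %d)" % (len(all_ops), len(save_ops)))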
Upvotes: 3
Views: 554
Reputation: 652
Set clear_extraneous_savers=True for the Saver:
meta_graph_def = saver.export_meta_graph(
    clear_devices=clear_devices,
    clear_extraneous_savers=True,
    strip_default_attrs=strip_default_attrs)
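For context, a sketch of where that flag might be used (the output path below is a placeholder, not anything from the question). With clear_extraneous_savers=True, Saver-related ops and SaverDefs that belong to other Saver instances are stripped from the exported MetaGraphDef, so stale save machinery from earlier exports doesn't accumulate:

# Sketch with a hypothetical filename; the clear_extraneous_savers flag is the point.
saver = tf.train.Saver()
meta_graph_def = saver.export_meta_graph(
    "/tmp/my_model.meta",          # hypothetical output path
    clear_devices=True,
    clear_extraneous_savers=True,  # drop Saver ops/SaverDefs from other Savers
    strip_default_attrs=True)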
Upvotes: 0
Reputation: 415
@Hephaestus,
If you're constructing a SavedModelBuilder each time, then it'll add new save operations to the graph every time you save.
Instead, you can construct the SavedModelBuilder only once and just call builder.save repeatedly. This will not add new ops to the graph on each save call.
Alternatively, I think you can create your own tf.train.Saver and pass it to add_meta_graph_and_variables. Then it shouldn't create any new operations.
A good debugging aid is tf.get_default_graph().finalize() once you're done building the graph, which will throw an exception rather than silently expanding the graph like this.
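A rough sketch of that alternative, assuming sess and prediction_signature from the question; the only new piece is building a single tf.train.Saver up front and passing it to add_meta_graph_and_variables through its saver argument, so each export reuses the existing save/restore ops:

# Build the save/restore ops once, up front.
saver = tf.train.Saver()

# Optional: freeze the graph after construction. Any later attempt to add ops
# (for example an export creating a fresh Saver) raises an exception instead
# of silently growing the GraphDef.
tf.get_default_graph().finalize()

def export_model(export_dir):
    builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={"my_model": prediction_signature},
        saver=saver)   # reuse the existing Saver; no new ops should be created
    builder.save()

Here the builder is still created per export directory (each accuracy checkpoint in the question goes to its own path), but the Saver is shared, which is what keeps new save ops from piling up in the GraphDef.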
Hope this helps.
Upvotes: 1