Hephaestus

Reputation: 5093

TensorFlow SavedModel file size increases with each save

I have TensorFlow r1.13 training code that saves a SavedModel periodically during a long training run (I am following this excellent article on the topic). I've noticed that the file size increases with each save; in fact it grows linearly, by an increment roughly equal to the initial file size. I wonder whether TF is keeping references to all of the previously saved graphs and accumulating them into each later save. Below are the sizes of several SavedModel files written in sequence as training progressed.

-rw-rw-r-- 1 ubuntu ubuntu  576962 Apr 15 23:56 ./model_accuracy_0.361/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 1116716 Apr 15 23:58 ./model_accuracy_0.539/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 1656470 Apr 16 00:11 ./model_accuracy_0.811/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 2196440 Apr 16 00:15 ./model_accuracy_0.819/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 2736794 Apr 16 00:17 ./model_accuracy_0.886/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 3277150 Apr 16 00:19 ./model_accuracy_0.908/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 3817530 Apr 16 00:21 ./model_accuracy_0.919/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 4357950 Apr 16 00:25 ./model_accuracy_0.930/saved_model.pb
-rw-rw-r-- 1 ubuntu ubuntu 4898492 Apr 16 00:27 ./model_accuracy_0.937/saved_model.pb

Is there a way to cull the previously saved versions, or better, to prevent them from accumulating in the first place? I will only keep the last file anyway, but it appears to be about 10x larger than it should be.

Below is my code (largely copied from Silva):

        # Create the TensorInfo protos that describe the input tensors
        tensor_info_input_data_1 = tf.saved_model.utils.build_tensor_info(gd.data_1)
        tensor_info_input_data_2 = tf.saved_model.utils.build_tensor_info(gd.data_2)
        tensor_info_input_keep = tf.saved_model.utils.build_tensor_info(gd.keep)

        # output tensor info
        tensor_info_output_pred = tf.saved_model.utils.build_tensor_info(gd.targ_pred_oneh)
        tensor_info_output_soft = tf.saved_model.utils.build_tensor_info(gd.targ_pred_soft)

        # Define the SignatureDef for this export
        prediction_signature = \
            tf.saved_model.signature_def_utils.build_signature_def(
                inputs={
                    'data_1': tensor_info_input_data_1,
                    'data_2': tensor_info_input_data_2,
                    'keep'  : tensor_info_input_keep
                },
                outputs={
                    'pred_orig': tensor_info_output_pred,
                    'pred_soft': tensor_info_output_soft
                },
                method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME)

        graph_entry_point_name = "my_model" # Signature key that TF Serving clients will use

        try:
            builder = tf.saved_model.builder.SavedModelBuilder(saved_model_path)
            builder.add_meta_graph_and_variables(
                sess=sess,
                tags=[tf.saved_model.tag_constants.SERVING],
                signature_def_map={graph_entry_point_name: prediction_signature}
            )
            builder.save(as_text=False)
            if verbose:
                print("  SavedModel graph written successfully.")
            success = True
        except Exception as e:
            print("       WARNING: SavedModel write FAILED.")
            traceback.print_tb(e.__traceback__)
            success = False
        return success

Upvotes: 3

Views: 554

Answers (2)

Deepak Sharma

Reputation: 652

Set clear_extraneous_savers=True on the Saver's export_meta_graph call

https://github.com/tensorflow/tensorflow/blob/b78d23cf92656db63bca1f2cbc9636c7caa387ca/tensorflow/python/saved_model/builder_impl.py#L382

    meta_graph_def = saver.export_meta_graph(
        clear_devices=clear_devices,
        clear_extraneous_savers=True,
        strip_default_attrs=strip_default_attrs)

Upvotes: 0

RakTheGeek

Reputation: 415

@Hephaestus,

If you're constructing a SavedModelBuilder each time, then it'll add new save operations to the graph every time you save.
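You can see that growth directly by counting graph ops: constructing a tf.train.Saver (which add_meta_graph_and_variables does internally when you don't pass one) adds save/restore ops for every variable, every time. A minimal sketch with a toy graph of my own (not your model), written against tf.compat.v1 so it also runs under TF 2.x:

```python
import tensorflow.compat.v1 as tf  # under r1.13, just `import tensorflow as tf`

tf.disable_eager_execution()  # keep graph mode when running under TF 2.x

g = tf.Graph()
with g.as_default():
    v = tf.Variable([1.0, 2.0], name="weights")

    n_before = len(g.get_operations())
    tf.train.Saver()  # what the builder does internally when saver=None
    n_middle = len(g.get_operations())
    tf.train.Saver()  # a second builder repeats it: yet more ops
    n_after = len(g.get_operations())

print(n_before, n_middle, n_after)  # the op count grows with every Saver
```

Every extra set of save ops gets serialized into the next saved_model.pb, which is exactly the linear growth you measured.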

Instead, you can construct SavedModelBuilder only once and just call builder.save repeatedly. This will not add new ops to the graph on each save call.

Alternatively, I think you can create your own tf.train.Saver and pass it to add_meta_graph_and_variables (via its saver argument); then the builder shouldn't create any new save ops.
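Here is a sketch of that approach, under my assumptions (toy graph and temp paths are illustrative, not your code; the saver argument is accepted by recent 1.x releases). Build the Saver once, finalize the graph, and hand the same Saver to each builder; written against tf.compat.v1 so it also runs under TF 2.x:

```python
import os
import tempfile
import tensorflow.compat.v1 as tf  # under r1.13, just `import tensorflow as tf`

tf.disable_eager_execution()  # keep graph mode when running under TF 2.x

g = tf.Graph()
with g.as_default():
    x = tf.placeholder(tf.float32, [None, 4], name="x")
    w = tf.Variable(tf.ones([4, 2]), name="w")
    y = tf.matmul(x, w, name="y")
    init = tf.global_variables_initializer()

    saver = tf.train.Saver()  # build the save/restore ops exactly once
    g.finalize()              # from here on, any op creation raises

sess = tf.Session(graph=g)
sess.run(init)

def export(path):
    """Write a SavedModel without adding any ops to the finalized graph."""
    builder = tf.saved_model.builder.SavedModelBuilder(path)
    sig = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={"x": x}, outputs={"y": y})
    builder.add_meta_graph_and_variables(
        sess,
        tags=[tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: sig},
        saver=saver)  # reuse the one Saver instead of letting the builder make another
    builder.save()

root = tempfile.mkdtemp()
export(os.path.join(root, "export_1"))
export(os.path.join(root, "export_2"))

size_1 = os.path.getsize(os.path.join(root, "export_1", "saved_model.pb"))
size_2 = os.path.getsize(os.path.join(root, "export_2", "saved_model.pb"))
print(size_1, size_2)  # equal: the graph no longer grows between saves
```

Since the graph is frozen and every export serializes the same MetaGraphDef, successive saved_model.pb files stay the same size instead of growing by a fixed increment.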

A good debugging aid is calling tf.get_default_graph().finalize() once you're done building the graph: anything that tries to add ops afterwards will raise an exception instead of silently expanding the graph like this.

Hope this helps.

Upvotes: 1
