Roland Deschain

Reputation: 2870

tensorflow.saved_model.load() takes an extremely long time to execute

I have the following functions to download, store, and load models from the TensorFlow model zoo:

import json
import os

import tensorflow as tf
from tensorflow.keras.utils import get_file

def load_object_detection_model(model_name: str):

    models = load_model_zoo_list()
    model_url = models[model_name]['url']
    model_filename = models[model_name]['filename']

    pretrained_path = os.path.join(os.path.dirname(__file__), "pretrained_models")
    os.makedirs(pretrained_path, exist_ok=True)

    get_file(fname=model_filename, origin=model_url, cache_dir=pretrained_path, cache_subdir='cptr', extract=True)

    loaded_model = tf.saved_model.load(os.path.join(pretrained_path, 'cptr', model_name, "saved_model"))

    return loaded_model

def load_model_zoo_list():
    """

    :return:
    """

    path = os.path.join(os.path.dirname(__file__), "model_zoo.json")
    with open(path, 'r') as f:
        model_zoo_json = json.load(f)

    return model_zoo_json

model_zoo.json

{
  "ssd_mobilenet_v2_320x320_coco17_tpu-8": {
    "url": "http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz",
    "filename": "ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz"
  }
}

The idea is to simply add more models to the JSON later; ssd_mobilenet_v2_320x320_coco17_tpu-8 was chosen just for testing.
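Extending the zoo is just a matter of adding more entries to the JSON file and reading it back with json.load, as load_model_zoo_list does. A minimal sketch of that round trip, using a temporary directory instead of the real module path:

```python
import json
import os
import tempfile

# The same structure as model_zoo.json; more models are just more top-level keys.
model_zoo = {
    "ssd_mobilenet_v2_320x320_coco17_tpu-8": {
        "url": "http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz",
        "filename": "ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz",
    },
}

# Write the zoo definition to a temp file (stand-in for model_zoo.json).
path = os.path.join(tempfile.mkdtemp(), "model_zoo.json")
with open(path, "w") as f:
    json.dump(model_zoo, f, indent=2)

# Read it back, as load_model_zoo_list would.
with open(path, "r") as f:
    loaded = json.load(f)

name = "ssd_mobilenet_v2_320x320_coco17_tpu-8"
print(loaded[name]["filename"])
```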

The problem is the following: the line loaded_model = tf.saved_model.load(os.path.join(pretrained_path, 'cptr', model_name, "saved_model")) takes around 25-30 seconds to execute. The model is already downloaded at this point, and the saved_model folder is only about 32 MB. I also tested with bigger models, which took even longer. Inference also seems to be much too slow compared to the speeds listed on the model zoo page.

Apart from that, the model seems to work.

What could be the reason for these models being so slow?

Upvotes: 1

Views: 1565

Answers (1)

elbe

Reputation: 1508

Got it! On the first call to the model, the graph is built, so the first call is always slow. I tried your code on Google Colab using a GPU:

model = load_object_detection_model("ssd_mobilenet_v2_320x320_coco17_tpu-8")

%%time
a = model(np.random.randint(0, 255, size=(1, 320, 320, 3)).astype("uint8"))

CPU times: user 4.32 s, sys: 425 ms, total: 4.75 s
Wall time: 4.71 s

%%time
a = model(np.random.randint(0, 255, size=(1, 320, 320, 3)).astype("uint8"))

CPU times: user 124 ms, sys: 18.4 ms, total: 143 ms
Wall time: 85.4 ms

The model zoo page lists 22 ms for this model, but maybe they used a faster GPU.
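Since the one-time cost is paid on the first call, a common pattern is to "warm up" the model on dummy data right after loading, so the first real inference is fast. A minimal sketch of the pattern, using a toy stand-in for the model (warm_up and ToyModel are illustrative helpers, not part of TensorFlow):

```python
import time

def warm_up(model_fn, dummy_input, n_calls=1):
    # Run the model on dummy data so the one-time setup cost
    # (graph building, in TensorFlow's case) is paid up front.
    for _ in range(n_calls):
        model_fn(dummy_input)

class ToyModel:
    """Toy stand-in: the first call is artificially slow, like a traced TF model."""
    def __init__(self):
        self.built = False

    def __call__(self, x):
        if not self.built:
            time.sleep(0.05)  # simulate one-time graph building
            self.built = True
        return x

model = ToyModel()

t0 = time.perf_counter()
warm_up(model, dummy_input=0)   # slow: pays the setup cost
first_call = time.perf_counter() - t0

t0 = time.perf_counter()
model(0)                        # fast: setup already done
second_call = time.perf_counter() - t0
```

With a real SavedModel, the dummy input would be a tensor of the expected shape and dtype, e.g. a random uint8 batch of shape (1, 320, 320, 3) as in the timing cells above.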

Upvotes: 1
