TvE

Reputation: 1116

How can I use a Cloud TPU with TensorFlow Lite Model Maker?

I'm training an object detection model (EfficientDet-Lite) with TensorFlow Lite Model Maker in Colab, and I'd like to use a Cloud TPU. All the images are in a GCS bucket, and I provide a CSV file that lists them. When I call object_detector.create, I get the following error:

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in shape(self)
   1196         # `_tensor_shape` is declared and defined in the definition of
   1197         # `EagerTensor`, in C.
-> 1198         self._tensor_shape = tensor_shape.TensorShape(self._shape_tuple())
   1199       except core._NotOkStatusException as e:
   1200         six.raise_from(core._status_to_exception(e.code, e.message), None)

InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables: Unimplemented: File system scheme '[local]' not implemented (file: '/tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables')

It looks like the Cloud TPU is being asked to read files from the local file system, which it can't access.

The gist of what I'm doing is:

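import tensorflow as tf
from tflite_model_maker import object_detector

# drive_dir, csv_name, MODEL_SPEC, epochs and batch_size are defined elsewhere.
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()

# With a TPU, the CSV holds full gs:// image paths, so images_dir is not needed.
train_data, validation_data, test_data = object_detector.DataLoader.from_csv(
    drive_dir + csv_name,
    images_dir="images" if not tpu else None,
    cache_dir=drive_dir + "cub_cache",
)

spec = MODEL_SPEC(tflite_max_detections=10, strategy='tpu', tpu=tpu.master(), gcp_project="xxx")

# The error is raised inside this call.
model = object_detector.create(train_data=train_data,
                               model_spec=spec,
                               validation_data=validation_data,
                               epochs=epochs,
                               batch_size=batch_size,
                               train_whole_model=True)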

I can't find any example with Model Maker that uses Cloud TPU.

Edit: the error seems to occur when the EfficientDet model is loaded, so Model Maker must somehow be pointing to a local file that the Cloud TPU can't read.

Upvotes: 0

Views: 474

Answers (2)

balu

Reputation: 11

  1. Download the model you want to train from TF Hub (replace X with a value from 0 to 4): https://tfhub.dev/tensorflow/efficientdet/liteX/feature-vector/1
  2. Extract the archive twice (the download is a .tar.gz) until you reach the "keras_metadata.pb" and "saved_model.pb" files and the "variables" folder
  3. Upload these files and the folder to a Google Cloud Storage bucket
  4. Pass the bucket folder's path (in gs:// format) as the uri argument to model_spec.get (see https://www.tensorflow.org/lite/tutorials/model_maker_object_detection)
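
The steps above can be sketched roughly as follows. This is only an illustration: the bucket name, the folder layout and both helper functions are hypothetical, and it assumes that model_spec.get forwards keyword arguments such as uri, strategy, tpu and gcp_project to the EfficientDet-Lite spec, as the question's own spec call suggests.

```python
def hub_model_gcs_uri(bucket: str, variant: int,
                      prefix: str = "tfhub/efficientdet") -> str:
    """Build the gs:// folder that holds saved_model.pb and variables/.

    The bucket and prefix are placeholders; use wherever you uploaded
    the extracted TF Hub files in step 3.
    """
    if not 0 <= variant <= 4:
        raise ValueError("variant must be between 0 and 4")
    return f"gs://{bucket}/{prefix.strip('/')}/lite{variant}"


def build_tpu_spec(uri: str, variant: int, tpu_address: str, gcp_project: str):
    """Create an EfficientDet-Lite spec that loads its weights from GCS."""
    # Imported lazily so the URI helper works without Model Maker installed.
    from tflite_model_maker import model_spec

    # Assumption: extra keyword arguments are passed through to the spec.
    return model_spec.get(
        f"efficientdet_lite{variant}",
        uri=uri,
        strategy="tpu",
        tpu=tpu_address,
        gcp_project=gcp_project,
    )
```

In the question's notebook this would replace the MODEL_SPEC(...) line, for example build_tpu_spec(hub_model_gcs_uri("my-bucket", 0), 0, tpu.master(), "xxx"), so that the weights are read from the bucket rather than from /tmp/tfhub_modules.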

Upvotes: 0

Allen Wang

Reputation: 301

Yeah, the error comes from TF Hub, and it seems to be a well-known issue. TF Hub tries to load the model through a local cache, which the TPU can't access (and which Colab doesn't even provide). See https://github.com/tensorflow/hub/issues/604, which should get you past this error.
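
As a rough sketch of how the local cache can be avoided: TF Hub reads two environment variables, TFHUB_MODEL_LOAD_FORMAT and TFHUB_CACHE_DIR, and either one can keep model files off the local disk. The helper below and its bucket folder name are hypothetical, and whether Model Maker picks these settings up in your setup is an assumption to verify. The variables have to be set before the first model is loaded.

```python
import os
from typing import Optional


def configure_tfhub_for_tpu(cache_bucket: Optional[str] = None) -> dict:
    """Set TF Hub env vars so model files are read from GCS, not local /tmp."""
    if cache_bucket is None:
        # Read the uncompressed model directly from remote storage,
        # skipping the download-and-extract step into a local cache.
        settings = {"TFHUB_MODEL_LOAD_FORMAT": "UNCOMPRESSED"}
    else:
        # Keep the default format but put the cache in a bucket that the
        # TPU workers can reach. The folder name is a placeholder.
        settings = {"TFHUB_CACHE_DIR": f"gs://{cache_bucket}/tfhub_cache"}
    os.environ.update(settings)
    return settings
```

Calling configure_tfhub_for_tpu() at the top of the notebook selects the uncompressed format; configure_tfhub_for_tpu("my-bucket") points the cache at a bucket instead, in which case the TPU needs access to that bucket.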

Upvotes: 1
