Reputation: 1116
I'm training an object detection model (EfficientDet-Lite) with TensorFlow Lite Model Maker in Colab, and I'd like to use a Cloud TPU. All the images are in a GCS bucket, and I provide a CSV file that lists them. When I call object_detector.create, I get the following error:
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in shape(self)
1196 # `_tensor_shape` is declared and defined in the definition of
1197 # `EagerTensor`, in C.
-> 1198 self._tensor_shape = tensor_shape.TensorShape(self._shape_tuple())
1199 except core._NotOkStatusException as e:
1200 six.raise_from(core._status_to_exception(e.code, e.message), None)
InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on /tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables: Unimplemented: File system scheme '[local]' not implemented (file: '/tmp/tfhub_modules/db7544dcac01f8894d77bea9d2ae3c41ba90574c/variables/variables')
It looks like it's trying to read local files from the Cloud TPU, which doesn't work.
The gist of what I'm doing is:
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
train_data, validation_data, test_data = object_detector.DataLoader.from_csv(
    drive_dir + csv_name,
    images_dir="images" if not tpu else None,
    cache_dir=drive_dir + "cub_cache",
)
spec = MODEL_SPEC(tflite_max_detections=10, strategy='tpu', tpu=tpu.master(), gcp_project="xxx")
model = object_detector.create(
    train_data=train_data,
    model_spec=spec,
    validation_data=validation_data,
    epochs=epochs,
    batch_size=batch_size,
    train_whole_model=True,
)
I can't find any Model Maker example that uses a Cloud TPU.
Edit: the error seems to occur when the EfficientDet model is loaded, so Model Maker must somehow be pointing to a local file that the Cloud TPU can't read?
Upvotes: 0
Views: 474
Reputation: 11
Upvotes: 0
Reputation: 301
Yes, the error comes from TF Hub, and it seems to be a well-known problem. When TF Hub loads a model it tries to use a local cache (the /tmp/tfhub_modules path in your error), which the TPU has no access to (and which Colab doesn't provide to it either). Check out https://github.com/tensorflow/hub/issues/604, which should get you past this error.
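As a rough sketch of the kind of workaround this leads to: TF Hub reads two environment variables, TFHUB_MODEL_LOAD_FORMAT and TFHUB_CACHE_DIR, that let you keep the model off the local disk. The helper name configure_tfhub_for_tpu and the bucket path in the comment are hypothetical, and I'm assuming (not verified) that Model Maker loads EfficientDet through TF Hub after these variables are set, so run this before creating the model spec.

```python
import os


def configure_tfhub_for_tpu(cache_dir=None):
    """Point TF Hub away from the local /tmp cache (hypothetical helper).

    Call this before anything loads a TF Hub module.
    """
    if cache_dir is not None:
        # Option A: keep the TF Hub cache in a location the TPU can read,
        # e.g. a GCS bucket you own ("gs://my-bucket/tfhub_cache" is a
        # placeholder, not a real bucket).
        os.environ["TFHUB_CACHE_DIR"] = cache_dir
    else:
        # Option B: ask TF Hub to read the uncompressed model directly
        # from remote storage instead of downloading it locally.
        os.environ["TFHUB_MODEL_LOAD_FORMAT"] = "UNCOMPRESSED"


# Assumption: option B is enough for this case; switch to option A if not.
configure_tfhub_for_tpu()
```

If you go with the cache-directory option, the bucket has to be readable by the TPU's service account, just like the bucket that holds your images.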
Upvotes: 1