dendog

Reputation: 3338

Run a Vertex AI two tower model locally

I have successfully trained a Two Tower model on Google Vertex AI as per the guide here.

I would now like to download the model and try some inference locally on my own machine. I have been battling various errors for a while and am now stuck on the following:

Code:

import tensorflow as tf
import tensorflow_text


load_options = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')
tf.saved_model.load('model_path', options=load_options)

Error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in _get_op_def(self, type)
   3957     try:
-> 3958       return self._op_def_cache[type]
   3959     except KeyError:

KeyError: 'IO>DecodeJSON'

During handling of the above exception, another exception occurred:

NotFoundError                             Traceback (most recent call last)
~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in load_internal(export_dir, tags, options, loader_cls, filters)
    905         loader = loader_cls(object_graph_proto, saved_model_proto, export_dir,
--> 906                             ckpt_options, filters)
    907       except errors.NotFoundError as err:

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in __init__(self, object_graph_proto, saved_model_proto, export_dir, ckpt_options, filters)
    133         function_deserialization.load_function_def_library(
--> 134             meta_graph.graph_def.library))
    135     self._checkpoint_options = ckpt_options

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/saved_model/function_deserialization.py in load_function_def_library(library, load_shared_name_suffix)
    357     with graph.as_default():
--> 358       func_graph = function_def_lib.function_def_to_graph(copy)
    359     _restore_gradient_functions(func_graph, renamed_functions)

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/framework/function_def_to_graph.py in function_def_to_graph(fdef, input_shapes)
     63   graph_def, nested_to_flat_tensor_name = function_def_to_graph_def(
---> 64       fdef, input_shapes)
     65 

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/framework/function_def_to_graph.py in function_def_to_graph_def(fdef, input_shapes)
    227     else:
--> 228       op_def = default_graph._get_op_def(node_def.op)  # pylint: disable=protected-access
    229 

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/framework/ops.py in _get_op_def(self, type)
   3962         pywrap_tf_session.TF_GraphGetOpDef(self._c_graph, compat.as_bytes(type),
-> 3963                                            buf)
   3964         # pylint: enable=protected-access

NotFoundError: Op type not registered 'IO>DecodeJSON' in binary running on 192.168.1.105. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
<ipython-input-2-39fe5910a28b> in <module>
      5 
      6 load_options = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')
----> 7 tf.saved_model.load('query_model/20220219125209', options=load_options)

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in load(export_dir, tags, options)
    867     ValueError: If `tags` don't match a MetaGraph in the SavedModel.
    868   """
--> 869   return load_internal(export_dir, tags, options)["root"]
    870 
    871 

~/.pyenv/versions/3.7.8/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py in load_internal(export_dir, tags, options, loader_cls, filters)
    907       except errors.NotFoundError as err:
    908         raise FileNotFoundError(
--> 909             str(err) + "\n If trying to load on a different device from the "
    910             "computational device, consider using setting the "
    911             "`experimental_io_device` option on tf.saved_model.LoadOptions "

FileNotFoundError: Op type not registered 'IO>DecodeJSON' in binary running on 192.168.1.105. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
 If trying to load on a different device from the computational device, consider using setting the `experimental_io_device` option on tf.saved_model.LoadOptions to the io_device such as '/job:localhost'.

The issue seems to be that the model was trained using ALBERT-base, so some extra ops and packages are needed for it to run; that is why I import tensorflow_text. I have also tried to import tensorflow_io, but I receive an error just trying to load the package, stating that an S3 filesystem has already been registered.

Any help would be greatly appreciated!

Upvotes: 2

Views: 557

Answers (1)

Zhimo Shen

Reputation: 31

The two tower model is trained with TensorFlow 2.3 and tensorflow-io 0.15.0. You need to use the correct versions, otherwise the model can't be loaded. Also, import tensorflow_io before you actually load the model.
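In practice, that means pinning the environment to the versions the model was trained with and importing tensorflow_io (and tensorflow_text, since the model uses ALBERT text ops) before calling tf.saved_model.load, so custom ops such as 'IO>DecodeJSON' are registered before the graph is deserialized. A minimal sketch, assuming the model directory from the question's traceback and the exact version pins below:

```python
# Assumed environment, matching the training versions stated above:
#   pip install tensorflow==2.3.0 tensorflow-io==0.15.0 tensorflow-text==2.3.0
import tensorflow as tf
import tensorflow_io    # importing this registers custom ops such as 'IO>DecodeJSON'
import tensorflow_text  # registers the text ops used by the ALBERT tower

# Load on a different device than the one the model was saved from
load_options = tf.saved_model.LoadOptions(experimental_io_device='/job:localhost')
model = tf.saved_model.load('query_model/20220219125209', options=load_options)
```

With mismatched versions the tensorflow_io import itself can fail (e.g. the "S3 filesystem has already been registered" error from the question), which is why the pins matter as much as the import order.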

Upvotes: 3
