Mehran

Reputation: 16841

How to tell Tensorflow Serving to leave my named output untouched?

I want to create a model signature that returns named tensors (using Keras). Here is what I mean by that: when the model is deployed to TF-Serving, I want it to return a JSON response like this:

{
    "predictions": [
        {
            "t3": 19,
            "t1": 76.975174,
            "t2": "cat3"
        },
        {
            "t3": 17,
            "t1": 77.7983246,
            "t2": "cat3"
        }
    ]
}

The important parts are t1, t2, and t3; those names were chosen by me. If I hadn't named them, the returned JSON would have been this:

{
    "predictions": [
        {
            "output_0": 77.5714188,
            "output_1": "cat3",
            "output_2": 17
        },
        {
            "output_0": 80.7243729,
            "output_1": "cat4",
            "output_2": 17
        }
    ]
}

The names output_0, output_1, and output_2 are automatically generated by some component (I'm not sure which one, but I'd guess TF or Keras). I managed to pull this off, but only in specific scenarios.

This is what I have so far:

from tensorflow.keras import layers


class OutputWithNames(layers.Layer):
    def call(self, x):
        # x arrives as a list of output tensors; return a dict so each
        # tensor is exposed under an explicit name in the signature.
        return {"t1": x[0], "t2": x[1], "t3": x[2]}

Adding this custom layer as the last one in my model, I get exactly what I described: a JSON object with the desired property names t1, t2, and t3.
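
For context, here is a minimal sketch of how I wire such a layer into a functional model (the Dense heads are hypothetical placeholders standing in for whatever actually produces t1, t2, and t3):

import tensorflow as tf
from tensorflow.keras import layers


class OutputWithNames(layers.Layer):
    def call(self, x):
        # Map the list of output tensors to explicit signature names.
        return {"t1": x[0], "t2": x[1], "t3": x[2]}


# Hypothetical three-headed model.
inputs = tf.keras.Input(shape=(4,), name="features")
hidden = layers.Dense(8, activation="relu")(inputs)
heads = [layers.Dense(1)(hidden) for _ in range(3)]

named = OutputWithNames()(heads)
model = tf.keras.Model(inputs=inputs, outputs=named)

# Export as a SavedModel version directory for TF Serving (newer Keras
# versions may require model.export("export/1") instead).
model.save("export/1")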

Inspecting the saved model using the saved_model_cli tool, this is what I get as my output signature when the resulting model works as expected:

The given SavedModel SignatureDef contains the following output(s):
  outputs['t1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1)
      name: StatefulPartitionedCall_1:0
  outputs['t2'] tensor_info:
      dtype: DT_STRING
      shape: (-1)
      name: StatefulPartitionedCall_1:1
  outputs['t3'] tensor_info:
      dtype: DT_INT32
      shape: (-1)
      name: StatefulPartitionedCall_1:2
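
For reference, that listing comes from an invocation along these lines (the export path is a placeholder):

saved_model_cli show --dir export/1 --tag_set serve --signature_def serving_default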

And again, so far everything's fine. But when I have only a single output:

from tensorflow.keras import layers


class OutputWithNames(layers.Layer):
    def call(self, x):
        # x is a single tensor here, so there is nothing to index into.
        return {"t1": x}

Note: In the first code block, x is a list of Tensors, since I have multiple outputs; that's why I can (and have to) use x[0]. But in the second code block, x is simply a Tensor, since there's only one output; that's why there's no [0] indexing after x.

This time, TF-Serving will generate this JSON for me:

{
    "predictions": [
        67.2723083,
        68.9468231
    ]
}

While this is what I was expecting to see:

{
    "predictions": [
        {
            "t1": 67.2723083
        },
        {
            "t1": 68.9468231
        }
    ]
}

And this is what the saved_model_cli tool returns for it:

The given SavedModel SignatureDef contains the following output(s):
  outputs['t1'] tensor_info:
      dtype: DT_INT64
      shape: (-1)
      name: StatefulPartitionedCall:0

Basically, the saved model has the output name correctly set. But for some reason, TF-Serving strips it and, instead of returning an array of objects, returns only the bare values.

My question is: how can I force TF-Serving to return a list of objects when there's only a single output?

Upvotes: 4

Views: 497

Answers (1)

Kyle F. Hartzenberg

Reputation: 3680

The answer is that you can't, not without changes to the way TF Serving is implemented. The TF Serving documentation states:

... If the output of the model contains only one named tensor, we omit the name and predictions key maps to a list of scalar or list values. If the model outputs multiple named tensors, we output a list of objects instead, similar to the request in row-format mentioned above. ...

If you don't know beforehand which model you will be using (and thus its expected output), my suggestion would be to:

  1. Parse the returned JSON predictions object.
  2. Check whether the JSON array contains bare values or objects (see the sketch after this list).
    1. If it contains values, you know the model's output contained only one named tensor, so you can apply whatever logic you intended for "t1" on its own.
    2. If it contains objects, you know the model's output contained multiple named tensors, each carrying the names you already set (i.e. "t3", "t1", "t2").
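
A minimal sketch of that check, assuming the model is served under a hypothetical name at TF Serving's local REST endpoint:

import requests  # assumption: the model is queried over TF Serving's REST API

# Hypothetical endpoint and input; adjust to your model's name and input shape.
resp = requests.post(
    "http://localhost:8501/v1/models/my_model:predict",
    json={"instances": [[1.0, 2.0, 3.0, 4.0]]},
)
predictions = resp.json()["predictions"]

for pred in predictions:
    if isinstance(pred, dict):
        # Multiple named tensors: keys are the names set in the model
        # (e.g. "t1", "t2", "t3").
        t1 = pred["t1"]
    else:
        # Single named tensor: TF Serving unwrapped it to a bare value.
        t1 = pred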

Alternatively, if you can add an additional (arbitrary) Tensor to the output, for example by defining a specific Predict SignatureDef, you may be able to force TF Serving to return a list of objects; one field of each object will be arbitrary, but you can ignore it by name. To quote from that link:

Predict SignatureDefs also allow you to add optional additional Tensors to the outputs, that you can explicitly query. Let's say that in addition to the output key below of scores, you also wanted to fetch a pooling layer for debugging or other purposes. In that case, you would simply add an additional Tensor with a key like pool and appropriate value.

signature_def: {
  key  : "my_prediction_signature"
  value: {
    inputs: {
      key  : "images"
      value: {
        name: "x:0"
        dtype: ...
        tensor_shape: ...
      }
    }
    outputs: {
      key  : "scores"
      value: {
        name: "y:0"
        dtype: ...
        tensor_shape: ...
      }
    }
    method_name: "tensorflow/serving/predict"
  }
}
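
In Keras terms, a hypothetical version of that workaround (the "_ignore" key is my own placeholder, not part of any API) might look like this:

import tensorflow as tf
from tensorflow.keras import layers


class OutputWithDummy(layers.Layer):
    def call(self, x):
        # "_ignore" is an arbitrary second named tensor whose only job is
        # to push the output past one name, so that TF Serving (per the
        # docs quoted above) returns a list of objects instead of values.
        return {"t1": x, "_ignore": tf.zeros_like(x)}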

I am not familiar with defining or implementing such signatures, but hopefully that offers another direction to explore if the basic parsing approach above is not feasible in your situation.

Upvotes: 1
