Reputation: 99
I would like to save a TensorFlow model to ML Engine on GCP and make an online prediction.
I have successfully created the model on ML Engine; however, I am struggling to feed the input JSON string into the model.
Here is the code and data; credit goes to Jose Portilla and his TensorFlow course on Udemy.
I used the following gcloud command for prediction:
gcloud ml-engine predict --model='lstm_test' --version 'v3' --json-instances ./test.json
test.json content:
{"inputs":[1,2,3,4,5,6,7,8,9,10,11,12]}
Errors that I got:
{ "error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"You must feed a value for placeholder tensor 'Placeholder_2' with dtype float and shape [?,12,1]\n\t [[Node: Placeholder_2 = Placeholder_output_shapes=[[?,12,1]], dtype=DT_FLOAT, shape=[?,12,1], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]]\")" }
Upvotes: 0
Views: 933
Reputation: 8389
Generally speaking, using an Example proto as input is not the preferred method for the CloudML service; instead, feed a placeholder directly.
You should also create a clean serving graph, separate from the training graph, so I would suggest the following change:
import tensorflow as tf

def build_graph(x):
    # All the code shared between training and prediction, given input x.
    ...
    outputs = ...
    # Make sure both the training and the prediction graphs have a Saver.
    saver = tf.train.Saver()
    return outputs, saver

# Do training (build the graph with build_graph, train, and write checkpoints).

# Build a clean prediction graph and export it.
with tf.Graph().as_default() as prediction_graph:
    x = tf.placeholder(tf.float32, [None, num_time_steps, num_inputs])
    outputs, saver = build_graph(x)
    with tf.Session(graph=prediction_graph) as sess:
        sess.run([tf.local_variables_initializer(), tf.tables_initializer()])
        # 'latest' is the path to the latest training checkpoint, e.g.
        # tf.train.latest_checkpoint(checkpoint_dir).
        saver.restore(sess, latest)
        # This is a much simpler interface for saving models.
        tf.saved_model.simple_save(
            sess,
            export_dir=SaveModel_folder,
            inputs={"x": x},
            outputs={"y": outputs}
        )
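Before redeploying, you can sanity-check the export locally; this is just a minimal sketch assuming TF 1.x and its contrib predictor API, reusing SaveModel_folder, num_time_steps, and num_inputs from above:
import numpy as np
import tensorflow as tf

# Load the exported SavedModel back as a callable (TF 1.x contrib API).
predict_fn = tf.contrib.predictor.from_saved_model(SaveModel_folder)

# One dummy instance shaped [batch, num_time_steps, num_inputs].
dummy = np.ones((1, num_time_steps, num_inputs), dtype=np.float32)

# The keys match the names used in simple_save: "x" for inputs, "y" for outputs.
print(predict_fn({"x": dummy})["y"])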
Now, the file you use with gcloud should look something like this:
[[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
[[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]]
This sends a batch of two instances (one instance/example per line), and assumes num_inputs is 4 and num_time_steps is 3.
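If you generate that file programmatically, each line is simply json.dumps of one instance; here is a minimal sketch (the file name test.json and the example values are placeholders):
import json

# Two instances, each shaped [num_time_steps, num_inputs] = [3, 4].
instances = [
    [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
    [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]],
]

# Write one JSON array per line; this is the newline-delimited format gcloud expects.
with open("test.json", "w") as f:
    for instance in instances:
        f.write(json.dumps(instance) + "\n")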
One more important caveat: gcloud's file format is slightly different from the full body of the request you would send if you were using a traditional client (e.g. JavaScript, Python, curl, etc.). The body of the request corresponding to the same file above is:
{
  "instances": [
    [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
    [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]]
  ]
}
Basically, each line in the gcloud file becomes an entry in the "instances" array.
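For example, with the Google API Python client you could send that body like this; this is only a sketch, and the project ID, model name ('lstm_test'), and version ('v3') are assumed to match your deployment:
from googleapiclient import discovery

# Build the Cloud ML Engine client. Credentials come from the environment
# (e.g. GOOGLE_APPLICATION_CREDENTIALS or gcloud application-default login).
service = discovery.build("ml", "v1")
name = "projects/{}/models/{}/versions/{}".format("YOUR_PROJECT", "lstm_test", "v3")

body = {
    "instances": [
        [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]],
        [[2, 2, 2, 2], [2, 2, 2, 2], [2, 2, 2, 2]],
    ]
}

response = service.projects().predict(name=name, body=body).execute()
print(response.get("predictions", response))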
Upvotes: 3