GRS

Reputation: 3084

Changing states of Keras stateful RNN model, layers and methods after creating an Estimator

What is the benefit of using tf.keras.estimator.model_to_estimator over a stand-alone keras model? For example, when we wish to serve the model in real time?

Let's take this example. I have a Keras RNN, which is a stateful model. This means that when live data comes in for prediction, I need to do the following steps:

  1. Reset the model state
  2. Set the states from our last prediction for this user (if this is an old user)
  3. Run predict(x=x) and also save the output states, for future predictions for this user.

In Keras, I do these steps using:

# Steps 1 + 2: reset the LSTM state to the states saved from this user's last prediction
old_states = [state_h, state_c]
lstm_layer = model.get_layer('lstm')
lstm_layer.reset_states(states=old_states)
# Step 3: predict and save the returned states for the next call
pred = model.predict(x=x)
new_states_to_save = [pred[1], pred[2]]

However, how does one do this procedure using an Estimator, that is, on the tf.keras.estimator.model_to_estimator(model) object?

How can I access individual layers and how can I access the .reset_states() method?

Model

num_input = tf.keras.layers.Input(shape=(None, no_of_features), name='num_input', batch_size=1)
lstm, state_h, state_c = tf.keras.layers.LSTM(units=320,
                                              return_sequences=True,
                                              return_state=True,
                                              stateful=True,
                                              name='lstm')(num_input)

dense = tf.keras.layers.Dense(1, activation='sigmoid', name='main_output')(lstm)

model = tf.keras.models.Model(num_input, [dense, state_h, state_c])

Edit: [screenshot of the Estimator's layers omitted]

Upvotes: 3

Views: 1156

Answers (2)

Szymon Maszke

Reputation: 24814

A few points about the benefits of tf.Estimator

What is the benefit of using tf.keras.estimator.model_to_estimator over a stand-alone keras model? For example, when we wish to serve the model in real time?

Well, I would rather add my two cents instead of copying documentation:

  • You can run Estimator-based models on a local host or on a distributed multi-server environment without changing your model. Furthermore, you can run Estimator-based models on CPUs, GPUs, or TPUs without recoding your model.

Well, Keras models can run on CPU and GPU "without recoding" as well. There is some truth about distributed training: if you need it, it may be worthwhile to go through the tf.Estimator hassle. Furthermore, as Tensorflow 2.0 is coming, I wouldn't count on this high-level API so much. The direction is rather clear, and Tensorflow will become more Keras- and PyTorch-oriented (with tf.Eager as the high-level API when it comes to the second framework); tf.Estimator does not really fit the bill with its function-oriented design.

  • Estimators simplify sharing implementations between model developers.

What can I say, they don't; just look into the SavedModel docs. Exporting a model created via tf.Estimator with tf.SavedModel is even more fun. Just to give you a good look at how 'easy' it is:

feature_spec = {'foo': tf.FixedLenFeature(...),
                'bar': tf.VarLenFeature(...)}

def serving_input_receiver_fn():
  """An input receiver that expects a serialized tf.Example."""
  # default_batch_size is typically None, so any batch size is accepted
  serialized_tf_example = tf.placeholder(dtype=tf.string,
                                         shape=[default_batch_size],
                                         name='input_example_tensor')
  receiver_tensors = {'examples': serialized_tf_example}
  features = tf.parse_example(serialized_tf_example, feature_spec)
  return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
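
For completeness, here is a minimal sketch of how the export call itself might look. The conversion and the export directory are assumptions on my part (model_to_estimator plus export_savedmodel on a trained Estimator), not part of the docs snippet above:

# A minimal sketch, assuming the Estimator converted from the Keras model
# has already been trained (export_savedmodel restores the latest checkpoint);
# the export directory path is hypothetical.
estimator = tf.keras.estimator.model_to_estimator(keras_model=model)
export_dir = estimator.export_savedmodel(
    export_dir_base='/tmp/exported_model',
    serving_input_receiver_fn=serving_input_receiver_fn)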

Oh, and don't forget: this documentation will not tell you how to load this model and use it afterwards (you can load it into the current session, for example, provided you know the names of the input and output nodes; so easy to share those models, love it).

  • You can develop a state of the art model with high-level intuitive code. In short, it is generally much easier to create models with Estimators than with the low-level TensorFlow APIs.

This point was covered above; indeed, tf.Estimator is more intuitive than low-level Tensorflow, but I doubt it will have much success in the face of tf.keras. Still, the passing around of three different modes and its pointlessly function-oriented design (plus all the fun with exporting) would get me saying it's a mid-level API (it's always good to have multiple APIs).

  • Estimators are themselves built on tf.keras.layers, which simplifies customization.

Well, that was tf.layers in 1.9 or 1.8, but tf.layers is being deprecated, so that's that when it comes to good practices with Tensorflow in the long run.

All in all: I'm not very much into serving (I can't waste my time on the next piece of unintuitive code with names like tf.estimator.export.build_raw_serving_input_receiver_fn), but you would be better off avoiding it if possible due to its poor design.

Predictions could probably be done with the Keras model as well, which would save you some time, but that's just my opinion.

Accessing individual layers

First of all: tf.Estimator is not like Keras models!

How can I access individual layers and how can I access the .reset_states() method?

Well, this is where the fun begins. You have to get your model within the current session (e.g. by loading the exported tf.Estimator) and iterate over the operations in the graph.

Schematically it looks something like this:

with tf.Session() as session:
    # Of course, your tag can be different
    tf.saved_model.loader.load(session, 
                               tf.saved_model.tag_constants.SERVING, 
                               "/here/is/mymodel/exported/with/SavedModel")
    graph = tf.get_default_graph()
    # Here are all the layers of your tf.Estimator, sorted in the order they exist
    # At least they were two versions back
    operations = graph.get_operations()
    # https://www.tensorflow.org/api_docs/python/tf/contrib/framework/get_variables this should work...
    variables = tf.contrib.framework.get_variables()

What can you do with those operations? They have quite readable names; maybe you could modify them that way (and reset the RNN state), as sketched below. Check here after obtaining your ops and vars.

It's a long shot though, as I have unfortunately not seen such use cases. I think that would be just about it when it comes to 'simplified customization'.
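
If you want to try it anyway, here is a hedged sketch of how a state reset might look inside the session above. The variable-name filter is a pure assumption on my part, as the exact names depend on how the model was exported:

import numpy as np

# Purely hypothetical name filter; inspect `variables` first, since the
# exact names depend on the exported graph.
state_vars = [v for v in variables if 'lstm' in v.name and 'states' in v.name]
# Zeroing the state variables mimics Keras' reset_states() without arguments
for var in state_vars:
    session.run(var.assign(np.zeros(var.shape.as_list(), dtype=np.float32)))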

Predictions

Well, this is a little easier (?): you just feed the graph inside a session after loading your model, just like in low-level Tensorflow:

output_names = "your_output_operation"
input_names = "your_input_operation"

with tf.Session() as session:

    # Of course, your tag can be different
    tf.saved_model.loader.load(session, 
                               tf.saved_model.tag_constants.SERVING, 
                               "/here/is/mymodel/exported/with/SavedModel")
    x = obtain_your_example_as_numpy_array()
    results = session.run(output_names, feed_dict={input_names: x})

From what I recall, you can specify multiple output names; this direction may be a viable solution. To obtain the input and output names, you can use the SavedModel CLI or print the operations and pick the ones specifying the input. Usually, those will be named like input_1:0 for the input (for an explanation of the naming convention you can check this) and predictions/Softmax:0 for the output (if it's a multiclass classification). Your output names will vary based on the exported model specification, exact layers and so on.
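If you prefer staying in Python instead of the SavedModel CLI, here is a heuristic sketch for listing candidate names; the Placeholder filter and the "last few ops" trick are assumptions that usually, but not always, hold:

# Heuristic sketch: placeholders are usually the feed targets, while output
# ops tend to sit near the end of the (topologically sorted) graph
for op in graph.get_operations():
    if op.type == 'Placeholder':
        print('input candidate:', op.name + ':0')
for op in graph.get_operations()[-5:]:
    print('output candidate:', op.name + ':0')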

I hope this post helps you at least a little bit.

PS. I think the best you can do is to leave tf.Estimator alone; to my knowledge it's unusable and, code-wise, looks like a bunch of dirty hacks thrown together.

Upvotes: 4

prosti

Reputation: 46409

How can I access individual layers and how can I access the .reset_states() method?

Estimators are themselves built on tf.keras.layers, which is how you should access the layers.

The Estimator API provides a high-level API over low-level core Tensorflow API. The purpose is to hide the details of graphs and sessions from the end user.
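
One hedged workaround for the state-reset question (an assumption on my part, not something the tf.keras.estimator docs promise) is to keep a handle on the original Keras model passed to the conversion, since the layers remain accessible there:

# Sketch under assumptions: the returned Estimator does not expose layers,
# but the original Keras model object still does. Note the Estimator works
# on its own cloned copy, so this reset only affects the Keras side.
estimator = tf.keras.estimator.model_to_estimator(keras_model=model)
lstm_layer = model.get_layer('lstm')
lstm_layer.reset_states()  # zeroes the stateful LSTM's states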


Why does tf.estimator exist?

  • You can run Estimator-based models on a local host or on a distributed multi-server environment without changing your model. Furthermore, you can run Estimator-based models on CPUs, GPUs, or TPUs without recoding your model.

  • Estimators simplify sharing implementations between model developers. You can develop a state of the art model with high-level intuitive code. In short, it is generally much easier to create models with Estimators than with the low-level TensorFlow APIs.

  • Estimators build the graph for you.

  • Estimators provide a safe distributed training loop that controls how and when to:

    • build the graph

    • initialize variables

    • load data

    • handle exceptions

    • create checkpoint files and recover from failures

    • save summaries for TensorBoard


For reference, the Estimator class details are handy.

Upvotes: 2
