Reputation: 1832
I have a bag of models that I trained and saved on a machine with a GPU. The model was built and trained on the GPU as follows:
model = Sequential()
model.add(CuDNNLSTM(units=30, input_shape=(None, 11), return_sequences=True, name='LAYER1'))
model.add(Dropout(.9, name='LAYER2'))
model.add(Dense(units=10, activation="relu", name='LAYER3'))
model.add(Dropout(.1, name='LAYER4'))
model.add(CuDNNLSTM(units=20, return_sequences=False, name='LAYER5'))
model.add(Dropout(.1, name='LAYER6'))
model.add(Dense(units=3, activation="linear", name='LEVEL7'))
rmsprop_opt = RMSprop(lr=learning_rate)
model.compile(loss="mse", optimizer=rmsprop_opt)
I saved the graph of the model using:
model_json_dict = json.loads(model.to_json())
json.dump(model_json_dict, open("my_model_graph.json", "w"))
I then saved the weights using a checkpoint method:
callback_checkpoint = ModelCheckpoint(filepath="model_checkpoint.h5",
                                      monitor='val_loss',
                                      verbose=1,
                                      save_weights_only=True,
                                      save_best_only=True)
callbacks = [callback_checkpoint]
And I fit the model using:
history = model.fit(feature_train,
                    label_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=validation_split,
                    callbacks=callbacks)
I would like to read the model back on a prediction machine that only has a CPU. I loaded the model and the weights on the second machine as follows, but TensorFlow complains about the CPU/GPU mismatch:
model = model_from_json(open("my_model_graph.json", "r").read())
model.load_weights("model_checkpoint.h5")
So the question is how do I convert these saved models and their weights into a form that can be reloaded into the second machine with only a CPU?
It's confusing what the proper method is. There is an SO answer that shows using the Saver() class (Tensorflow: how to save/restore a model?), another post that says it can't be done, and another that says it's transparent. What's the recommended method of converting these existing models? (Retraining them is not an option!)
Upvotes: 2
Views: 1342
Reputation: 1832
This is how I solved it. My model looks something like this:
def build_model(layers, machine, learning_rate, dropout_rate):
model = Sequential() # tf.keras.models.Sequential()
if machine == "GPU":
model.add(
CuDNNLSTM(
units=layers[1],
input_shape=(None, layers[0]),
return_sequences=True,
name='FIRST_LAYER')
)
else:
model.add(
LSTM(
units=layers[1],
input_shape=(None, layers[0]),
return_sequences=True,
name='FIRST_LAYER')
)
...
I create the initial model like this:
my_model = build_model(layers=layer_sizes, machine='GPU', learning_rate=0.003, dropout_rate=0.05)
... then train the model and save the weights using:
my_model.save_weights("my_model_weights.h5")
Now I switch to the CPU instance as follows:
my_model = build_model(layers=layer_sizes, machine='CPU', learning_rate=0.003, dropout_rate=0.05)
... then I can load the saved weights as follows:
my_model.load_weights("my_model_weights.h5")
It's a shame that the model_from_json API doesn't work across machine types!
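As an aside, if you don't want to keep a model-building function around, another workaround sometimes suggested is rewriting the layer class names inside the saved JSON graph before calling model_from_json. This is a sketch, not a guaranteed recipe: the exact location of the layer list and the config keys a plain LSTM needs vary across Keras versions, and the nested `graph` dict below is a hand-made stand-in, not a real `to_json()` dump.

```python
import json

def cudnn_to_cpu_graph(graph):
    """Rewrite CuDNNLSTM entries in a Keras model-graph dict so the model
    can be rebuilt with plain LSTM layers on a CPU-only machine.

    NOTE: the layer list may live at graph["config"]["layers"] or directly
    at graph["config"] depending on the Keras version; adjust as needed.
    """
    cfg = graph["config"]
    layers = cfg["layers"] if isinstance(cfg, dict) else cfg
    for layer in layers:
        if layer["class_name"] == "CuDNNLSTM":
            layer["class_name"] = "LSTM"
            # The CuDNN kernels hard-code tanh/sigmoid; a plain LSTM must
            # use the same activations for the saved weights to line up.
            layer["config"]["activation"] = "tanh"
            layer["config"]["recurrent_activation"] = "sigmoid"
    return graph

# Tiny hand-made example (hypothetical structure) to show the rewrite:
graph = {"class_name": "Sequential",
         "config": {"layers": [
             {"class_name": "CuDNNLSTM", "config": {"units": 30, "name": "LAYER1"}},
             {"class_name": "Dense", "config": {"units": 3, "name": "LEVEL7"}}]}}
fixed = cudnn_to_cpu_graph(graph)
print(fixed["config"]["layers"][0]["class_name"])  # LSTM
```

With the graph rewritten this way, `model_from_json(json.dumps(fixed))` should rebuild a CPU-loadable model; whether `load_weights` then succeeds still depends on your Keras version performing its CuDNNLSTM-to-LSTM weight conversion when reading the HDF5 checkpoint.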
Upvotes: 2