CuDNNLSTM and LSTM model weights loading, model.evaluate() issue

Question

I trained a Bidirectional CuDNNLSTM text classification model, and model.evaluate(x_test, y_test, batch_size=BATCH_SIZE) gives me [('loss', 0.39137715717178684), ('acc', 0.9012292817679558)], which is what I was expecting!

Now, when I load the same model weights into a new Bidirectional LSTM model (to run the model on CPU) by changing CuDNNLSTM to LSTM in my model architecture, I get [('loss', 8.747908523430075), ('acc', 0.006823506011315417)] on the same test data.

Any suggestions on what causes this? Do I have to update anything else when loading CuDNNLSTM model weights into an LSTM model? I did the same thing for another model and it worked without any issues.

Also, the number of parameters seems to differ between the CuDNNLSTM and LSTM layers, even though everything else is the same!

CuDNNLSTM: [model summary not preserved]

LSTM: [model summary not preserved]
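On the parameter counts: a gap between the two layers is expected and is not by itself a sign of a problem. In Keras 2.x, CuDNNLSTM keeps two bias vectors per gate (8 * units bias values) while LSTM keeps one (4 * units), so each CuDNNLSTM layer reports 4 * units more parameters. A rough sketch of the arithmetic follows; input_dim and units are hypothetical values chosen only for illustration, not the sizes from the model above.

```python
def lstm_param_count(input_dim, units, cudnn=False):
    """Trainable parameters of one (unidirectional) LSTM layer.

    Assumes the Keras 2.x weight layout: an input kernel, a recurrent
    kernel, and a bias, each covering the 4 gates.
    """
    kernel = input_dim * 4 * units          # input -> 4 gates
    recurrent_kernel = units * 4 * units    # hidden state -> 4 gates
    # CuDNNLSTM stores separate input and recurrent biases (8 * units);
    # the plain LSTM layer stores a single bias (4 * units).
    bias = (8 if cudnn else 4) * units
    return kernel + recurrent_kernel + bias


# Hypothetical sizes, for illustration only.
input_dim, units = 300, 128

plain = lstm_param_count(input_dim, units)
cudnn = lstm_param_count(input_dim, units, cudnn=True)

print("LSTM:     ", plain)
print("CuDNNLSTM:", cudnn)
print("extra bias parameters per direction:", cudnn - plain)
```

With a Bidirectional wrapper the difference doubles, since each direction has its own layer.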

Snehal · Accepted Answer

Issue solved! Check this thread on Keras: https://github.com/keras-team/keras/issues/9463

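The cause was the recurrent_activation argument of the LSTM layer. CuDNNLSTM always uses a sigmoid for the gates, whereas the LSTM layer in Keras 2.x defaults to recurrent_activation='hard_sigmoid'. If you load weights from a CuDNNLSTM layer into an LSTM layer, make sure to create the LSTM layer with recurrent_activation='sigmoid'!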
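To see why the mismatch matters, here is a small standard-library sketch comparing the two gate activations. It assumes the Keras 2.x definition of hard_sigmoid (0.2 * x + 0.5 clipped to [0, 1]); other versions may define it differently. Weights trained against the true sigmoid produce different gate values under the approximation, and the error compounds over every time step.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, the gate activation CuDNNLSTM uses."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x):
    """Piecewise-linear approximation, assuming the Keras 2.x definition."""
    return max(0.0, min(1.0, 0.2 * x + 0.5))


# Compare the two activations on a few pre-activation values.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    s, h = sigmoid(x), hard_sigmoid(x)
    print(f"x={x:+.1f}  sigmoid={s:.4f}  hard_sigmoid={h:.4f}  diff={abs(s - h):.4f}")
```

The two functions agree at x = 0 but drift apart elsewhere, which is enough to wreck predictions from a network whose weights were fitted to the exact sigmoid.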
