Snehal

Reputation: 767

CuDNNLSTM and LSTM model weights loading, model.evaluate() issue

I trained a Bidirectional CuDNNLSTM text classification model, and model.evaluate(x_test, y_test, batch_size=BATCH_SIZE) gives me [('loss', 0.39137715717178684), ('acc', 0.9012292817679558)], which is what I was expecting!

Now, when I load the same weights into a new Bidirectional LSTM model (to run the model on CPU), changing only CuDNNLSTM to LSTM in my model architecture, I get [('loss', 8.747908523430075), ('acc', 0.006823506011315417)] on the same test data.

Any suggestions about this behavior? Do I have to change anything else when loading CuDNNLSTM model weights into an LSTM model? I did the same thing for another model of mine and it worked without any issues.

Also, the number of parameters seems to differ between the CuDNNLSTM and LSTM layers, even though everything else is the same!

CuDNNLSTM: [image: CuDNNLSTM]

LSTM: [image: LSTM]

Upvotes: 2

Views: 972

Answers (1)

Snehal

Reputation: 767

Issue solved! See this Keras issue thread: https://github.com/keras-team/keras/issues/9463

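The cause was the recurrent_activation argument of the LSTM layer. CuDNNLSTM uses sigmoid for its gates, while LSTM in Keras 2.x defaults to recurrent_activation='hard_sigmoid'. If you load weights from a CuDNNLSTM layer into an LSTM layer, make sure to create the LSTM with the 'sigmoid' recurrent activation, e.g. LSTM(units, recurrent_activation='sigmoid')!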
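To see why the gate activation matters so much, here is a rough, standard-library-only sketch comparing the two functions. It assumes the Keras 2.x definition of hard_sigmoid, clip(0.2 * x + 0.5, 0, 1); the helper name max_gate_gap is mine and purely illustrative, not part of Keras.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, the gate activation the trained weights expect."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation (assumed Keras 2.x form):
    clip(0.2 * x + 0.5, 0, 1)."""
    return max(0.0, min(1.0, 0.2 * x + 0.5))


def max_gate_gap(lo: float = -6.0, hi: float = 6.0, steps: int = 121):
    """Hypothetical helper: largest |sigmoid - hard_sigmoid| on a grid.

    Returns a (gap, x) tuple for the grid point with the largest gap.
    """
    xs = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max((abs(sigmoid(x) - hard_sigmoid(x)), x) for x in xs)


if __name__ == "__main__":
    # Print both gate values side by side for a few pre-activations.
    for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
        print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  hard_sigmoid={hard_sigmoid(x):.4f}")
    gap, at = max_gate_gap()
    print(f"largest gap on the grid: {gap:.4f} near x={at:+.1f}")
```

The per-gate difference looks small, but it is applied to every gate at every timestep, so weights trained against one function can produce very different outputs when run with the other.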

Upvotes: 2
