How to use weigths extracted from a LSTM recurrent neural network

Question

I have trained an LSTM recurrent neural network for sequence (time series) classification with Keras in Python.

Features are collated in a shape (batch_size, timesteps, data_dim). My training examples are 1000 in total. The final goal is to have a classification among 5 classes. Here is a snippet of my code.

#defining some model features
data_dim = 15
timesteps = 20
num_classes = len(one_hot_train_labels[1,:])
batch_size = len(ytrain) 

#reshaping array for LSTM training
xtrain=numpy.reshape(xtrain, (len(ytrain), timesteps, data_dim))
xtest=numpy.reshape(xtest, (len(ytest), timesteps, data_dim))

rms = optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0) #It is recommended to leave the parameters
#of this optimizer at their default values (except the learning rate, which can be freely tuned).

# create the model
model = Sequential()
model.add(LSTM(101, dropout=0.5, recurrent_dropout=0.5, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms, metrics=['accuracy'])
print(model.summary())
history = model.fit(xtrain, one_hot_train_labels, epochs=200, batch_size=10)
# Final evaluation of the model
scores = model.evaluate(xtrain, one_hot_train_labels, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
scores = model.evaluate(xtest, one_hot_test_labels, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Since I want to use and implement the classifier elsewhere, I have extracted the weights using:

weights = [model.get_weights() for layer in model.layers]

Having used traditional neural networks and logistic regression in the past, I was expecting 2 matrices for each layer, one with the polynomial weights and one with the bias units, and then using the activation function (in this case tanh and softmax function) to progressively find the probabilities of falling in one of the 5 classes.

But I am now confused because calling the weights returns me 5 matrices with the following sizes:

(15, 400)
(100, 400)
(400,)
(100,5)
(5,)

Now, I understand the LSTM works with 4 different blocks:

input from vector
memory from previous block
memory from current block
output from previous block

and hence why I have a size of 400 for the 2n dimension of my matrices.

And now my questions:

How can I use those weights in a cascade way to eventually get the class probabilities, using the activation functions (as in traditional neural networks)?

where is the bias unit for the input layer?

Thanks to everyone can help to clarify and help in understanding how to use this powerful tool as LSTM networks.

Hope this would be helpful not just for me.

How to use weigths extracted from a LSTM recurrent neural network

Answers (1)

Related Questions