Michele

Reputation: 69

How to use weights extracted from an LSTM recurrent neural network

I have trained an LSTM recurrent neural network for sequence (time series) classification with Keras in Python.

Features are collated into an array of shape (batch_size, timesteps, data_dim). I have 1000 training examples in total. The final goal is a classification among 5 classes. Here is a snippet of my code.

# imports assumed by this snippet
import numpy
from keras.models import Sequential
from keras.layers import Dense, LSTM
from keras import optimizers

# defining some model features
data_dim = 15
timesteps = 20
num_classes = len(one_hot_train_labels[1, :])
batch_size = len(ytrain)  # note: unused below, model.fit() is called with batch_size=10

# reshaping arrays for LSTM training
xtrain = numpy.reshape(xtrain, (len(ytrain), timesteps, data_dim))
xtest = numpy.reshape(xtest, (len(ytest), timesteps, data_dim))

# It is recommended to leave the parameters of this optimizer at their default
# values (except the learning rate, which can be freely tuned).
rms = optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)

# create the model
model = Sequential()
model.add(LSTM(101, dropout=0.5, recurrent_dropout=0.5, input_shape=(timesteps, data_dim), activation='tanh'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer=rms, metrics=['accuracy'])
print(model.summary())
history = model.fit(xtrain, one_hot_train_labels, epochs=200, batch_size=10)
# Final evaluation of the model
scores = model.evaluate(xtrain, one_hot_train_labels, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
scores = model.evaluate(xtest, one_hot_test_labels, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

Since I want to use and implement the classifier elsewhere, I have extracted the weights using:

weights = model.get_weights()  # flat list of all weight arrays; the list comprehension I first wrote, [model.get_weights() for layer in model.layers], just repeats this full list once per layer

Having used traditional neural networks and logistic regression in the past, I was expecting two matrices for each layer: one with the weights and one with the bias units. The activation functions (in this case tanh and softmax) would then be applied layer by layer to progressively obtain the probabilities of falling into one of the 5 classes.

But I am now confused, because inspecting those weights returns 5 matrices with the following sizes:
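(The exact sizes got lost above, but for the model in the snippet, with data_dim = 15, 101 LSTM units, and 5 classes, these are the shapes I believe Keras should return, a sketch assuming Keras's standard weight ordering for an LSTM layer followed by a Dense layer:)

```python
# Expected shapes of the 5 arrays returned by model.get_weights(),
# assuming Keras's standard layout: LSTM kernel, LSTM recurrent kernel,
# LSTM bias, Dense kernel, Dense bias.
data_dim, units, n_classes = 15, 101, 5

expected_shapes = [
    (data_dim, 4 * units),  # LSTM kernel: input -> the 4 stacked gates
    (units, 4 * units),     # LSTM recurrent kernel: previous hidden state -> 4 gates
    (4 * units,),           # LSTM bias (this is where the "missing" bias lives)
    (units, n_classes),     # Dense kernel
    (n_classes,),           # Dense bias
]

for shape in expected_shapes:
    print(shape)
```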

Now, I understand the LSTM works with 4 different blocks:

  1. input from vector
  2. memory from previous block
  3. memory from current block
  4. output from previous block

and hence why the 2nd dimension of those matrices is 4 × 101 = 404 (the four blocks of 101 units each, stacked side by side).

And now my questions:

How can I use those weights in a cascade way to eventually get the class probabilities, using the activation functions (as in traditional neural networks)?

Where is the bias unit for the input layer?
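For reference, here is a minimal NumPy sketch of the cascade I have in mind (the names `lstm_forward`, `sigmoid`, and `softmax` are mine). It assumes Keras's gate ordering (input, forget, cell, output) and recurrent_activation='sigmoid'; older Keras versions default to 'hard_sigmoid', which would need to be swapped in. Dropout is inactive at prediction time, so it does not appear here:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lstm_forward(x, weights):
    """Class probabilities for one sequence x of shape (timesteps, data_dim)."""
    # order as returned by model.get_weights() for LSTM + Dense
    W, U, b, W_out, b_out = weights
    units = U.shape[0]
    h = np.zeros(units)  # hidden state ("output from previous block")
    c = np.zeros(units)  # cell state ("memory from previous block")
    for x_t in x:
        # all 4 gates computed at once; columns are stacked [i, f, c, o]
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:units])              # input gate
        f = sigmoid(z[units:2 * units])     # forget gate
        g = np.tanh(z[2 * units:3 * units]) # candidate memory (activation='tanh')
        o = sigmoid(z[3 * units:])          # output gate
        c = f * c + i * g                   # memory of current block
        h = o * np.tanh(c)
    # final Dense layer with softmax, as in traditional networks
    return softmax(h @ W_out + b_out)
```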

Thanks to everyone who can help clarify how to use a tool as powerful as LSTM networks.

I hope this will be helpful not just for me.

Upvotes: 0

Views: 87

Answers (1)

nuric

Reputation: 11225

I'm guessing that by "get the class probabilities" you mean the class predictions? You can use model.predict() to get the class probabilities after you have trained the network. Preferably you would model.save_weights(filename) after training and model.load_weights(filename) when you want to predict. Input layers don't have a bias; you can see how many parameters each of your layers has with model.summary().

Upvotes: 0
