Reputation: 143
I have a trained LSTM model with one LSTM layer and three Dense layers. I am using it for sequence-to-one prediction. I have 4 input variables and 1 output variable, and I use the values of the last 20 timesteps to predict the next value of the output variable. The architecture of the model is shown below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
model = Sequential()
model.add(LSTM(units=120, activation='relu', return_sequences=False, input_shape=(train_in.shape[1], 5)))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
The shapes of the training input and training output are shown below:
train_in.shape , train_out.shape
((89264, 20, 5), (89264,))
I want to calculate the Jacobian matrix for this model. Say Y = f(x1, x2, x3, x4) represents the neural network above, where Y is the output variable of the trained model, f is the function representing the model, and x1, x2, x3, x4 are the input parameters.
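Going by the shapes shown above, one input sample is a window of 20 timesteps by 5 features, so the quantity being asked for can be written out as below. This is only a sketch of the notation: the symbol X and the indices t and k are introduced here for illustration and are not part of the model code.

```latex
Y = f(X), \qquad X \in \mathbb{R}^{20 \times 5}, \qquad
J_{t,k} = \frac{\partial Y}{\partial X_{t,k}}, \quad t = 1, \dots, 20, \; k = 1, \dots, 5
```

For a single sample and a single output, the Jacobian therefore has 20 x 5 = 100 entries.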
How can I calculate the Jacobian matrix? Please share your thoughts, along with any useful references you know of.
Thank you :)
Upvotes: 0
Views: 570
Reputation: 143
I found a way to get the Jacobian matrix of the LSTM model output with respect to the input. I am posting it here in case it helps someone in the future. Please share if there is a better or simpler way to do the same.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import load_model
tf.compat.v1.enable_eager_execution() # Eager execution is required (needed on TF 1.x; it is the default in TF 2.x).
tf.executing_eagerly() # Check whether eager execution is enabled. Should give "True"
data = pd.read_excel("FileName or Location ")
# My data is in the form of a dataframe with 127549 rows and 5 columns (127549*5)
a = data[:20] #shape is (20,5)
b = data[50:70] # shape is (20,5)
A = [a,b] # making a list
A = np.array(A) # convert into array size (2,20,5)
At = tf.convert_to_tensor(A, np.float32) #convert into tensor
At.shape # TensorShape([Dimension(2), Dimension(20), Dimension(5)])
model = load_model('EKF-LSTM-1.h5') # Load the trained model
# I have a trained model which is shown in the question above.
# Output of this model is a single value
with tf.GradientTape(persistent=True, watch_accessed_variables=True) as tape:
    tape.watch(At) # At is a plain tensor, so it must be watched explicitly
    y1 = model(At) # defining your output as a function of input variables
print(y1, type(y1))
#output
tf.Tensor([[0.04251503],[0.04634088]], shape=(2, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>
jacobian=tape.jacobian(y1,At) #jacobian of output w.r.t both inputs
jacobian.shape
TensorShape([Dimension(2), Dimension(1), Dimension(2), Dimension(20), Dimension(5)])
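Here I calculated the Jacobian w.r.t. 2 inputs, each of size (20,5). To calculate it w.r.t. only one input of size (20,5), pass just that input through the model inside the tape. Slicing At after the forward pass (for example At[0]) creates a new tensor that the tape never saw, so the result would be None.
A0 = At[0:1] # first input only, shape (1,20,5)
with tf.GradientTape() as tape:
    tape.watch(A0)
    y0 = model(A0)
jacobian = tape.jacobian(y0, A0) # jacobian of output w.r.t. only the 1st input in 'At'
jacobian.shape
TensorShape([Dimension(1), Dimension(1), Dimension(1), Dimension(20), Dimension(5)])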
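Since each output row depends only on its own input window, the cross-sample blocks of the 5-D Jacobian above are all zero. A possibly simpler route, sketched here under the assumption of TF 2.x with eager execution, is to take the gradient of the output with respect to the whole batch. The small model and random inputs below are hypothetical stand-ins for the trained model and real data.

```python
import numpy as np
import tensorflow as tf

# Small stand-in model (hypothetical sizes); substitute the trained model from load_model.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Two random input windows of shape (20, 5), standing in for real data.
x = tf.constant(np.random.rand(2, 20, 5), dtype=tf.float32)

with tf.GradientTape() as tape:
    tape.watch(x)      # x is a constant tensor, so it must be watched explicitly
    y = model(x)       # shape (2, 1)

# tape.gradient differentiates the sum of y. Because sample i only affects y[i],
# slice i of the result equals d y[i] / d x[i].
per_sample_jac = tape.gradient(y, x)
print(per_sample_jac.shape)
```

This shortcut relies on the output being a single value per sample. When the output has several components, tape.batch_jacobian(y, x) is the more general tool; for the shapes here it would return a tensor of shape (2, 1, 20, 5).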
Upvotes: 1
Reputation: 111
You might want to take a look at tf.GradientTape in TensorFlow. GradientTape is a very simple way to auto-differentiate your computation, and the link has some basic examples.
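As a minimal sketch of what GradientTape does, assuming TF 2.x with eager execution, the toy function below squares each element of a vector, so its Jacobian should be diagonal with entries 2x. The function and values are chosen only for illustration.

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0, 3.0])

with tf.GradientTape() as tape:
    tape.watch(x)   # constants are not tracked automatically
    y = x * x       # elementwise square, shape (3,)

# Jacobian of y w.r.t. x: entry [i, j] is d y[i] / d x[j].
jac = tape.jacobian(y, x)
print(jac.numpy())
```

Replacing the toy function with a call to a Keras model gives the Jacobian of the model output with respect to its input in the same way.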
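However, your model is already quite big. For a single output, the Jacobian with respect to n parameters has n values (n*n would be the size of the Hessian, the matrix of second derivatives). I believe your model probably already has more than 10000 parameters, so if you differentiate with respect to the parameters you might need to make it smaller. The Jacobian with respect to one (20,5) input window has only 100 values.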
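As a rough check on the sizes involved, the parameter count of the architecture in the question can be worked out by hand. The sketch below assumes the standard Keras layout (an LSTM with four gates and one bias vector per gate, Dense layers with a bias); the helper names are made up for illustration. Note that for a single scalar output the Jacobian with respect to n parameters has n entries, while an n-by-n matrix would be the Hessian.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus bias vector.
    return input_dim * units + units

n_params = (
    lstm_params(5, 120)        # LSTM(120) on 5 features
    + dense_params(120, 100)   # Dense(100)
    + dense_params(100, 50)    # Dense(50)
    + dense_params(50, 1)      # Dense(1)
)

jac_wrt_input = 1 * 20 * 5     # one output, 20 timesteps x 5 features
jac_wrt_params = 1 * n_params  # one output, one entry per parameter

print(n_params, jac_wrt_input, jac_wrt_params)  # 77681 100 77681
```

So the Jacobian with respect to the input is small regardless of how many parameters the model has.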
Upvotes: 1