Reputation: 93
I am trying to get the derivative of the output of a Keras model with respect to the input (x) of the model (not the weights). It seems like the easiest way is to use "gradients" from keras.backend, which returns a tensor of gradients (https://keras.io/backend/). I am new to TensorFlow and not comfortable with it yet. I have got the gradient tensor and am trying to get numerical values for it for different values of the input (x). But it seems like the gradient value is independent of the input x (which it should not be), or I am doing something wrong. Any help or comment will be appreciated.
import keras
import numpy as np
import matplotlib.pyplot as plt
from keras.layers import Dense, Dropout, Activation
from keras.models import Sequential
import keras.backend as K
import tensorflow as tf
%matplotlib inline
n = 100 # sample size
x = np.linspace(0,1,n) #input
y = 4*(x-0.5)**2 #output
dy = 8*(x-0.5) #derivative of output wrt the input
model = Sequential()
model.add(Dense(32, input_dim=1, activation='relu')) # 1d input
model.add(Dense(32, activation='relu'))
model.add(Dense(1)) # 1d output
# Minimize mse
model.compile(loss='mse', optimizer='adam', metrics=["accuracy"])
model.fit(x, y, batch_size=10, epochs=1000, verbose=0)
gradients = K.gradients(model.output, model.input) #Gradient of output wrt the input of the model (Tensor)
print(gradients)
#value of gradient for the first x_test
x_test_1 = np.array([[0.2]])
sess = tf.Session()
sess.run(tf.global_variables_initializer())
evaluated_gradients_1 = sess.run(gradients[0], feed_dict={model.input: x_test_1})
print(evaluated_gradients_1)
#value of gradient for the second x_test
x_test_2 = np.array([[0.6]])
evaluated_gradients_2 = sess.run(gradients[0], feed_dict={model.input: x_test_2})
print(evaluated_gradients_2)
Output of my code:
[<tf.Tensor 'gradients_1/dense_7/MatMul_grad/MatMul:0' shape=(?, 1) dtype=float32>]
[[-0.21614937]]
[[-0.21614937]]
evaluated_gradients_1 and evaluated_gradients_2 differ between runs, but within a single run they are always equal! I expected them to be different within the same run, because they are computed for different values of the input (x). The output of the network itself seems to be correct. Here's a plot of the network output: [Output of the network vs. true value]
Upvotes: 4
Views: 7468
Reputation: 93
Here's the answer:
sess = tf.Session()
sess.run(tf.global_variables_initializer())
should be replaced by:
sess = K.get_session()
The former creates a new TensorFlow session and re-initializes all the variables; that's why it gives random values as the output of the gradient function. The latter retrieves the session that Keras has been using internally, which holds the values obtained after training.
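For completeness, here is a minimal sketch of how the gradient evaluation looks with that fix applied (assuming the trained model and the imports from the question above); the two test inputs now give different gradient values:
sess = K.get_session()  # session Keras trained in, holds the trained weights
gradients = K.gradients(model.output, model.input)  # d(output)/d(input) tensor
evaluated_gradients_1 = sess.run(gradients[0], feed_dict={model.input: np.array([[0.2]])})
evaluated_gradients_2 = sess.run(gradients[0], feed_dict={model.input: np.array([[0.6]])})
print(evaluated_gradients_1)  # roughly 8*(0.2-0.5) = -2.4 if the fit is good
print(evaluated_gradients_2)  # roughly 8*(0.6-0.5) =  0.8 if the fit is good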
Upvotes: 5