Adithya Samavedhi

Reputation: 81

Get gradients with respect to inputs in Keras ANN model

bce = tf.keras.losses.BinaryCrossentropy()
ll=bce(y_test[0], model.predict(X_test[0].reshape(1,-1)))
print(ll)
<tf.Tensor: shape=(), dtype=float32, numpy=0.04165391>
print(model.input)
<tf.Tensor 'dense_1_input:0' shape=(None, 195) dtype=float32>
model.output
<tf.Tensor 'dense_3/Sigmoid:0' shape=(None, 1) dtype=float32>
grads=K.gradients(ll, model.input)[0]
print(grads)
None

Here I have trained a neural network with two hidden layers; the input has 195 features and the output has size 1. I want to feed the network the validation instances in X_test one by one, together with their correct labels in y_test, and for each instance calculate the gradients of the output with respect to the input. However, grads prints as None. Your help is appreciated.

Upvotes: 5

Views: 1630

Answers (1)

Saleh

Reputation: 169

One can do this using tf.GradientTape. (In eager execution, K.gradients returns None here because ll is computed from model.predict, which returns a NumPy array, so the loss tensor has no recorded dependence on model.input.) I wrote the code below to learn a sine wave and get its derivative, in the spirit of this question; it should be possible to extend it to compute partial derivatives.
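Applied directly to the setup in the question, a minimal sketch could look like this (assuming X_test and y_test are NumPy arrays and model is the trained classifier from the question, and that the gradient of the per-instance loss with respect to the input features is what is wanted):

import tensorflow as tf

# one validation instance, shaped (1, 195) to match model.input
x_inst = tf.constant(X_test[0].reshape(1, -1), dtype=tf.float32)
y_true = tf.reshape(tf.constant(y_test[0], dtype=tf.float32), (1, 1))

bce = tf.keras.losses.BinaryCrossentropy()
with tf.GradientTape() as tape:
    tape.watch(x_inst)          # x_inst is a constant, so it must be watched explicitly
    y_pred = model(x_inst)      # call the model directly, not model.predict, so the tape can trace it
    loss = bce(y_true, y_pred)

grads = tape.gradient(loss, x_inst)   # shape (1, 195): d(loss)/d(each input feature)
print(grads)

The self-contained sine-wave example below spells out the same idea end to end.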
Importing the needed libraries:

import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import losses
import tensorflow as tf

Create the data:

x = np.linspace(0, 6*np.pi, 2000)
y = np.sin(x)

Defining a Keras NN:

def model_gen(Input_shape):
    X_input = Input(shape=Input_shape)
    X = Dense(units=64, activation='sigmoid')(X_input)
    X = Dense(units=64, activation='sigmoid')(X)
    X = Dense(units=1)(X)
    model = Model(inputs=X_input, outputs=X)
    return model

Training the model:

model = model_gen(Input_shape=(1,))
opt = Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.999, decay=0.001)
model.compile(loss=losses.mean_squared_error, optimizer=opt)
model.fit(x, y, epochs=200)

To obtain the gradient of the network w.r.t. the input:

x = tf.constant(x.reshape(-1, 1), dtype=tf.float32)  # shape (2000, 1), matching the model's input
with tf.GradientTape() as t:
  t.watch(x)      # x is a constant tensor, so it must be watched explicitly
  y = model(x)    # call the model directly so the tape records the forward pass

dy_dx = t.gradient(y, x)  # derivative of the output w.r.t. each input point

dy_dx.numpy()

One can further visualise dy_dx to check how smooth the derivative is. Finally, note that one gets a smoother derivative when using a smooth activation (e.g. sigmoid) instead of ReLU, as noted here.
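For example, a quick visual check (a sketch, assuming matplotlib is available) is to plot dy_dx against the analytic derivative cos(x):

import matplotlib.pyplot as plt

plt.plot(x.numpy(), dy_dx.numpy(), label='network dy/dx')
plt.plot(x.numpy(), np.cos(x.numpy()), '--', label='cos(x), analytic')
plt.legend()
plt.show()

If the training went well, the two curves should lie close to each other.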

Upvotes: 5
