MAltakrori

Reputation: 324

tensorflow eager gradients_function() returns error "t is not in list"

I am using TensorFlow in eager mode to calculate the derivatives of the softmax manually. Writing the code was straightforward based on the documentation TensorFlow provides, but I can't get the gradient function to work: when I run it, I get the error "t is not in list", which I googled with no results.

Here's my code:

theta = np.random.standard_normal((64, 4))
xs = np.zeros((available_states))
xs[some_index] = 1

def pi(xs, theta):
    H_s_a_Theta = np.matmul(tf.transpose(theta), xs) 
    softmax = [scipy.exp(x) for x in H_s_a_Theta]
    sum = np.sum(softmax)

    return softmax / sum

first_derv_fn = tfe.gradients_function(pi, params = 'theta') #xs:0, theta:1 
secon_derv_fn = tfe.gradients_function(first_derv_fn, params = 'theta')

I tried a toy example with X * X + Y * Y * Y and scalar inputs, and that worked fine, but the code above does not.
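
For reference, the toy case looked roughly like this (a reconstruction from memory, not the exact snippet):

def f(x, y):
    return x * x + y * y * y

df = tfe.gradients_function(f)  # no params: differentiate w.r.t. all arguments
print(df(2.0, 3.0))  # [2x, 3y^2] at (2, 3) -> [4.0, 27.0] as eager tensors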

Upvotes: 0

Views: 187

Answers (1)

ash

Reputation: 6751

There are two things going on with your sample:

  1. As per the documentation of gradients_function, params is supposed to be a sequence of parameter indices or names, but the snippet above passes a string. Since a Python string is a sequence of characters, gradients_function thinks the names of the parameters to differentiate against are 't', 'h', 'e', 't', and 'a', which don't exist, hence the error. Setting params=['theta'] instead will get over that hump (see the short illustration after this list).

  2. Additionally, it seems you're asking gradients_function to compute gradients through numpy operations, which it can't do. It can only compute gradients through TensorFlow operations. Slight tweaks to use tf operations instead of np operations in your function will address that.
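
To see point 1 concretely, here is a loose illustration of the mechanism (not gradients_function's actual lookup code):

# Iterating a string yields its characters, and looking each one up
# among the real argument names fails with the error from the question.
arg_names = ['xs', 'theta']
for p in 'theta':
    arg_names.index(p)  # raises ValueError: 't' is not in list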

The following works for me:

import tensorflow as tf
import numpy as np
tfe = tf.contrib.eager
tf.enable_eager_execution()

available_states = (64, 1)
some_index = 0

theta = np.random.standard_normal((64, 4))
xs = np.zeros(available_states)
xs[some_index] = 1

def pi(xs, theta):
    # All ops here are TensorFlow ops, so they are recorded for the backward pass.
    H_s_a_Theta = tf.matmul(tf.transpose(theta), xs)
    softmax = tf.exp(H_s_a_Theta)
    total = tf.reduce_sum(softmax)  # avoid shadowing the built-in sum()
    return softmax / total

first_derv_fn = tfe.gradients_function(pi, params=['theta'])  # xs:0, theta:1

print(first_derv_fn(xs, theta))

(The way gradient computation works is that it records all operations on Tensor objects in the forward pass, and then plays them backwards by running the backward function corresponding to each recorded operation. It does not record numpy operations.)
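
A quick way to see this (a minimal sketch under the same eager setup as above; in the versions I've tried, a disconnected gradient comes back as None):

def tf_only(x):
    return tf.reduce_sum(tf.square(x))  # recorded: square, reduce_sum

def through_numpy(x):
    # Dropping to numpy leaves TensorFlow, so nothing links x to the output.
    return tf.reduce_sum(np.square(x.numpy()))

x = tf.constant([1.0, 2.0])
print(tfe.gradients_function(tf_only)(x))        # [tensor([2., 4.])]
print(tfe.gradients_function(through_numpy)(x))  # [None]: the trace was broken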

Hope that helps.

Upvotes: 1
