Reputation: 145
I recently hit a performance bottleneck with symbolic matrix derivatives in SymPy (a single line that evaluated symbolic matrices by substitution through lambdas was taking ~90% of the program's runtime), so I decided to give Theano a go.
The SymPy code evaluated the partial derivatives with respect to the hyperparameters of a Gaussian process. There, a (1, k) matrix of SymPy symbols (MatrixSymbol) worked nicely: I could iterate over its entries and differentiate the matrix with respect to each one.
However, this doesn't carry over to Theano, and the documentation doesn't seem to explain how to do it. Indexing a symbolic vector in Theano returns a Subtensor variable, and calculating the gradient with respect to it fails.
Below is a simple (but entirely algorithmically incorrect - stripped down to the functionality I'm trying to obtain) version of what I'm attempting to do.
EDIT: I have modified the code sample to pass the data into the function as a tensor, as suggested below. I also included an alternate attempt that uses a list of separate scalar tensors, since I cannot index the values of a symbolic Theano vector, though that was also to no avail.
import theano
import numpy as np
# Sample data
data = np.array(10*np.random.rand(5, 3), dtype='int64')
# Data is passed in as the symbolic matrix x; the indexing of the symbolic vector below is the invalid part
l_scales_sym = theano.tensor.dvector('l_scales')
x = theano.tensor.dmatrix('x')
f = x/l_scales_sym
f_eval = theano.function([x, l_scales_sym], f)
df_dl = theano.gradient.jacobian(f.flatten(), l_scales_sym[0])
df_dl_eval = theano.function([x, l_scales_sym], df_dl)
The second-to-last line of the code snippet is where I try to take the partial derivative with respect to one element of the vector of 'length scale' variables, but this sort of indexing does not work on symbolic vectors.
Any help would be greatly appreciated!
Upvotes: 0
Views: 573
Reputation: 679
When using Theano, all variables should be defined as Theano tensors (or shared variables); otherwise, the variable does not become part of the computational graph. In f = data/l_scales_sym the variable data is a NumPy array. Try defining it as a tensor as well; it should work.
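For checking whatever the symbolic version ends up returning, it may help to have the expected partial derivatives written out by hand. The sketch below is plain NumPy, not Theano, and the helper names (f, df_dl, df_dl_numeric) and sample values are made up for illustration. It assumes the toy expression from the question, f = x / l_scales, and computes the derivative of f.flatten() with respect to a single length scale l_scales[j], with a finite-difference check alongside. On the Theano side, one approach that should give the same numbers is to differentiate with respect to the whole vector and index the result afterwards, i.e. take column j of theano.gradient.jacobian(f.flatten(), l_scales_sym), since l_scales_sym[0] is a new variable that f was never built from.

```python
import numpy as np


def f(x, l_scales):
    # Same toy expression as in the question: each column is divided
    # by its own length scale (broadcast over rows).
    return x / l_scales


def df_dl(x, l_scales, j):
    # Analytic partial derivative of f.flatten() with respect to l_scales[j].
    # Only column j depends on l_scales[j]:
    #   d(x[i, j] / l[j]) / dl[j] = -x[i, j] / l[j]**2
    out = np.zeros_like(x, dtype=float)
    out[:, j] = -x[:, j] / l_scales[j] ** 2
    return out.flatten()


def df_dl_numeric(x, l_scales, j, eps=1e-6):
    # Central finite difference, used only as a sanity check.
    step = np.zeros_like(l_scales)
    step[j] = eps
    diff = f(x, l_scales + step) - f(x, l_scales - step)
    return (diff / (2 * eps)).flatten()


# Hypothetical sample values: 5 data points, 3 length scales.
x = np.arange(1.0, 16.0).reshape(5, 3)
l_scales = np.array([1.0, 2.0, 4.0])

for j in range(len(l_scales)):
    assert np.allclose(df_dl(x, l_scales, j), df_dl_numeric(x, l_scales, j), atol=1e-5)
```

Each call to df_dl returns a vector with one entry per element of x, which is the shape a single column of the full Jacobian would have.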
Upvotes: 2