Reputation: 921
Given a vector v of length, say, 30, can auto-differentiation tools such as Theano or TensorFlow take the gradient of something like this:
x = np.random.rand(5, 1)
v = f(x, z)                    # f, g, z are placeholders
w = v[0:25].reshape(5, 5)      # first 25 entries become a 5x5 matrix
y = g(np.matmul(w, x) + v[25:30])
minimize ( || y - x || )       # pseudocode for the objective
Would this even make sense? The way I picture it, I would have to multiply by identity vectors/matrices with trailing zeros to convert v into w.
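For concreteness, here is a rough numpy sketch of what I mean (purely illustrative):
import numpy as np

v = np.random.rand(30)

# Selection matrix S picks out the first 25 entries of v; the
# reshape afterwards only re-indexes them, so the whole v -> w
# map is linear and hence differentiable.
S = np.eye(25, 30)
w = (S @ v).reshape(5, 5)

assert np.allclose(w, v[0:25].reshape(5, 5))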
Upvotes: 1
Views: 1486
Reputation: 101
Let us take a look at the source code:
@ops.RegisterGradient("Reshape")
def _ReshapeGrad(op, grad):
  return [array_ops.reshape(grad, array_ops.shape(op.inputs[0])), None]
This is the gradient TensorFlow registers for the Reshape op: the incoming gradient is simply reshaped back to the shape of the original input (the None is for the shape argument, which has no gradient).
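In other words, the backward pass for a reshape is itself just a reshape. A rough numpy analogue of that rule (the function name is mine, for illustration):
import numpy as np

def reshape_vjp(grad_out, input_shape):
    # Reshape only re-indexes elements, so the backward pass
    # simply reshapes the incoming gradient back to the shape
    # of the original input.
    return grad_out.reshape(input_shape)

grad_out = np.ones((5, 5))              # gradient w.r.t. the (5, 5) output
grad_in = reshape_vjp(grad_out, (25,))  # gradient w.r.t. the length-25 input
assert grad_in.shape == (25,)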
Upvotes: 0
Reputation: 57973
Slice and reshape operations fit into the standard reverse-mode AD framework in the same way as any other op. Below is a simple TensorFlow program similar to the example you gave (I had to change a couple of things to make the dimensions match); calling tf.gradients then builds the computation graph for the gradient.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x graph-mode code

def f(x, z):
    """Adds the values together and reshapes them into a vector."""
    return tf.reshape(x + z, (5,))

x = tf.Variable(np.random.rand(5, 1))
z = tf.Variable(np.random.rand(5, 1))
v = f(x, z)
w = tf.slice(v, [0], [5])   # begin and size must be lists
w = tf.reshape(w, (5, 1))
y = tf.matmul(w, tf.transpose(x)) + tf.slice(v, [0], [5])
cost = tf.square(tf.reduce_sum(y - x))
print(tf.gradients(cost, [x, z]))
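For completeness, the gradient of a slice follows the same pattern: the backward pass pads the incoming gradient with zeros wherever the slice dropped elements. A rough numpy analogue (names are mine):
import numpy as np

def slice_vjp(grad_out, begin, input_len):
    # Entries the slice kept receive the incoming gradient;
    # everything the slice dropped gets zero gradient.
    grad_in = np.zeros(input_len)
    grad_in[begin:begin + len(grad_out)] = grad_out
    return grad_in

grad_out = np.ones(5)               # gradient w.r.t. v[0:5]
print(slice_vjp(grad_out, 0, 30))   # zeros except the first 5 entries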
Upvotes: 4