Reputation: 2264
PyTorch version 0.3.1
EDIT: I am rewriting this question to be simpler, as I've narrowed down the bug.
I have some variables:
import torch
import torch.autograd as ag

x = ag.Variable(torch.ones(1, 1), requires_grad=True)
y = ag.Variable(torch.ones(1, 1), requires_grad=True)
z = ag.Variable(torch.ones(1, 1), requires_grad=True)
I then create a variable representing their concatenation:
w = torch.cat([x, y, z])
f = x + y + z
Then I try to take derivatives:
ag.grad(f, x, retain_graph=True, create_graph=True)
This is fine and returns 1, as expected. Same for y and z.
However,
ag.grad(f, w, retain_graph=True, create_graph=True)
returns an error: RuntimeError: differentiated input is unreachable
Of course that makes sense: w is not explicitly used in the definition of f. However, I'd like behavior where a single line of code can produce something like [1; 1; 1] as output.
Let’s say I wanted to conveniently batch my variables together, and then take the gradient of the whole shebang at once, rather than processing variables independently (which can make bookkeeping a nightmare). Is there any way to get the outcome I desire?
Upvotes: 1
Views: 941
Reputation: 2751
Does something like this work, or do you want to keep f = x + y + z?
w = torch.cat([x, y, z])
f = w[0] + w[1] + w[2]   # build f from w itself so that w is actually part of the graph
print(ag.grad(f, w, retain_graph=True, create_graph=True))
# output (tensor([[ 1.],[ 1.],[ 1.]]),)
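If you would rather keep f = x + y + z with x, y, and z as separate leaf variables, another option is to pass all three variables to ag.grad in a single call and concatenate the per-input gradients yourself. Here is a minimal sketch, assuming the same torch.autograd-as-ag setup from the question:
import torch
import torch.autograd as ag

x = ag.Variable(torch.ones(1, 1), requires_grad=True)
y = ag.Variable(torch.ones(1, 1), requires_grad=True)
z = ag.Variable(torch.ones(1, 1), requires_grad=True)

f = x + y + z

# grad accepts a sequence of inputs and returns one gradient per input
grads = ag.grad(f, [x, y, z], retain_graph=True, create_graph=True)

# stack the three 1x1 gradients into a single 3x1 result
print(torch.cat(grads))
# a 3x1 column of ones, i.e. the [1; 1; 1] output you were after
This keeps the bookkeeping in one place: x, y, and z stay independent leaves, and the concatenation happens only on the gradients rather than on the inputs.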
Upvotes: 2