G. Gare

Reputation: 267

Torch automatic differentiation for matrix defined with components of vector

The title is quite self-explanatory. I have the following:

import torch

x = torch.tensor([3., 4.], requires_grad=True)
A = torch.tensor([[x[0], x[1]],
                  [x[1], x[0]]], requires_grad=True)

f = torch.norm(A)
f.backward()

I would like to compute the gradient of f with respect to x, but if I type x.grad I just get None. If I use the more explicit command torch.autograd.grad(f, x) instead of f.backward(), I get

RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

Upvotes: 0

Views: 534

Answers (1)

Prezt

Reputation: 156

The problem might be that when you take a slice of a leaf tensor, the result is a non-leaf tensor, like so:

>>> x.is_leaf
True
>>> x[0].is_leaf
False

So what's happening is that it isn't x that gets added to the graph, but x[0] instead.
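
For illustration, after running the code from the question, the gradient ends up on A (the new leaf tensor that torch.tensor created) rather than on x, which you can check like so:

>>> x.grad is None
True
>>> A.grad
tensor([[0.4243, 0.5657],
        [0.5657, 0.4243]])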

Try this instead:

>>> import torch
>>> 
>>> x = torch.tensor([3., 4.], requires_grad=True)
>>> xt = torch.empty_like(x).copy_(x.flip(0))
>>> A = torch.stack([x,xt])
>>> 
>>> f = torch.norm(A)
>>> f.backward()
>>> 
>>> x.grad
tensor([0.8485, 1.1314])

The difference is that PyTorch knows to add x to the graph, so f.backward() populates its gradient. Here you'll find a few different ways of copying tensors and the effect each has on the graph.
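
If you prefer to keep the indexing notation from the question, one alternative sketch (not part of the original answer, but the same idea) is to build A with torch.stack from the indexed components, which also keeps x in the graph:

>>> import torch
>>>
>>> x = torch.tensor([3., 4.], requires_grad=True)
>>> # stacking the 0-dim slices preserves the autograd connection to x
>>> A = torch.stack([torch.stack([x[0], x[1]]),
...                  torch.stack([x[1], x[0]])])
>>>
>>> f = torch.norm(A)
>>> f.backward()
>>>
>>> x.grad
tensor([0.8485, 1.1314])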

Upvotes: 1
