Dadeslam

Reputation: 201

Gradient of neural network with respect to inputs

I am working on a NN in PyTorch which simply maps points from the plane to real numbers, for example:

import torch.nn as nn
model = nn.Sequential(nn.Linear(2, 2), nn.ReLU(), nn.Linear(2, 1))

What I want to do, since this network defines a map h: R^2 -> R, is to compute the gradient of this mapping h inside the training loop. So for example:

for it in range(epochs):
    pred = model(X_train)
    grad = torch.autograd.grad(pred, X_train)
    ....

The training set has been defined as a tensor that requires gradients. My problem is that, even though the output at each fixed point is a scalar, I am propagating a set of N = 100 points, so the output is actually an N×1 tensor. This leads to the error: autograd can only compute gradients of scalar functions.
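A minimal reproduction of the setup (with random dummy data standing in for my actual training set):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 2), nn.ReLU(), nn.Linear(2, 1))
X_train = torch.randn(100, 2, requires_grad=True)  # N = 100 points in the plane

pred = model(X_train)                      # shape (100, 1), not a scalar
grad = torch.autograd.grad(pred, X_train)  # RuntimeError: grad can be implicitly
                                           # created only for scalar outputs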

In fact, with the small change

pred = torch.sum(model(X_train))

everything works perfectly. However, I am interested in all the individual gradients, so is there a way to compute them all together?

Computing the sum as presented above gives exactly the result I expect, of course, but I wanted to know whether this is the only possibility.

Upvotes: 1

Views: 1149

Answers (1)

Umang Gupta

Reputation: 16470

There are other possibilities, but using .sum() is the simplest way. Calling .sum() on the output vector and then computing dpred/dinput will give you the desired result. Here is why:

Since pred = sum(f(xi)), where i is the index of the input points x,

dpred/dinput will be a matrix [dpred/dx0, dpred/dx1, ...].

Consider dpred/dx0: it is equal to df(x0)/dx0, since every other term df(xi)/dx0 (with i != 0) is 0.
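To make this concrete, here is a minimal sketch with dummy data (shapes match the question's setup of N = 100 points in the plane):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(2, 2), nn.ReLU(), nn.Linear(2, 1))
X_train = torch.randn(100, 2, requires_grad=True)

pred = model(X_train).sum()               # scalar, so autograd.grad works
(grads,) = torch.autograd.grad(pred, X_train)
print(grads.shape)                        # torch.Size([100, 2]): row i is df(xi)/dxi

Each row of grads is the gradient of the network output at the corresponding input point, exactly because of the cancellation argument above.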

PS: Please excuse the crappy mathematical expressions... SO does not support latex/math expressions.
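For completeness, here is a sketch of one of those other possibilities (not shown above, and assuming the same model and X_train as in the previous snippet): torch.autograd.grad accepts a grad_outputs argument, and passing a vector of ones computes the same vector-Jacobian product without the explicit .sum():

pred = model(X_train)                     # shape (N, 1), no .sum() needed
(grads,) = torch.autograd.grad(pred, X_train, grad_outputs=torch.ones_like(pred))
# grads again has shape (N, 2), one gradient per input point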

Upvotes: 0
