Reputation: 461
I am new to PyTorch, and I am trying to run some importance sampling experiments: during an evaluation epoch, I calculate the loss for each training sample and obtain the sum of the gradients it produces. Finally, I sort the training samples by the gradients they introduced. For example, if sample A shows a very high gradient sum, it must be an important sample for training; otherwise, it is not a very important sample.
Note that the gradients calculated here will not be used to update parameters. In other words, they are only used for selecting important samples.
I know the gradients will be available somewhere after loss.backward(). But what is the easiest way to grab the summed gradients over the entire model? In my current implementation I am only allowed to modify one small module where only the loss is available, so I don’t have “inputs” or “model”. Is it possible to get the gradients from only “loss”?
Upvotes: 1
Views: 2002
Reputation: 2307
Gradients after backward are stored in the grad attribute of every tensor that requires grad. You can find all tensors involved and sum up their grads. A cleaner way might be to write a backward hook that accumulates gradients into some global variable while backpropagating.
An example is
import torch
import torch.nn as nn
model = nn.Linear(5, 3)
print(model.weight.grad) # None, since the grads have not been computed yet
print(model.bias.grad)
x = torch.randn(5, 5)  # a batch of 5 inputs with 5 features each
y = model(x)
loss = y.sum()
loss.backward()  # populates .grad for every parameter that requires grad
print(model.weight.grad)
print(model.bias.grad)
output:
None
None
tensor([[-0.6164, 1.1585, -3.4117, -4.3192, -3.7273],
[-0.6164, 1.1585, -3.4117, -4.3192, -3.7273],
[-0.6164, 1.1585, -3.4117, -4.3192, -3.7273]])
tensor([5., 5., 5.])
As you can see, you can access the gradients as param.grad. If model is a torch.nn.Module object, you can iterate over its parameters with for param in model.parameters().
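A minimal sketch of summing the gradient magnitudes over all parameters, assuming you do have a reference to the model (abs().sum() is just one possible scoring choice; a squared norm would also work):
import torch
import torch.nn as nn

model = nn.Linear(5, 3)
x = torch.randn(5, 5)
loss = model(x).sum()

model.zero_grad()  # clear any stale gradients from a previous backward
loss.backward()

# sum of absolute gradient values over every parameter of the model
grad_sum = sum(p.grad.abs().sum().item()
               for p in model.parameters() if p.grad is not None)
print(grad_sum)
You would call model.zero_grad() before each sample so the per-sample gradient sums do not mix.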
Maybe you can also work with backward hooks, but I am not familiar enough with them to give a code example; a rough, untested sketch of the idea follows.
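A minimal sketch, assuming the hooks can be registered once wherever the parameters are created (Tensor.register_hook calls the given function with the tensor's gradient during every backward pass):
import torch
import torch.nn as nn

grad_total = 0.0  # global accumulator; reset it before each sample

def accumulate(grad):
    # autograd calls this with the gradient of the hooked tensor during backward
    global grad_total
    grad_total += grad.abs().sum().item()

model = nn.Linear(5, 3)
for p in model.parameters():
    p.register_hook(accumulate)  # registered once, fires on every backward pass

x = torch.randn(5, 5)
loss = model(x).sum()
loss.backward()
print(grad_total)
Since the accumulation happens inside the hooks, the module that only sees the loss just has to call loss.backward() and read the global afterwards.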
Upvotes: 2