Jiangang yang

Reputation: 89

About autograd in PyTorch: when adding new user-defined layers, how should I make their parameters update?

Hi everyone!

My task is an optical-flow-generation problem. I have two raw images and optical-flow data as ground truth. My algorithm generates an optical flow from the raw images, and the Euclidean distance between the generated optical flow and the ground truth can be used as the loss value, so backpropagation can update the parameters.

I treat it as a regression problem, and I have two ideas now:

I can set every parameter with requires_grad=True and compute a loss, then call loss.backward() to obtain the gradients, but I don't know how to add these parameters to an optimizer so that it updates them.

I can write my algorithm as a model. If I design a "custom" model, I can initialize layers such as nn.Conv2d() and nn.Linear() in __init__() and update their parameters with torch.optim.Adam(model.parameters()). But if I define new layers myself, how should I add those layers' parameters to the collection of parameters being updated?
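For clarity, here is a minimal sketch of what I mean by the second idea; the class name and layer sizes are just placeholders, not my real algorithm:

import torch
import torch.nn as nn

# Standard pattern: layers created in __init__ are registered automatically,
# so model.parameters() already knows about their weights.
class FlowModel(nn.Module):                       # placeholder name and sizes
    def __init__(self):
        super(FlowModel, self).__init__()
        self.conv = nn.Conv2d(2, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16, 2)

model = FlowModel()
optimizer = torch.optim.Adam(model.parameters())  # this part I understand
# ...but how do I register the parameters of a layer I define myself?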

This problem has confused me for several days. Are there any good methods to update user-defined parameters? I would be very grateful if you could give me some advice!

Upvotes: 1

Views: 152

Answers (1)

Jatentaki

Reputation: 13103

Tensors have their gradients calculated if they:

  1. Have requires_grad == True
  2. Are used to compute some value (usually loss) on which you call .backward().

The gradients will then be accumulated in their .grad attribute. You can manually use them to perform arbitrary computation (including optimization). The predefined optimizers accept an iterable of parameters, and model.parameters() does just that - it returns an iterable of parameters. If you have some custom "free-floating" parameters, you can pass them as

# my_param_1, my_param_2 are free-floating tensors with requires_grad=True
my_params = [my_param_1, my_param_2]
optim = torch.optim.Adam(my_params)

and you can also merge them with other parameter iterables, like below:

# combine the module's registered parameters with the free-floating ones
model_params = list(model.parameters())
my_params = [my_param_1, my_param_2]
optim = torch.optim.Adam(model_params + my_params)
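To make this concrete, here is a minimal sketch of one training step with such a merged optimizer; the module, shapes and loss are made up purely for illustration:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)                              # any regular module
my_param_1 = torch.randn(2, requires_grad=True)      # free-floating parameters
my_param_2 = torch.randn(2, requires_grad=True)

optim = torch.optim.Adam(list(model.parameters()) + [my_param_1, my_param_2])

x = torch.randn(8, 4)
target = torch.randn(8, 2)

optim.zero_grad()                                    # clear old gradients
loss = ((model(x) + my_param_1 + my_param_2 - target) ** 2).mean()
loss.backward()                                      # gradients land in every .grad
optim.step()                                         # updates model and free parameters alike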

In practice, however, you can usually structure your code to avoid that. There is the nn.Parameter class, which wraps tensors. All subclasses of nn.Module have their __setattr__ overridden so that whenever you assign an instance of nn.Parameter as one of their attributes, it becomes part of the module's .parameters() iterable. In other words

class MyModule(nn.Module):
    def __init__(self):
        super(MyModule, self).__init__()
        # wrapping in nn.Parameter registers these with the module,
        # so they show up in self.parameters()
        self.my_param_1 = nn.Parameter(torch.tensor(...))
        self.my_param_2 = nn.Parameter(torch.tensor(...))

will allow you to write

module = MyModule()
optim = torch.optim.Adam(module.parameters())

and have the optim update module.my_param_1 and module.my_param_2. This is the preferred way to go, since it helps keep your code more structured (see the sketch after the list below):

  1. You won't have to manually include all your parameters when creating the optimizer.
  2. You can call module.zero_grad() to zero out the gradients of all its child nn.Parameters.
  3. You can call methods such as module.cuda() or module.double() which, again, work on all child nn.Parameters instead of requiring you to iterate through them manually.
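Putting it all together, here is a minimal end-to-end sketch of this pattern for a custom layer; the layer itself (a learnable per-channel scale) is just something I made up for illustration:

import torch
import torch.nn as nn

class ScaledFlow(nn.Module):                     # hypothetical custom layer
    def __init__(self, channels):
        super(ScaledFlow, self).__init__()
        # nn.Parameter registers the tensor with the module automatically
        self.scale = nn.Parameter(torch.ones(channels))

    def forward(self, x):
        return x * self.scale.view(1, -1, 1, 1)

module = ScaledFlow(2)
optim = torch.optim.Adam(module.parameters())    # includes self.scale

x = torch.randn(1, 2, 8, 8)
target = torch.randn(1, 2, 8, 8)

optim.zero_grad()
loss = ((module(x) - target) ** 2).mean()        # Euclidean-style loss
loss.backward()
optim.step()                                     # self.scale gets updated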

Upvotes: 1
