Reputation: 561
The documentation is pretty vague and there aren't any code examples showing how to use it. The documentation for it says:
Add a param group to the Optimizer's param_groups.
This can be useful when fine tuning a pre-trained network as frozen layers can be made trainable and added to the Optimizer as training progresses.
Parameters: param_group (dict) – Specifies what Tensors should be optimized along with group specific optimization options.
I am assuming I can build the param_group argument from the values I get from a model's state_dict(), e.g. all the actual weight tensors? I am asking because I want to make a progressive network, which means I need to constantly feed Adam the parameters of newly created convolution and activation modules.
Upvotes: 13
Views: 15059
Reputation: 4513
Per the docs, the add_param_group method accepts a param_group parameter that is a dict. Example of use:
import torch
import torch.optim as optim

# two leaf tensors that require gradients
w1 = torch.randn(3, 3)
w1.requires_grad = True
w2 = torch.randn(3, 3)
w2.requires_grad = True

# the optimizer initially tracks only w1
o = optim.Adam([w1])
print(o.param_groups)
gives
[{'amsgrad': False,
'betas': (0.9, 0.999),
'eps': 1e-08,
'lr': 0.001,
'params': [tensor([[ 2.9064, -0.2141, -0.4037],
[-0.5718, 1.0375, -0.6862],
[-0.8372, 0.4380, -0.1572]])],
'weight_decay': 0}]
now
o.add_param_group({'params': w2})
print(o.param_groups)
gives:
[{'amsgrad': False,
'betas': (0.9, 0.999),
'eps': 1e-08,
'lr': 0.001,
'params': [tensor([[ 2.9064, -0.2141, -0.4037],
[-0.5718, 1.0375, -0.6862],
[-0.8372, 0.4380, -0.1572]])],
'weight_decay': 0},
{'amsgrad': False,
'betas': (0.9, 0.999),
'eps': 1e-08,
'lr': 0.001,
'params': [tensor([[-0.0560, 0.4585, -0.7589],
[-0.1994, 0.4557, 0.5648],
[-0.1280, -0.0333, -1.1886]])],
'weight_decay': 0}]
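To connect this to the progressive-network use case from the question: you don't need to go through state_dict() at all; you can hand add_param_group the parameters of a freshly created module directly. Here is a minimal sketch, where the new_conv layer, its sizes, and the separate learning rate are hypothetical illustrations rather than anything from the original post:

import torch
import torch.nn as nn
import torch.optim as optim

# initial model and optimizer (layer sizes are arbitrary for illustration)
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU())
opt = optim.Adam(model.parameters(), lr=1e-3)

# later, grow the network with a new convolution layer
new_conv = nn.Conv2d(8, 16, 3)
model.add_module('new_conv', new_conv)

# register the new layer's parameters with Adam,
# optionally with its own learning rate
opt.add_param_group({'params': list(new_conv.parameters()), 'lr': 1e-4})

print(len(opt.param_groups))  # 2: the original group plus the new one

The new group only starts being updated once its parameters receive gradients, so layers added this way are trained from the step at which they are registered.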
Upvotes: 21