Reputation: 9701
In PyTorch I have configured SGD like this:
from torch.optim import SGD, lr_scheduler

sgd_config = {
    'params': net.parameters(),   # net is my model, defined earlier
    'lr': 1e-7,
    'weight_decay': 5e-4,
    'momentum': 0.9
}
optimizer = SGD(**sgd_config)
My requirements are: decrease the learning rate by a factor of 0.1 at epoch 30 and again at epoch 60, so over 100 epochs the learning rate is decreased twice.
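Until now I have been adjusting the learning rate by hand, roughly like this (a simplified sketch; train_one_epoch just stands in for my actual training loop):

for epoch in range(100):
    train_one_epoch(net, optimizer)      # stand-in for the real training code
    if epoch in (30, 60):                # decrease the LR at epochs 30 and 60
        for param_group in optimizer.param_groups:
            param_group['lr'] *= 0.1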
I read about the learning rate schedulers available in torch.optim.lr_scheduler,
so I decided to try one of those instead of manually adjusting the learning rate:
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=60, gamma=0.1)
However, I am getting:
Traceback (most recent call last):
File "D:\Projects\network\network_full.py", line 370, in <module>
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=90, gamma=0.1)
File "D:\env\test\lib\site-packages\torch\optim\lr_scheduler.py", line 367, in __init__
super(StepLR, self).__init__(optimizer, last_epoch, verbose)
File "D:\env\test\lib\site-packages\torch\optim\lr_scheduler.py", line 39, in __init__
raise KeyError("param 'initial_lr' is not specified "
KeyError: "param 'initial_lr' is not specified in param_groups[0] when resuming an optimizer"
I read a post here and I still don't get how I would use the scheduler for my scenario. Maybe I am just not understanding the definition of last_epoch
given that the documentation is very brief on this parameter:
last_epoch (int) – The index of last epoch. Default: -1.
Since the argument is made available to the user, and there is no explicit prohibition on using a scheduler for fewer epochs than the optimizer itself, I am starting to think it's a bug.
Upvotes: 3
Views: 7670
Reputation: 71
last_epoch must be -1 unless you are resuming training.
If you are attempting to resume training, the problem is that you created the scheduler before loading the optimizer state.
Correct procedure to resume:
import torch
from torch.optim import SGD, lr_scheduler

sgd_config = {
    'params': net.parameters(),
    'lr': 1e-7,
    'weight_decay': 5e-4,
    'momentum': 0.9
}
optimizer = SGD(**sgd_config)

# Load the optimizer state saved in the earlier run *before* creating the scheduler,
# so the restored param_groups already contain 'initial_lr' (added when the scheduler
# was first attached in that run).
optimizer.load_state_dict(torch.load('your_save_optimizer_params.pt'))

# Only now pass a non-default last_epoch (here: resuming after epoch 50).
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=50)
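For completeness, a minimal sketch of the save/restore round-trip this relies on; the scheduler checkpoint file name is just an example, and saving the scheduler state is optional but convenient:

# --- in the original training run ---
torch.save(optimizer.state_dict(), 'your_save_optimizer_params.pt')
torch.save(scheduler.state_dict(), 'your_save_scheduler_params.pt')

# --- when resuming ---
optimizer.load_state_dict(torch.load('your_save_optimizer_params.pt'))
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=50)

# Alternatively, restore the scheduler state instead of passing last_epoch:
# scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
# scheduler.load_state_dict(torch.load('your_save_scheduler_params.pt'))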
Upvotes: 7
Reputation: 1
# Put 'initial_lr' inside the parameter group itself (SGD does not accept it as a
# keyword argument), so the scheduler finds it when last_epoch != -1.
sgd_config = {
    'params': [{'params': net.parameters(), 'initial_lr': 1e-7}],
    'lr': 1e-7,
    'weight_decay': 5e-4,
    'momentum': 0.9
}
optimizer = SGD(**sgd_config)
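With 'initial_lr' present in param_groups[0], the StepLR call from the question should no longer raise the KeyError:

# the 'initial_lr' check in the scheduler now passes for param_groups[0]
scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1, last_epoch=60)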
Upvotes: 0
Reputation: 6655
You have to set last_epoch=60 as a separate statement. In diff format:
<< scheduler = lr_scheduler.StepLR(optimizer, step_size=30, last_epoch=60, gamma=0.1)
>> scheduler = lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
>> scheduler.last_epoch = 60
Use this to inspect scheduler values:
print(scheduler.state_dict())
{'step_size': 30, 'gamma': 0.1, 'base_lrs': [0.0002], 'last_epoch': 4, '_step_count': 5, 'verbose': False, '_get_lr_called_within_step': False, '_last_lr': [0.0002]}
Upvotes: 0
Reputation: 43
You have misunderstood the last_epoch argument and you are not using the correct learning rate scheduler for your requirements.
This should work:
optim.lr_scheduler.MultiStepLR(optimizer, [0, 30, 60], gamma=0.1, last_epoch=args.current_epoch - 1)
The last_epoch argument makes sure the correct LR is used when resuming training. It defaults to -1, i.e. the epoch before epoch 0.
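To put it together, a minimal sketch of the surrounding training loop, reusing the milestones above; train_one_epoch stands in for the actual training code and args.current_epoch is the epoch you are resuming from:

from torch import optim

scheduler = optim.lr_scheduler.MultiStepLR(optimizer, [0, 30, 60], gamma=0.1,
                                           last_epoch=args.current_epoch - 1)

for epoch in range(args.current_epoch, 100):
    train_one_epoch(net, optimizer)    # placeholder for the real training step
    scheduler.step()                   # advance the schedule once per epoch
    print(epoch, scheduler.get_last_lr())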
Upvotes: 1