Reputation: 1386
Transformers also provides its own learning-rate schedulers, such as get_constant_schedule, get_constant_schedule_with_warmup, etc. These again return a torch.optim.lr_scheduler.LambdaLR (a torch scheduler). Is warmup_steps the only difference between the two?
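For context, here is my understanding of the multiplier each helper passes to LambdaLR (plain-Python approximations of transformers' internals; the function names below are mine, not the library's):

```python
def constant_lambda(current_step):
    # get_constant_schedule: the lr multiplier is always 1.0
    return 1.0

def constant_with_warmup_lambda(current_step, num_warmup_steps):
    # get_constant_schedule_with_warmup: ramp the multiplier linearly
    # from 0 to 1 over num_warmup_steps, then hold it at 1.0
    if current_step < num_warmup_steps:
        return float(current_step) / float(max(1.0, num_warmup_steps))
    return 1.0
```

So the warmup variant only differs during the first warmup_steps steps; afterwards both behave identically.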
How can we create a custom transformers-style scheduler, similar to other torch schedulers like lr_scheduler.MultiplicativeLR, lr_scheduler.StepLR, or lr_scheduler.ExponentialLR?
Upvotes: 1
Views: 1243
Reputation: 1706
You can create a custom scheduler by writing a class with a method that takes in an optimizer and edits the learning-rate values in its param_groups.
To see how to structure such a class, look at how PyTorch implements its own schedulers, and reuse the same methods, changing only the update logic to your liking.
A permalink I found that makes a good reference is here.
Here is a template you can use:
from torch.optim import lr_scheduler
class MyScheduler(lr_scheduler._LRScheduler):  # inherit torch's base scheduler
    def __init__(self, optimizer, gamma=0.9, last_epoch=-1):
        # Store whatever your update rule needs (gamma, step size, etc.)
        # before calling the base __init__, which triggers the first step().
        self.gamma = gamma
        super(MyScheduler, self).__init__(optimizer, last_epoch)

    def get_lr(self):
        # Return one learning rate per param group; the base class writes
        # these values into optimizer.param_groups for you on each step().
        return [base_lr * self.gamma ** self.last_epoch
                for base_lr in self.base_lrs]
You can add more methods for extra functionality. Alternatively, you can skip the class entirely and use a plain function to update your learning rate: it takes in the optimizer, changes optimizer.param_groups[0]["lr"], and returns the updated optimizer.
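A minimal sketch of that function-based approach. The FakeOptimizer stand-in and the exponential_decay helper are hypothetical names, used only so the update logic runs without building a model; a real torch.optim optimizer exposes the same param_groups structure:

```python
# Hypothetical stand-in mimicking the param_groups structure of a
# real torch.optim optimizer (a list of dicts holding "lr").
class FakeOptimizer:
    def __init__(self, lr):
        self.param_groups = [{"lr": lr}]

def exponential_decay(optimizer, gamma=0.5):
    # Multiply every param group's learning rate by gamma,
    # then return the updated optimizer, as described above.
    for group in optimizer.param_groups:
        group["lr"] *= gamma
    return optimizer

opt = FakeOptimizer(lr=0.1)
exponential_decay(opt)
print(opt.param_groups[0]["lr"])  # 0.05
```

Call the function once per epoch (or per step) inside your training loop, wherever you would otherwise call scheduler.step().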
Upvotes: 1