kkgarg

Reputation: 1386

Difference between transformers schedulers and Pytorch schedulers

Transformers also provides its own learning-rate schedulers, such as get_constant_schedule, get_constant_schedule_with_warmup, etc. These in turn return a torch.optim.lr_scheduler.LambdaLR (a torch scheduler). Is warmup_steps the only difference between the two?
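For reference, a minimal check that these helpers really return a LambdaLR (assumes transformers and torch are installed; the single-parameter optimizer is just for illustration):

import torch
from transformers import get_constant_schedule_with_warmup

optimizer = torch.optim.AdamW([torch.nn.Parameter(torch.zeros(1))], lr=1e-3)
scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps=100)
print(type(scheduler))  # <class 'torch.optim.lr_scheduler.LambdaLR'>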

How can we create a custom transformer-based scheduler similar to other torch schedulers like lr_scheduler.MultiplicativeLR, lr_scheduler.StepLR, lr_scheduler.ExponentialLR?

Upvotes: 1

Views: 1243

Answers (1)

SarthakJain

Reputation: 1706

You can create a custom scheduler by writing a class that takes in an optimizer and a method that edits the learning-rate values in its param_groups.

To understand how to structure this class, take a look at how PyTorch implements its own schedulers and reuse the same methods, changing the functionality to your liking.

A good reference is the PyTorch scheduler source in torch/optim/lr_scheduler.py.

EDIT After comments:

Here is a template you can use:

from torch.optim import lr_scheduler

class MyScheduler(lr_scheduler._LRScheduler):  # inheriting from _LRScheduler is optional but convenient
    def __init__(self, optimizer, gamma=0.9, last_epoch=-1, verbose=False):
        # Store whatever state your update rule needs (gamma, step size, etc.)
        # before calling super().__init__, since it triggers an initial step().
        self.gamma = gamma
        super().__init__(optimizer, last_epoch, verbose)

    def get_lr(self):
        # Use the stored variables to compute the new learning rates.
        # Return one value per param group; the base class's step() writes
        # them back into self.optimizer.param_groups for you.
        return [base_lr * self.gamma ** self.last_epoch
                for base_lr in self.base_lrs]
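For context, a sketch of how this template might be driven in a training loop (the model and gamma value are just illustrative):

import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = MyScheduler(optimizer, gamma=0.9)

for epoch in range(5):
    # ... forward pass, loss.backward() ...
    optimizer.step()
    scheduler.step()  # applies get_lr() to every param group
    print(scheduler.get_last_lr())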

You can add more methods for extra functionality. Or you can skip the class entirely and just use a function to update your learning rate: it takes in the optimizer, changes optimizer.param_groups[0]["lr"], and returns the new optimizer.
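A minimal sketch of that function-based alternative (the halving rule is just an illustrative choice):

def update_lr(optimizer, factor=0.5):
    # Scale the learning rate of every param group by `factor`.
    for group in optimizer.param_groups:
        group["lr"] *= factor
    return optimizer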

Upvotes: 1
