Reputation: 137
I am using Hugging Face's Trainer.
How do I adjust the learning rate after N epochs?
For example, I have an initial learning rate set to lr=2e-6, and I would like to change the learning rate to lr=1e-6 after the first epoch and keep it there for the rest of training.
I tried this so far:
from torch.optim import AdamW
from transformers import get_linear_schedule_with_warmup

optimizer = AdamW(model.parameters(),
                  lr=2e-6,
                  eps=1e-8)

epochs = 5
batch_number = len(small_train_dataset) // 8  # optimizer updates per epoch at batch size 8
total_steps = batch_number * epochs

scheduler = get_linear_schedule_with_warmup(optimizer,
                                            num_warmup_steps=0,
                                            num_training_steps=total_steps,
                                            last_epoch=-1)
I know there is torch.optim.lr_scheduler.LambdaLR (https://pytorch.org/docs/stable/generated/torch.optim.lr_scheduler.LambdaLR.html#torch.optim.lr_scheduler.LambdaLR), but there it drops the learning rate every epoch, and that is not what I want. I want it to drop once after the first epoch and then stay there for the rest of the training process.
Upvotes: 1
Views: 2339
Reputation: 19550
You could define your own learning rate scheduler with LambdaLR from PyTorch:
import torch

initial_lr = 2e-6  # the lr set in the optimizer
num_update_steps_per_epoch = len(train_dataloader)  # train_dataloader is a PyTorch DataLoader

# multiplier on initial_lr: 1.0 (lr = 2e-6) during the first epoch,
# then 0.5 (initial_lr * 0.5 = 1e-6) for the rest of training
lambda_lr = lambda current_step: 1.0 if current_step < num_update_steps_per_epoch else 0.5

lr_scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer=optimizer,
    lr_lambda=lambda_lr,
)
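If you are using the Trainer, a custom pair like this can be handed over through its optimizers argument; a minimal sketch, assuming model, training_args, and train_dataset are already defined:

from transformers import Trainer

# pass the custom optimizer/scheduler pair instead of
# letting the Trainer build its own
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    optimizers=(optimizer, lr_scheduler),
)
trainer.train()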
Upvotes: 0
Reputation: 80
You could train in two stages: first train for one epoch with the desired initial learning rate, then create a second optimizer with the final learning rate and continue training. The result is equivalent; see the sketch below.
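A minimal sketch of this two-stage idea with the Trainer, assuming model and train_dataset already exist (the output_dir names are placeholders); lr_scheduler_type="constant" keeps the learning rate fixed within each stage:

from transformers import Trainer, TrainingArguments

# stage 1: one epoch at the initial learning rate
args_stage1 = TrainingArguments(output_dir="stage1",
                                learning_rate=2e-6,
                                num_train_epochs=1,
                                lr_scheduler_type="constant")
Trainer(model=model, args=args_stage1, train_dataset=train_dataset).train()

# stage 2: the remaining epochs at the reduced learning rate
args_stage2 = TrainingArguments(output_dir="stage2",
                                learning_rate=1e-6,
                                num_train_epochs=4,
                                lr_scheduler_type="constant")
Trainer(model=model, args=args_stage2, train_dataset=train_dataset).train()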
Upvotes: 1
Reputation: 1396
Try to use a scheduler like this:
scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps = N * len(small_train_dataset) // batch_size)
where N is the number of epochs after which you want to use the constant lr. This will increase your lr from 0 to the initial_lr specified in your optimizer over num_warmup_steps, after which it becomes constant.
"I have an initial learning rate set to lr=2e-6, and I would like to change the learning rate to lr=1e-6 after the first epoch"
Looks like the transformers schedulers don't give any option to start the warmup from a pre-specified learning rate (e.g. 2e-6 in your case). You could start the learning rate from 0 instead, as shown above.
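Putting this together for the setup in the question (batch size 8, warmup over the first epoch), a minimal sketch reusing small_train_dataset and optimizer from the question:

from transformers import get_constant_schedule_with_warmup

N = 1  # warm up over the first epoch
num_warmup_steps = N * len(small_train_dataset) // 8  # optimizer updates in N epochs

# lr ramps linearly from 0 up to the optimizer's lr over num_warmup_steps,
# then stays constant for the rest of training
scheduler = get_constant_schedule_with_warmup(optimizer,
                                              num_warmup_steps=num_warmup_steps)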
Upvotes: 0