GeneC

Reputation: 145

How to use Pytorch OneCycleLR in a training loop (and optimizer/scheduler interactions)?

I'm training an NN and using RMSprop as an optimizer and OneCycleLR as a scheduler. I've been running it like this (in slightly simplified code):

    optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001, alpha=0.99, eps=1e-08,
                                    weight_decay=0.0001, momentum=0.0001, centered=False)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.0005, epochs=epochs)

    for epoch in range(epochs):
        model.train()
        for counter, (images, targets) in enumerate(train_loader):

            # clear gradients from last run
            optimizer.zero_grad()

            # Run forward pass through the mini-batch
            outputs = model(images)

            # Calculate the losses
            loss = loss_fn(outputs, targets)

            # Calculate the gradients
            loss.backward()

            # Update parameters
            optimizer.step()   # Optimizer before scheduler????
            scheduler.step()

            # Check loss on training set
            test()

Note the optimizer and scheduler calls in each mini-batch. This works, but when I plot the learning rates over the course of training, the curve is very bumpy. I checked the docs again, and this is the example shown for torch.optim.lr_scheduler.OneCycleLR:

>>> data_loader = torch.utils.data.DataLoader(...)
>>> optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
>>> scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=0.01, steps_per_epoch=len(data_loader), epochs=10)
>>> for epoch in range(10):
>>>     for batch in data_loader:
>>>         train_batch(...)
>>>         scheduler.step()

Here, they omit optimizer.step() from the training loop. I thought that made sense: the optimizer is passed to OneCycleLR at initialization, so the scheduler must be taking care of it behind the scenes. But dropping optimizer.step() gets me this warning:

UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.

Should I ignore that and trust the pseudocode in the docs? Well, I tried: the model didn't learn anything, so the warning was right and I put optimizer.step() back in.

This gets at the real point: I don't really understand how the optimizer and scheduler interact (edit: how the learning rate in the optimizer interacts with the learning rate in the scheduler). I see that the optimizer is usually stepped every mini-batch and the scheduler every epoch, but for OneCycleLR they want you to step it every mini-batch too.
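To illustrate what I mean, here is a tiny standalone sketch (toy model, no data or loss, just watching the learning rate). My current understanding is that the lr I pass to RMSprop is only the initial value, and OneCycleLR overwrites optimizer.param_groups[0]['lr'] on every scheduler.step() call:

    import torch

    # Toy model just to have some parameters; no data or backward pass involved.
    model = torch.nn.Linear(10, 2)
    optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=0.0005, epochs=3, steps_per_epoch=100)

    lrs = []
    for step in range(3 * 100):
        optimizer.step()                              # optimizer first...
        scheduler.step()                              # ...then the scheduler sets the LR for the next step
        lrs.append(optimizer.param_groups[0]['lr'])   # same value as scheduler.get_last_lr()[0]

    # lrs now traces the one-cycle curve: ramp up to max_lr, then anneal back down.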

Any guidance (or a good tutorial article) would be appreciated!

Upvotes: 6

Views: 10801

Answers (1)

akshayk07

Reputation: 2190

Use optimizer.step() before scheduler.step(). Also, for OneCycleLR, you need to call scheduler.step() after every batch - source (PyTorch docs). So your training code is correct as far as calling step() on the optimizer and scheduler is concerned.

Also, in the example you mentioned, they pass the steps_per_epoch parameter, but you haven't done so in your training code; this is also covered in the docs. That might be what is causing the bumpy learning-rate curve in your code.
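A rough sketch of that fix, assuming model, loss_fn, train_loader, and epochs are defined as in your question (the RMSprop hyperparameters are just copied from your snippet):

    optimizer = torch.optim.RMSprop(model.parameters(), lr=0.00001, alpha=0.99, eps=1e-08,
                                    weight_decay=0.0001, momentum=0.0001, centered=False)
    # Tell the scheduler how many batches make up an epoch, so the one-cycle
    # curve spans exactly epochs * len(train_loader) scheduler.step() calls.
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=0.0005, steps_per_epoch=len(train_loader), epochs=epochs)

    for epoch in range(epochs):
        model.train()
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)
            loss.backward()
            optimizer.step()    # update the parameters first
            scheduler.step()    # then advance the learning-rate schedule, once per batch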

Upvotes: 8
