Moran Reznik

Reputation: 1371

Difference between freezing layer with requires_grad and not passing params to optim in PyTorch

Let's say I train an autoencoder. I want to freeze the parameters of the encoder for the training, so only the decoder trains.

I can do this using:

# assuming it's a single layer called 'encoder'
model.encoder.weight.requires_grad = False

Or I can pass only the decoder's parameters to the optimizer. Is there a difference?
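A minimal sketch of that second option (assuming the decoder is exposed as model.decoder):

import torch

# hand only the decoder's parameters to the optimizer,
# so the encoder's weights never receive updates
optimizer = torch.optim.Adam(model.decoder.parameters(), lr=1e-3)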

Upvotes: 1

Views: 1010

Answers (1)

Ivan

Reputation: 40708

The most practical way is to iterate over all the parameters of the module you want to freeze and set requires_grad to False. This gives you the flexibility to switch your modules on and off without having to initialize a new optimizer each time. You can do this with the parameters() generator available on every nn.Module:

for param in module.parameters():
    param.requires_grad = False

This method is model agnostic since you don't have to worry whether your module contains multiple layers or sub-modules.
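For example, freezing the encoder of the autoencoder from the question and later switching it back on could look like this (a sketch, assuming the model exposes model.encoder and model.decoder):

# freeze the encoder: its parameters keep their values but receive no gradients
for param in model.encoder.parameters():
    param.requires_grad = False

# the optimizer can still be built over all parameters;
# frozen ones simply get no gradient and are left untouched
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# ... train; only the decoder is updated ...

# later, unfreeze the encoder without creating a new optimizer
for param in model.encoder.parameters():
    param.requires_grad = True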


Alternatively, you can call nn.Module.requires_grad_ once on the whole module:

module.requires_grad_(False)
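Applied to the setup from the question, freezing the entire encoder then becomes a one-liner (again assuming it is exposed as model.encoder):

model.encoder.requires_grad_(False)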

Upvotes: 2
