Jing Gu

Reputation: 519

What is the difference between model.training = False and param.requires_grad = False

What is the difference between these two:

model.training = False

and

for param in model.parameters():
    param.requires_grad = False

Upvotes: 1

Views: 1823

Answers (1)

Anubhav Singh

Reputation: 8709

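model.training = False sets the flag that puts a module in evaluation mode. Assigning the attribute directly changes only the top-level module, so in practice you call model.eval() (equivalent to model.train(False)), which sets the flag on the module and all of its submodules, i.e.,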

if model.training:
    print("train mode")       # dropout active, batchnorm uses batch statistics
else:
    print("evaluation mode")  # dropout disabled, batchnorm uses running statistics

So layers such as dropout and batchnorm, which behave differently during training and evaluation, check this flag and act accordingly. The flag has no effect on whether gradients are computed.
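As a rough sketch of the point about evaluation mode, the snippet below uses a small hypothetical nn.Sequential model (not from the question) to show that assigning model.training only touches the top-level module, while model.eval() also reaches the dropout layer inside it:

```python
import torch
import torch.nn as nn

# Hypothetical toy model, used only for illustration.
model = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# Assigning the attribute directly changes only the top-level module.
model.training = False
flag_only = (model.training, model[1].training)

# model.eval() is equivalent to model.train(False) and recurses into children.
model.eval()
after_eval = (model.training, model[1].training)

# In evaluation mode dropout is the identity, so repeated calls agree.
x = torch.ones(2, 4)
with torch.no_grad():
    same_in_eval = torch.equal(model(x), model(x))

print(flag_only, after_eval, same_in_eval)
```

Here flag_only shows the dropout layer still in train mode after the bare assignment, which is why model.eval() is the safer call.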

while

for param in model.parameters():
    param.requires_grad = False

freezes the parameters: autograd no longer computes gradients for them, so the optimizer does not update them.

The basic idea is that every model has a method model.children() that returns an iterator over its immediate submodules (layers). Each layer holds parameters (weights and biases), which you can get by calling .parameters() on any child. Every parameter has an attribute called requires_grad (note the "s" in requires), which is True by default. True means autograd computes a gradient for that parameter during the backward pass, so to freeze a layer you set requires_grad to False for all of its parameters.
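A minimal sketch of freezing a single layer this way, assuming a hypothetical two-layer model (the layer sizes and the SGD learning rate are placeholders, not from the question). It also shows that freezing leaves the train/eval flag alone, so the two settings are independent:

```python
import torch
import torch.nn as nn

# Hypothetical two-layer model, used only for illustration.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze only the first child module (the first Linear layer).
first_layer = next(model.children())
for param in first_layer.parameters():
    param.requires_grad = False

# Freezing does not touch the train/eval flag.
still_training = model.training

# Run one backward pass and inspect which parameters received gradients.
loss = model(torch.randn(3, 4)).sum()
loss.backward()
frozen_has_no_grad = all(p.grad is None for p in first_layer.parameters())
last_layer_has_grad = all(p.grad is not None for p in model[2].parameters())

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

print(still_training, frozen_has_no_grad, last_layer_has_grad, len(trainable))
```

Filtering the optimizer's parameter list is optional, since parameters without gradients are skipped anyway, but it makes the intent explicit.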

Upvotes: 3
