Rex Low

Reputation: 2157

requires_grad of params is True even with torch.no_grad()

I am experiencing a strange problem with PyTorch today.

When checking the network parameters inside the with torch.no_grad() block, I expect requires_grad to be False, but apparently this is not the case unless I explicitly set it on all the params myself.

Code

Link to Net -> Gist

import torch

# InceptionResnetV2 is defined in the Gist linked above
net = InceptionResnetV2()

with torch.no_grad():
    for name, param in net.named_parameters():
        print("{} {}".format(name, param.requires_grad))

The above code prints that all the params still require grad, unless I explicitly set param.requires_grad = False on each of them.
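By explicitly setting them myself, I mean something like this (a minimal sketch):

# Manually turning the flag off on every parameter does work:
for param in net.parameters():
    param.requires_grad = False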

My torch version: 1.0.1.post2

Upvotes: 2

Views: 2903

Answers (1)

cantordust

Reputation: 1612

torch.no_grad() disables gradient tracking for the results of operations involving tensors that have requires_grad set to True; it does not change the requires_grad attribute of those tensors themselves. So consider the following:

import torch

net = torch.nn.Linear(4, 3)
input_t = torch.randn(4)

with torch.no_grad():
    for name, param in net.named_parameters():
        print("{} {}".format(name, param.requires_grad))

    out = net(input_t)

    print('Output: {}'.format(out))
    print('Output requires gradient: {}'.format(out.requires_grad))
    print('Gradient function: {}'.format(out.grad_fn))

This prints

weight True
bias True
Output: tensor([-0.3311,  1.8643,  0.2933])
Output requires gradient: False
Gradient function: None

If you remove the with torch.no_grad(): block, you get

weight True
bias True
Output: tensor([ 0.5776, -0.5493, -0.9229], grad_fn=<AddBackward0>)
Output requires gradient: True
Gradient function: <AddBackward0 object at 0x7febe41e3240>

Note that in both cases the module parameters have requires_grad set to True. However, in the first case the out tensor has no gradient function associated with it, whereas in the second case it does.
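To see the difference in practice, you can try to backpropagate through out in both cases (a minimal sketch reusing the net and input_t defined above):

# Inside no_grad(): out has no grad_fn, so there is nothing to backpropagate through.
with torch.no_grad():
    out = net(input_t)
try:
    out.sum().backward()
except RuntimeError as e:
    print('Cannot backprop: {}'.format(e))

# Outside no_grad(): the graph is recorded and gradients reach the parameters.
out = net(input_t)
out.sum().backward()
print(net.weight.grad.shape)  # torch.Size([3, 4])

If you want to actually freeze the parameters, you have to set requires_grad to False on each of them, as you note in the question; torch.no_grad() alone won't do that.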

Upvotes: 5
