Reputation: 2157
I am experiencing a strange problem with PyTorch today.
When checking network parameters inside the with torch.no_grad() scope, I expect requires_grad to be False, but apparently this is not the case unless I explicitly set it on all params myself.
Code
Link to Net -> Gist
import torch

net = InceptionResnetV2()  # class defined in the linked gist
with torch.no_grad():
    for name, param in net.named_parameters():
        print("{} {}".format(name, param.requires_grad))
The above code tells me all the params still require grad, unless I explicitly specify param.requires_grad = False.
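For reference, the explicit workaround I mean looks roughly like this (a minimal sketch; InceptionResnetV2 is the class from the linked gist):

net = InceptionResnetV2()  # class from the linked gist

# Only flipping the flag on every parameter makes the check below print False
for param in net.parameters():
    param.requires_grad = False

for name, param in net.named_parameters():
    print("{} {}".format(name, param.requires_grad))  # now prints False for all params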
My torch version: 1.0.1.post2
Upvotes: 2
Views: 2903
Reputation: 1612
torch.no_grad() disables gradient tracking for the results of operations involving tensors that have requires_grad set to True; it does not change the requires_grad flag of the tensors themselves. So consider the following:
import torch

net = torch.nn.Linear(4, 3)
input_t = torch.randn(4)

with torch.no_grad():
    for name, param in net.named_parameters():
        print("{} {}".format(name, param.requires_grad))

    out = net(input_t)
    print('Output: {}'.format(out))
    print('Output requires gradient: {}'.format(out.requires_grad))
    print('Gradient function: {}'.format(out.grad_fn))
This prints
weight True
bias True
Output: tensor([-0.3311, 1.8643, 0.2933])
Output requires gradient: False
Gradient function: None
If you remove with torch.no_grad(), you get
weight True
bias True
Output: tensor([ 0.5776, -0.5493, -0.9229], grad_fn=<AddBackward0>)
Output requires gradient: True
Gradient function: <AddBackward0 object at 0x7febe41e3240>
Note that in both cases the module parameters have requires_grad set to True, but in the first case the out tensor doesn't have a gradient function associated with it, whereas in the second case it does.
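So if the goal is to actually freeze the parameters (rather than just skip graph construction for a particular forward pass), flip the flag on the parameters themselves. A minimal sketch with the same Linear module:

import torch

net = torch.nn.Linear(4, 3)

# Freezing the parameters in place is what changes param.requires_grad,
# independently of any torch.no_grad() block.
for param in net.parameters():
    param.requires_grad_(False)

for name, param in net.named_parameters():
    print("{} {}".format(name, param.requires_grad))  # weight False, bias False

out = net(torch.randn(4))
print(out.requires_grad)  # False even outside no_grad()
print(out.grad_fn)        # None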
Upvotes: 5