Reputation: 15
import torch
import math
# Create Tensors to hold input and outputs.
x = torch.linspace(-math.pi, math.pi, 2000)
y = torch.sin(x)
# For this example, the output y is a linear function of (x, x^2, x^3), so
# we can consider it as a linear layer neural network. Let's prepare the
# tensor (x, x^2, x^3).
p = torch.tensor([1, 2, 3])
xx = x.unsqueeze(-1).pow(p)
model = torch.nn.Sequential(
    torch.nn.Linear(3, 1),
    torch.nn.Flatten(0, 1)
)
loss_fn = torch.nn.MSELoss(reduction='sum')
learning_rate = 1e-6
Then I print the weights:
parameters = list(model.parameters())
print(parameters)
Results:
[Parameter containing:
tensor([[ 0.0407, 0.2680, -0.1148]], requires_grad=True), Parameter containing:
tensor([-0.0132], requires_grad=True)]
y_pred = model(xx)
loss = loss_fn(y_pred, y)
model.zero_grad()
loss.backward()
Updating the weights:
with torch.no_grad():
    for param in model.parameters():
        param -= 1e-6 * param.grad
and then
list(model.parameters())
[Parameter containing:
tensor([[ 0.0532, 0.2472, -0.0393]], requires_grad=True),
Parameter containing:
tensor([-0.0167], requires_grad=True)]
The weights were updated, and that confused me. How is that possible? I thought only the param variable in the for loop changed, not model.parameters().
But when I change the code a bit:
with torch.no_grad():
    for param in model.parameters():
        param -= 1e-6
the weights didn't change. So I guess it's related to param.grad. Can someone explain that to me?
Upvotes: 1
Views: 5526
Reputation: 114786
The param variable inside the loop references each element of model.parameters(). Thus, updating param is the same as updating the elements of model.parameters().
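Here is a minimal sketch, assuming the model defined in your question, to show that parameters() hands back the model's actual Parameter objects rather than copies, so an in-place update through param is visible in the model:

# parameters() yields the model's own Parameter tensors, not copies.
params_a = list(model.parameters())
params_b = list(model.parameters())
print(params_a[0] is params_b[0])   # True: the same object each time

with torch.no_grad():
    params_a[0] -= 1.0              # in-place update through the alias
print(next(model.parameters()))     # the model's weight reflects the change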
As for your second example, I think decrementing by 1e-6 is just not enough for you to see the effect. Try param -= 1. and see if this has any effect on model.parameters().
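A quick check, again assuming the model from the question: subtracting 1e-6 does change the weights, but the default 4-decimal printout hides it, which may be why the parameters looked unchanged:

# The update happens; it is just too small to show at default print precision.
before = next(model.parameters()).detach().clone()
with torch.no_grad():
    for param in model.parameters():
        param -= 1e-6
after = next(model.parameters()).detach()

torch.set_printoptions(precision=10)
print(before)
print(after)                         # differs from `before` around the 6th decimal place
print((before - after).abs().max())  # roughly tensor(1.0000e-06)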
Upvotes: 1