Ali Hasan Khan
Ali Hasan Khan

Reputation: 25

model.parameters() not updating in Linear Regression with Pytorch

I'm a newbie in Deep Learning with Pytorch. I am using the Housing Prices dataset from Kaggle here. I tried sampling with first 50 rows. But the model.parameters() is not updating as I perform the training. Can anyone help?

import torch
import numpy as np
from torch.utils.data import TensorDataset
import torch.nn as nn
from torch.utils.data import DataLoader
import torch.nn.functional as F

inputs = np.array(label_X_train[:50])
targets = np.array(train_y[:50])

# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets)
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

model = nn.Linear(10, 1)
# Define Loss func
loss_fn = F.mse_loss
# Optimizer
opt = torch.optim.SGD(model.parameters(), lr = 1e-5)


num_epochs = 100
model.train()         
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:

        # 1. Generate predictions
        pred = model(xb.float())

        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
    
        # 3. Compute gradients
        loss.backward()

        # 4. Update parameters using gradients
        opt.step()

        # 5. Reset the gradients to zero
        opt.zero_grad()
     
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch +
                                                   1, num_epochs, 
                                                   loss.item()))  

Upvotes: 0

Views: 2225

Answers (1)

Youyoun
Youyoun

Reputation: 338

The weight does update, but you weren't capturing it correctly. model.weight.data is a torch tensor, but the name of the variable is just a reference, so setting w = model.weight.data does not create a copy but another reference to the object. Hence changing model.weight.data would change w too.

So by setting w = model.weight.data and w_new = model.weight data in different part of the loops means you're assigning two reference to the same object making their value equal at all time.

In order to assess that the model weight are changing, either print(model.weight.data) before and after the loop (since you got one linear layer of 10 parameters it's still okay to do that) or simply set w = model.weight.data.clone(). In that case your output will be:

tensor([[False, False, False, False, False, False, False, False, False, False]])

Here's an example that shows you that your weights are changing:

import torch
import numpy as np
from torch.utils.data import TensorDataset
import torch.nn as nn
from torch.utils.data import DataLoader
import torch.nn.functional as F

inputs = np.random.rand(50, 10)
targets = np.random.randint(0, 2, 50)

# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets.squeeze())
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

model = nn.Linear(10, 1)
# Define Loss func
loss_fn = F.mse_loss
# Optimizer
opt = torch.optim.SGD(model.parameters(), lr = 1e-1)


num_epochs = 100
model.train()
w = model.weight.data.clone()         
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:

        # 1. Generate predictions
        pred = model(xb.float())

        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
    
        # 3. Compute gradients
        loss.backward()

        # 4. Update parameters using gradients
        opt.step()

        # 5. Reset the gradients to zero
        opt.zero_grad()

    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch +
                                                   1, num_epochs, 
                                                   loss.item()))
print(w == model.weight.data)

Upvotes: 2

Related Questions