rwallace

Reputation: 33649

PyTorch gives incorrect results due to broadcasting

I want to run some neural net experiments with PyTorch, but a minimal test case is giving wrong answers. The test case sets up a simple neural network with two input variables and an output variable that is just the sum of the inputs, and tries to learn it as a regression problem; I expect it to converge to zero mean squared error, but it actually converges to 0.165. It's probably because of the issue alluded to in the warning message; how can I fix it?

Code:

import torch
import torch.nn as nn

# data
Xs = []
ys = []
n = 10
for i in range(n):
    i1 = i / n
    for j in range(n):
        j1 = j / n
        Xs.append([i1, j1])
        ys.append(i1 + j1)

# torch tensors
X_tensor = torch.tensor(Xs)
y_tensor = torch.tensor(ys)

# hyperparameters
in_features = len(Xs[0])
hidden_size = 100
out_features = 1
epochs = 500

# model
class Net(nn.Module):
    def __init__(self, hidden_size):
        super(Net, self).__init__()
        self.L0 = nn.Linear(in_features, hidden_size)
        self.N0 = nn.ReLU()
        self.L1 = nn.Linear(hidden_size, 1)

    def forward(self, x):
        x = self.L0(x)
        x = self.N0(x)
        x = self.L1(x)
        return x


model = Net(hidden_size)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

# train
print("training")
for epoch in range(1, epochs + 1):
    # forward
    output = model(X_tensor)
    cost = criterion(output, y_tensor)

    # backward
    optimizer.zero_grad()
    cost.backward()
    optimizer.step()

    # print progress
    if epoch % (epochs // 10) == 0:
        print(f"{epoch:6d} {cost.item():10f}")
print()

output = model(X_tensor)
cost = criterion(output, y_tensor)
print("mean squared error:", cost.item())

Output:

training
C:\Users\russe\Anaconda3\envs\torch2\lib\site-packages\torch\nn\modules\loss.py:445: UserWarning: Using a target size (torch.Size([100])) that is different to the input size (torch.Size([100, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
    50   0.167574
   100   0.165108
   150   0.165070
   200   0.165052
   250   0.165039
   300   0.165028
   350   0.165020
   400   0.165013
   450   0.165009
   500   0.165006

mean squared error: 0.1650056540966034

And the message:

UserWarning: Using a target size (torch.Size([100])) that is different to the input size (torch.Size([100, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

Upvotes: 2

Views: 8827

Answers (1)

nablag

Reputation: 176

You'd have to be a bit more specific about which tensor you mean (X or y), but you can reshape a tensor with the Tensor.view() method.

For example:

Y_tensor = torch.tensor(Ys)
print(Y_tensor.shape)
>> torch.Size([5])

new_shape = (len(Ys), 1)
Y_tensor = Y_tensor.view(new_shape)
print(Y_tensor.shape)
>> torch.Size([5, 1])
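
Applied to the code in the question, a minimal sketch of the fix would be to reshape y_tensor into a column vector of shape [100, 1] before computing the loss, so it matches the model output (view and unsqueeze are equivalent here):

# make the target a column vector so it matches the model output shape
y_tensor = torch.tensor(ys).view(len(ys), 1)

# equivalent alternative: add a trailing dimension with unsqueeze
# y_tensor = torch.tensor(ys).unsqueeze(1)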

However, I'm skeptical that this broadcasting behavior is why you're having accuracy issues.

Upvotes: 3
