PyTorch does not converge when approximating square function with linear model

Question

I'm trying to learn some PyTorch and am referencing this discussion here

The author provides a minimum working piece of code that illustrates how you can use PyTorch to solve for an unknown linear function that has been polluted with random noise.

This code runs fine for me.

However, when I change the function such that I want t = X^2, the parameter does not seem to converge.

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

# Let's make some data for a linear regression.
A = 3.1415926
b = 2.7189351
error = 0.1
N = 100 # number of data points

# Data
X = Variable(torch.randn(N, 1))

# (noisy) Target values that we want to learn.
t = X * X + Variable(torch.randn(N, 1) * error)

# Creating a model, making the optimizer, defining loss
model = nn.Linear(1, 1)
optimizer = optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.MSELoss()

# Run training
niter = 50
for _ in range(0, niter):
    optimizer.zero_grad()
    predictions = model(X)
    loss = loss_fn(predictions, t)
    loss.backward()
    optimizer.step()

    print("-" * 50)
    # .item() extracts a Python float; loss.data[0] fails on 0-dim tensors in current PyTorch
    print("error = {}".format(loss.item()))
    print("learned A = {}".format(model.weight.item()))
    print("learned b = {}".format(model.bias.item()))

When I execute this code, the learned A and b values are seemingly random, so it does not appear to converge. I expected it to converge because I thought any function could be approximated with a slope and an offset. My theory is that I'm using PyTorch incorrectly.

Can anyone identify a problem with my t = X * X + Variable(torch.randn(N, 1) * error) line of code?

Shai · Accepted Answer

You cannot fit a 2nd-degree polynomial with a linear function. The best a line can do is a rough fit to whichever random samples of the polynomial you happened to draw, so the learned parameters change from run to run.
What you can do instead is provide two inputs, x and x^2, and fit from them:

model = nn.Linear(2, 1)  # you have 2 inputs now
X_input = torch.cat((X, X**2), dim=1)  # have 2 inputs per entry
# ...

    predictions = model(X_input)  # 2 inputs -> 1 output
    loss = loss_fn(predictions, t)
    # ...
    # learning t = c*x^2 + a*x + b
    print("learned a = {}".format(list(model.parameters())[0].data[0, 0]))
    print("learned c = {}".format(list(model.parameters())[0].data[0, 1])) 
    print("learned b = {}".format(list(model.parameters())[1].data[0]))
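
For reference, here is a minimal self-contained sketch that puts the fragments above together and compares the two setups. It assumes a recent PyTorch version where plain tensors replace Variable. The fit helper, the fixed seed, and the niter=2000 setting are illustrative choices of mine, not part of the original code.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # fixed seed for repeatability (an assumption, not in the original)

N = 100       # number of data points
error = 0.1   # noise scale, as in the question

# Data and noisy quadratic targets, as in the question
X = torch.randn(N, 1)
t = X * X + torch.randn(N, 1) * error


def fit(inputs, targets, niter=2000, lr=0.05):
    """Hypothetical helper: train nn.Linear with full-batch SGD, return model and final loss."""
    model = nn.Linear(inputs.shape[1], 1)
    optimizer = optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(niter):
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return model, loss.item()


# Case 1: a line fitted directly to x -> x^2 (the setup in the question)
linear_model, linear_loss = fit(X, t)

# Case 2: the same linear layer, but fed both x and x^2 as inputs
X_input = torch.cat((X, X ** 2), dim=1)
quad_model, quad_loss = fit(X_input, t)

# Learning t = c*x^2 + a*x + b
a, c = quad_model.weight.data[0].tolist()
b = quad_model.bias.item()

print("loss with input x only:    {}".format(linear_loss))
print("loss with inputs x and x^2: {}".format(quad_loss))
print("learned a = {}, c = {}, b = {}".format(a, c, b))
```

With the x^2 feature available, the loss should drop to roughly the noise level, with c close to 1 and a and b close to 0. With x alone, the loss stays high no matter how long you train, because no choice of slope and offset can follow the curve.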
