Why is my parameter not changing and its gradient 0?

Question

I am building a really simple model to learn the parameter in a Poisson model and I am not sure where I am going wrong. I am using pytorch.nn and doing the following.

I made some really simple fake data

# This is the value I am trying to estimate

x = torch.tensor(2.0)


# This is a value drawn from the Poisson(x) distribution 
# In this example it is 4

y = torch.poisson(x).reshape(1)

Then I just set up a really simple model

# I initialised the parameter that is going to estimate x with a random value (0.2) 
# and set that it requires a gradient

a = torch.tensor([0.2], requires_grad = True)


# I define the loss function with log_input set to false

loss_function = torch.nn.PoissonNLLLoss(log_input = False)


# Defined the model

def model(a):
    return torch.poisson(a)

# And the parameter to be optimised 
# I chose SGD arbitrarily, maybe this is the problem?

optimizer = torch.optim.SGD([a], lr = 0.1)

Then I do iterations to update a

for i in range(2000):
    
    # Forward pass

    y_pred = model(a)


    # Compute the loss

    loss = loss_function(y_pred, y)
    

    # Backprop

    optimizer.zero_grad()
    
    loss.backward()
    
    
    # Update parameters

    optimizer.step()

The problem is after this the a is still 0.2 and if I call a.grad it is 0. Where am I going wrong?

Thanks in advance

UPDATE

I have tried instead to initiate a class for the model inheriting a nn.Module. However the same problem persists :

class learning_model(nn.Module):
    
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.rand(1))
        self.a.requires_grad = True
        
    def forward(self):
        return torch.poisson(self.a)

model = learning_model()

loss_function = nn.PoissonNLLLoss(log_input = False)

optimizer = torch.optim.SGD(model.parameters(), lr = 0.1)

print(model.a)

Outputs:

Parameter containing:
tensor([0.1402], requires_grad=True)

Then:

for i in range(20):
    
    # Forward pass
    y_pred = model()
    
    # Compute the loss
    loss = loss_function(y_pred, y)
    
    # Backprop
    optimizer.zero_grad()
    
    loss.backward()    
    
    # Update parameters
    optimizer.step()
    
print(model.a, '
 gradient:', model.a.grad)

Outputs:

Parameter containing:
tensor([0.1402], requires_grad=True) 
 gradient: tensor([0.])

Why is my parameter not changing and its gradient 0?

Answers (1)

Related Questions