Alcarin

Reputation: 13

Weights not updating on my neural net (PyTorch)

I'm completely new to neural nets, so I tried to roughly follow some tutorials to create a neural net that can just distinguish whether a given binary picture contains a white circle or is all black. So I generated 1000 arrays of size 10000, each representing a 100x100 picture, with half of them containing a white circle somewhere. The generation of my dataset looks like this:

import numpy as np
from random import random

IMAGE_SIZE = 100
dataset = []

for i in range(1000):
    image = [0] * (IMAGE_SIZE * IMAGE_SIZE)  # flat, all-black 100x100 image

    if random() < 0.5:
        dataset.append([image, [[0]]])  # label 0: all black

    else:
        #inserts circle in image
        #...

        dataset.append([image, [[1]]])  # label 1: contains a white circle

np.random.shuffle(dataset)
np.save("testdataset.npy", dataset)

The double list around the classifications is because the net seemed to give that format as an output, so I matched that.
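The exact way the circle is inserted shouldn't matter for the question; a hypothetical version (with a made-up insert_circle helper and radius) could look roughly like this:

from random import randint

def insert_circle(image, radius=10):
    # hypothetical helper: fill a disc of white (1) pixels at a random position
    cx = randint(radius, IMAGE_SIZE - radius - 1)
    cy = randint(radius, IMAGE_SIZE - radius - 1)
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                image[y * IMAGE_SIZE + x] = 1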

Now, since I don't really have any precise idea of how PyTorch works, I don't really know which parts of the code are relevant for solving my problem and which aren't. Therefore, I included the code for the net and the training below and really hope that someone can explain to me where I went wrong. I'm sorry if it's too much code. The code runs without errors, but if I print the parameters before and after training, they haven't changed in any way, and the net always just returns a 0 for every image/array.

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

IMAGE_SIZE = 100
EPOCHS = 3
BATCH_SIZE = 50
VAL_PCT = 0.1

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(IMAGE_SIZE * IMAGE_SIZE, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 1)
        
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return F.log_softmax(x, dim = 1)
    
net = Net()
optimizer = optim.Adam(net.parameters(), lr = 0.01)
loss_function = nn.MSELoss()
dataset = np.load("testdataset.npy", allow_pickle = True)

X = torch.Tensor([i[0] for i in dataset]).view(-1, 10000)
y = torch.Tensor([i[1] for i in dataset])

val_size = int(len(X) * VAL_PCT)

train_X = X[:-val_size]
train_y = y[:-val_size]

test_X = X[-val_size:]
test_y = y[-val_size:]

for epoch in range(EPOCHS):
    for i in range(0, len(train_X), BATCH_SIZE):
        batch_X = train_X[i:i + BATCH_SIZE].view(-1, 1, 10000)
        batch_y = train_y[i:i + BATCH_SIZE]

        net.zero_grad()

        outputs = net(batch_X)
        loss = loss_function(outputs, batch_y)
        loss.backward()
        optimizer.step()
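(By "printing the parameters" I mean a check roughly along these lines, run before and after the loop above:)

# capture a copy of the parameters before training ...
params_before = [p.clone() for p in net.parameters()]

# ... train as above, then compare:
for before, after in zip(params_before, net.parameters()):
    print(torch.equal(before, after))  # True for every tensor, i.e. nothing changed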

Upvotes: 1

Views: 1683

Answers (2)

Victor Zuanazzi

Reputation: 1984

Instead of net.zero_grad() I would recommend using optimizer.zero_grad(), as it's more common and the de facto standard. Your training loop should be:

for epoch in range(EPOCHS):
    for i in range(0, len(train_X), BATCH_SIZE):
        batch_X = train_X[i:i + BATCH_SIZE].view(-1, 1, 10000)
        batch_y = train_y[i:i + BATCH_SIZE]

        optimizer.zero_grad()

        outputs = net(batch_X)
        loss = loss_function(outputs, batch_y)
        loss.backward()
        optimizer.step()

I would recommend reading a bit about different loss functions. It seems you have a classification problem; for that you should use a loss on the logits, such as BCEWithLogitsLoss (binary classification) or cross-entropy (multi-class). I would make the following changes to the network and loss function:

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(IMAGE_SIZE * IMAGE_SIZE, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 1)
        
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return x
    
loss_function = nn.BCEWithLogitsLoss()

Check the documentation before using it: https://pytorch.org/docs/stable/nn.html#bcewithlogitsloss
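For completeness, a minimal sketch of how training and prediction could look with this loss, assuming the tensors from your question (note that the targets must be floats of shape (batch, 1) and the raw logits go straight into the loss):

loss_function = nn.BCEWithLogitsLoss()

for epoch in range(EPOCHS):
    for i in range(0, len(train_X), BATCH_SIZE):
        batch_X = train_X[i:i + BATCH_SIZE].view(-1, 10000)  # (batch, 10000); the extra middle dim isn't needed
        batch_y = train_y[i:i + BATCH_SIZE].view(-1, 1)      # float targets of shape (batch, 1)

        optimizer.zero_grad()
        logits = net(batch_X)                  # raw logits, no softmax in forward()
        loss = loss_function(logits, batch_y)
        loss.backward()
        optimizer.step()

# at evaluation time, map logits to probabilities and hard predictions
with torch.no_grad():
    probs = torch.sigmoid(net(test_X.view(-1, 10000)))
    preds = (probs > 0.5).float()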

Good luck!

Upvotes: 3

Nivesh Gadipudi

Reputation: 506

  1. First, it is not ideal to use neural networks to address this kind of problem; neural networks shine on highly non-linear data. For this example, you can just use the average intensity of the image to find out whether a white pixel is present or not (see the sketch after this list).

  2. However, as posed it is a classic logistic regression problem, whose output should be a single value between 0 and 1, i.e. a probability.

  3. The softmax function is used when you have multiple classes; it normalizes the outputs so that they sum to 1.

  4. log_softmax computes log(exp(x_i) / exp(x).sum()). Here your output layer consists of only 1 neuron, so the softmax over it is always 1 and log_softmax always returns 0. That is why outputs = net(batch_X) is constant and the weights never update.
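A hedged sketch of points 1 and 4, assuming the numpy/torch imports from the question (contains_circle is just a hypothetical helper name):

# point 1: no network needed, an all-black image has mean 0, one with a white circle has mean > 0
def contains_circle(image):
    return float(np.mean(image)) > 0

# point 4: log_softmax over a single output neuron is always log(1) = 0
x = torch.randn(5, 1)              # batch of 5, one logit each
print(F.log_softmax(x, dim=1))     # all zeros, no matter what x is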

Upvotes: 0
