BillyJohnPeters

Reputation: 13

PyTorch neural network (probably) does not learn

My homework is to train a network on a given data set of 3000 images of frogs, cats and dogs. The network I built doesn't seem to improve at all. Why is that?

The training data x_train is a numpy ndarray of shape (3000,32,32,3).

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable

class Netz(nn.Module):
    def __init__(self):
        super(Netz, self).__init__()
        self.conv1 = nn.Conv2d(3, 28, 5)    # 3x32x32 -> 28x28x28
        self.conv2 = nn.Conv2d(28, 100, 5)  # 28x14x14 -> 100x10x10
        self.fc1 = nn.Linear(2500, 120)     # 100 * 5 * 5 = 2500 after pooling
        self.fc2 = nn.Linear(120, 3)        # three classes: frog, cat, dog

    def forward(self, x):
        x = self.conv1(x)
        x = F.max_pool2d(x, 2)  # 28x28 -> 14x14
        x = F.relu(x)
        x = self.conv2(x)
        x = F.max_pool2d(x, 2)  # 10x10 -> 5x5
        x = F.relu(x)
        x = x.view(-1, 2500)    # flatten to (batch, 2500)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)


model = Netz()
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.8)

def train(epoch):
    model.train()
    avg_loss = 0
    correct = 0
    criterion = F.nll_loss
    for i in range(len(x_train)):
        optimizer.zero_grad()
        x = torch.tensor(x_train[i])
        x = x.permute(2, 0, 1)  # HWC -> CHW, as Conv2d expects
        x = Variable(x)
        x = x.unsqueeze(0)      # add batch dimension: (1, 3, 32, 32)
        target = Variable(torch.Tensor([y_train[i]]).type(torch.LongTensor))
        out = model(x)
        loss = criterion(out, target)
        avg_loss += loss
        pred = out.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()
        loss.backward()
        optimizer.step()
        if i % 64 == 0:
            print("epoch ", epoch, " [", i, "/", len(x_train), "] average loss: ", avg_loss.item() / 64, " correct: ", correct, "/64")
            avg_loss = 0
            correct = 0

I expect the average loss to decrease over time, but it just keeps fluctuating around the same value...

Upvotes: 1

Views: 1104

Answers (1)

Rex Low

Reputation: 2167

A fluctuating loss means your network is not powerful enough to extract meaningful embeddings. I can recommend trying a few of these things:

  1. Add more layers.
  2. Use a smaller learning rate.
  3. Use a larger dataset, or fine-tune a pre-trained model if you only have a small one (see the second sketch below).
  4. Normalize your dataset.
  5. Shuffle the training set (points 2, 4 and 5 are shown together in the first sketch below).
  6. Play with the hyperparameters.
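For points 2, 4 and 5, here is a minimal sketch of a revised training setup. It assumes x_train holds raw uint8 pixel values in HWC order and y_train holds integer class labels, as in your question; the batch size, learning rate and epoch count are illustrative, not tuned values:

import torch
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# Convert to float and scale to [0, 1] -- raw uint8 pixels (0..255)
# produce huge activations and the network will not train well.
x = torch.tensor(x_train, dtype=torch.float32).permute(0, 3, 1, 2) / 255.0
# Normalize each channel to roughly zero mean and unit variance.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
std = x.std(dim=(0, 2, 3), keepdim=True)
x = (x - mean) / std
y = torch.tensor(y_train, dtype=torch.long)

# shuffle=True reshuffles the samples every epoch, and mini-batches
# give a much smoother gradient signal than single-sample updates.
loader = DataLoader(TensorDataset(x, y), batch_size=64, shuffle=True)

model = Netz()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # smaller lr

for epoch in range(10):
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        loss = F.nll_loss(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()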
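For point 3, a sketch of fine-tuning a pre-trained model from torchvision (the torchvision dependency and the resnet18 choice are my assumptions, not part of your setup):

import torch.nn as nn
from torchvision import models

# Start from ImageNet weights and replace the classification head
# with a 3-class output for frog/cat/dog.
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 3)

# Optionally freeze the backbone and train only the new head,
# which helps when only ~3000 images are available.
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False

Note that resnet18 downsamples aggressively, so 32x32 inputs are usually upscaled (e.g. to 224x224) before being fed in.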

Upvotes: 1
