AI_NA

Reputation: 346

Weird results when training AlexNet

I’m trying to train an AlexNet model on my dataset. The task is multiclass classification (15 classes), and I am wondering why I am getting very low accuracy. I tried different learning rates, but the accuracy has not improved.

Here is the snippet for the training.

import time

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

# model, device, and dataLoaders are defined elsewhere in the full script
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
#optimizer = optim.Adam(model.parameters(), lr=1e-2)  # also tried 1e-3, 1e-8

def train_valid_model():
    num_epochs = 5

    since = time.time()
    out_loss = open("history_loss_AlexNet_exp1.txt", "w")
    out_acc = open("history_acc_AlexNet_exp1.txt", "w")

    losses = []
    ACCes = []

    for epoch in range(num_epochs):  # loop over the dataset multiple times
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 50)

        if epoch % 10 == 9:  # checkpoint every 10th epoch
            torch.save({
                'epoch': epoch + 1,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'loss': loss
            }, 'AlexNet_exp1_epoch{}.pth'.format(epoch + 1))

        for phase in ['train', 'valid', 'test']:
            if phase == 'train':
                model.train()
            else:
                model.eval()

            train_loss = 0.0
            total_train = 0
            correct_train = 0

            for t_image, target, image_path in dataLoaders[phase]:
                t_image = t_image.to(device)
                target = target.to(device)

                optimizer.zero_grad()

                # track gradients only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(t_image)
                    outputs = F.softmax(outputs, dim=1)
                    loss = criterion(outputs, target)
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                _, predicted = torch.max(outputs.data, 1)
                train_loss += loss.item() * t_image.size(0)
                correct_train += (predicted == target).sum().item()

            epoch_loss = train_loss / len(dataLoaders[phase].dataset)
            losses.append(epoch_loss)

            epoch_acc = 100 * correct_train / len(dataLoaders[phase].dataset)
            ACCes.append(epoch_acc)

            print('{} Loss: {:.4f} {} Acc: {:.4f}'.format(phase, epoch_loss, phase, epoch_acc))

This is the output for the first two epochs:

Epoch 0/4
train Loss: 2.7026 train Acc: 17.2509
valid Loss: 2.6936 valid Acc: 28.7632
test Loss: 2.6936 test Acc: 28.7632

Epoch 1/4
train Loss: 2.6425 train Acc: 17.8019
valid Loss: 2.6357 valid Acc: 28.7632
test Loss: 2.6355 test Acc: 28.7632

Upvotes: 1

Views: 153

Answers (1)

prosti

Reputation: 46381

Just a basic tip that may help you get started:

import torchvision.models as models
alexnet = models.alexnet(pretrained=True)

When using AlexNet you may want to start from the pretrained model; I haven't seen that in your code. Since you need just 15 classes, make sure you remove the fully connected layer at the very end and add a new fc layer with 15 outputs.

Your alexnet looks like this:

AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
    (2): ReLU(inplace)
    (3): Dropout(p=0.5)
    (4): Linear(in_features=4096, out_features=4096, bias=True)
    (5): ReLU(inplace)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)

So you only need to replace the classifier (6) layer; a sketch follows below. I think how to remove fc6 has been answered here as well.
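
A minimal sketch of that replacement, assuming the torchvision AlexNet printed above:

import torch.nn as nn
import torchvision.models as models

alexnet = models.alexnet(pretrained=True)
# swap the final 1000-way classifier layer for a new 15-way one;
# in_features=4096 matches the (6) layer in the printout above
alexnet.classifier[6] = nn.Linear(4096, 15)

The pretrained convolutional features are kept; only the new layer starts from random weights, so it is the part that most needs training.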

Note that this applies to multi-label classification (where several labels can be active at once), not to your 15-class multiclass setup: for multi-label prediction the last layer in the model should pass its outputs through a sigmoid, and the training process should use a binary cross-entropy function such as nn.BCELoss.
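
For completeness, a minimal sketch of that multi-label setup, with dummy feature and label tensors standing in for a real batch:

import torch
import torch.nn as nn

# hypothetical multi-label head: 4096 features in, 15 independent labels out
head = nn.Linear(4096, 15)
criterion = nn.BCELoss()

features = torch.randn(8, 4096)                 # dummy batch of features
targets = torch.randint(0, 2, (8, 15)).float()  # multi-hot label vectors

probs = torch.sigmoid(head(features))           # per-label probabilities in (0, 1)
loss = criterion(probs, targets)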

Upvotes: 0
