Reputation: 346
I’m trying to train an AlexNet model on my dataset. The task is multiclass classification (15 classes). I am wondering why I am getting very low accuracy. I tried different learning rates, but the accuracy has not improved.
Here is the snippet for the training.
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
#optimizer = optim.Adam(model.parameters(), lr=1e-2) # 1e-3, 1e-8
def train_valid_model():
    num_epochs=5
    since = time.time()
    out_loss = open("history_loss_AlexNet_exp1.txt", "w")
    out_acc = open("history_acc_AlexNet_exp1.txt", "w")
    losses=[]
    ACCes =[]
    #losses = {}
    for epoch in range(num_epochs): # loop over the dataset multiple times
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 50)
        if epoch % 10 == 9:
            torch.save({
                'epoch': epoch + 1,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'loss': loss
            }, 'AlexNet_exp1_epoch{}.pth'.format(epoch+1))
        for phase in ['train', 'valid', 'test']:
            if phase == 'train':
                model.train()
            else:
                model.eval()
            train_loss = 0.0
            total_train = 0
            correct_train = 0
            for t_image, target, image_path in dataLoaders[phase]:
                #print(t_image.size())
                #print(target)
                t_image = t_image.to(device)
                target = target.to(device)
                optimizer.zero_grad()
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(t_image)
                    outputs = F.softmax(outputs, dim=1)
                    loss = criterion(outputs,target)
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()
                _, predicted = torch.max(outputs.data, 1)
                train_loss += loss.item()* t_image.size(0)
                correct_train += (predicted == target).sum().item()
            epoch_loss = train_loss / len(dataLoaders[phase].dataset)
            #losses[phase] = epoch_loss
            losses.append(epoch_loss)
            epoch_acc = 100 * correct_train / len(dataLoaders[phase].dataset)
            ACCes.append(epoch_acc)
            print('{} Loss: {:.4f} {} Acc: {:.4f}'.format(phase, epoch_loss, phase, epoch_acc))
This is the output for two epochs:
train Loss: 2.7026 train Acc: 17.2509 valid Loss: 2.6936 valid Acc: 28.7632 test Loss: 2.6936 test Acc: 28.7632
train Loss: 2.6425 train Acc: 17.8019 valid Loss: 2.6357 valid Acc: 28.7632 test Loss: 2.6355 test Acc: 28.7632
Upvotes: 1
Views: 153
Reputation: 46381
Just a basic tip that may help you get started:
import torchvision.models as models
alexnet = models.alexnet(pretrained=True)
When using AlexNet, you may want to start from the pretrained model; I haven't seen that in your code. If you need just 15 classes, make sure you remove the fully connected layer at the very end and add a new fc layer with 15 outputs.
Your alexnet looks like this:
AlexNet(
  (features): Sequential(
    (0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): ReLU(inplace)
    (2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (4): ReLU(inplace)
    (5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
    (6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace)
    (8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (9): ReLU(inplace)
    (10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (11): ReLU(inplace)
    (12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
  (classifier): Sequential(
    (0): Dropout(p=0.5)
    (1): Linear(in_features=9216, out_features=4096, bias=True)
    (2): ReLU(inplace)
    (3): Dropout(p=0.5)
    (4): Linear(in_features=4096, out_features=4096, bias=True)
    (5): ReLU(inplace)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
)
So you only need to remove the classifier (6) layer, the final Linear layer with 1000 outputs, and put a 15-output layer in its place. I think how to remove the fc6 has been answered here.
For multi-label classification, the last layer in the model should use a sigmoid function for label prediction, and the training process should use binary_crossentropy function or nn.BCELoss.
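A rough sketch of how these loss choices differ in PyTorch, using random tensors as a stand-in for real model outputs (all tensor names here are made up for illustration). Note that the question describes single-label multiclass classification, where `nn.CrossEntropyLoss` is the usual choice; that loss applies log-softmax internally and expects raw scores, so it may be worth checking whether the extra `F.softmax` in the question's loop is what keeps the loss near ln(15) ≈ 2.708:

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
num_classes = 15
logits = torch.randn(4, num_classes)  # stand-in for model(t_image)

# Single-label multiclass: targets are class indices, inputs are raw logits.
class_targets = torch.tensor([0, 3, 7, 14])
criterion = nn.CrossEntropyLoss()
loss_from_logits = criterion(logits, class_targets)

# Applying softmax first squashes the inputs into [0, 1], so the loss stays
# within roughly one unit of log(num_classes) whatever the model predicts.
loss_from_probs = criterion(F.softmax(logits, dim=1), class_targets)
print(loss_from_logits.item(), loss_from_probs.item(), math.log(num_classes))

# Multi-label: targets are 0/1 float vectors, one entry per class.
multi_targets = torch.randint(0, 2, (4, num_classes)).float()
bce_probs = nn.BCELoss()(torch.sigmoid(logits), multi_targets)
# BCEWithLogitsLoss fuses the sigmoid into the loss and takes raw logits.
bce_logits = nn.BCEWithLogitsLoss()(logits, multi_targets)
print(bce_probs.item(), bce_logits.item())
```

The two binary cross-entropy values should agree up to floating-point error; `nn.BCEWithLogitsLoss` is generally the more numerically stable of the two.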
Upvotes: 0