Reputation: 3
I am trying to do transfer learning with PyTorch pretrained models on a custom dataset. I have been able to perform transfer learning successfully with SqueezeNet.
For SqueezeNet, my classifier was (layers source):
model.classifier = nn.Sequential(
    nn.Dropout(p=0.2),
    nn.Conv2d(512, len(class_names), kernel_size=1),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d((1, 1)))
For EfficientNet, my classifier was (layers source):
model.classifier = torch.nn.Sequential(
    torch.nn.Dropout(p=0.2, inplace=True),
    torch.nn.Linear(in_features=1280,
                    out_features=output_shape,
                    bias=True))
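(As an aside: rather than hard-coding 1280, the width can be read off the pretrained head. A minimal sketch, assuming torchvision's efficientnet_b0, whose classifier is the Dropout + Linear pair above, with output_shape standing in for the class count:)

import torchvision
from torch import nn

output_shape = 3  # placeholder: number of classes in the custom dataset

weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
model = torchvision.models.efficientnet_b0(weights=weights)

# Read in_features (1280 for efficientnet_b0) off the pretrained
# Linear instead of hard-coding it.
in_features = model.classifier[1].in_features
model.classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True),
    nn.Linear(in_features, output_shape))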
I have been trying to do the same for MaxViT. I went through the source and saw that block_channels[-1] is used as a parameter. I have only recently started with this, and I don't know what it is (layers source):
self.classifier = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.LayerNorm(block_channels[-1]),
    nn.Linear(block_channels[-1], block_channels[-1]),
    nn.Tanh(),
    nn.Linear(block_channels[-1], num_classes, bias=False),
)
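(One way to see what block_channels[-1] works out to is to instantiate the model and print its head; a minimal sketch, assuming torchvision's maxvit_t:)

import torchvision

# Printing the head shows the Sequential above with the concrete
# sizes filled in, i.e. a LayerNorm and Linears with the actual
# channel count rather than the block_channels[-1] expression.
model = torchvision.models.maxvit_t()
print(model.classifier)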
For reference, if needed, the following is my complete code for performing transfer learning with SqueezeNet:
weights = torchvision.models.SqueezeNet1_0_Weights.DEFAULT
model = torchvision.models.squeezenet1_0(weights=weights).to(device)
auto_transforms = weights.transforms()

train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=d1,
    test_dir=d2,
    transform=auto_transforms,
    batch_size=32)

for param in model.features.parameters():
    param.requires_grad = False

torch.manual_seed(42)
torch.cuda.manual_seed(42)

output_shape = len(class_names)
model.classifier = nn.Sequential(
    nn.Dropout(p=0.2),
    nn.Conv2d(512, len(class_names), kernel_size=1),
    nn.ReLU(inplace=True),
    nn.AdaptiveAvgPool2d((1, 1))).to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

torch.manual_seed(42)
torch.cuda.manual_seed(42)

results = engine.train(model=model,
                       train_dataloader=train_dataloader,
                       test_dataloader=test_dataloader,
                       optimizer=optimizer,
                       loss_fn=loss_fn,
                       epochs=15,
                       device=device)
What should my classifier be for MaxViT?
Upvotes: 0
Views: 32
Reputation: 520
From the MaxVit args (Source):
block_channels (List[int]): Number of channels in each block.
And the classifier itself (Source):
self.classifier = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.LayerNorm(block_channels[-1]),
    nn.Linear(block_channels[-1], block_channels[-1]),
    nn.Tanh(),
    nn.Linear(block_channels[-1], num_classes, bias=False),
)
Since block_channels is a list, block_channels[-1] returns the last item in the list: 512 in the following case (Source):
return _maxvit(
    stem_channels=64,
    block_channels=[64, 128, 256, 512],
    block_layers=[2, 2, 5, 2],
    head_dim=32,
    stochastic_depth_prob=0.2,
    partition_size=7,
    weights=weights,
    progress=progress,
    **kwargs,
)
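So for maxvit_t, block_channels[-1] is 512. Putting it together, a minimal sketch of swapping in a head for your own classes, mirroring the original Sequential above (output_shape is a placeholder for your class count; classifier[5] is the last Linear of the pretrained head shown above):

import torch
import torchvision
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"
output_shape = 3  # placeholder: number of classes in your dataset

weights = torchvision.models.MaxVit_T_Weights.DEFAULT
model = torchvision.models.maxvit_t(weights=weights).to(device)

# Freeze the backbone so only the new head is trained.
for param in model.parameters():
    param.requires_grad = False

# block_channels[-1] == 512 for maxvit_t; read it off the pretrained
# head's last Linear rather than hard-coding it.
in_features = model.classifier[5].in_features
model.classifier = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.LayerNorm(in_features),
    nn.Linear(in_features, in_features),
    nn.Tanh(),
    nn.Linear(in_features, output_shape, bias=False),
).to(device)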
Upvotes: 0