Reputation: 677
I have a CNN with numerous convolutional layers. To each of these convolutional layers I have attached a classifier to check the outputs of the intermediate layers. After a loss has been produced for each of these classifiers, I want to update the weights of the classifiers without touching the weights of the convolutional layers. This code:
for i in range(len(loss_per_layer)):
    loss_per_layer[i].backward(retain_graph=True)
    self.classifiers[i].weight.data -= self.learning_rate * self.alpha[i] * self.classifiers[i].weight.grad.data
    self.classifiers[i].bias.data -= self.learning_rate * self.alpha[i] * self.classifiers[i].bias.grad.data
allows me to do so if each classifier consists of a single nn.Linear layer. However, my classifiers have the shape:
self.classifiers.append(nn.Sequential(
    nn.Linear(int(feature_map * input_shape[1] * input_shape[2]), 100),
    nn.ReLU(True),
    nn.Dropout(),
    nn.Linear(100, self.num_classes),
    nn.Sigmoid(),
))
How can I update the weights of the Sequential block without touching the rest of the network? I have recently switched from Keras to PyTorch, so I am unsure exactly how to use the optimizer.step() function for this situation, but I suspect it can be done with it.
Please note, I need a generic solution for a Sequential block of any shape, as it will change in future iterations of the model.
Any help is much appreciated.
Upvotes: 0
Views: 539
Reputation: 16480
You can implement your model as below:
class Model(nn.Module):
    def __init__(self, conv_layers, classifier):
        super().__init__()
        self.conv_layers = conv_layers
        self.classifier = classifier

    def forward(self, x):
        x = self.conv_layers(x)
        return self.classifier(x)
When declaring the optimizer, pass only the parameters that you want to be updated.
model = Model(conv_layers, classifier)
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=lr)
Now when you run
loss.backward()
optimizer.step()
model.zero_grad()
only the classifier parameters will be updated.
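To make this concrete for the asker's multi-classifier setup, here is a minimal runnable sketch. The sizes (two conv blocks, 32x32 RGB input, 10 classes) and the BCELoss choice are illustrative assumptions, not taken from the original code; the point is simply that only the classifiers are handed to the optimizer, so the convolutional weights are never changed by optimizer.step().

import torch
import torch.nn as nn

# A toy version of the asker's setup: two conv blocks, each with its own
# Sequential classifier attached to the intermediate output.
conv_layers = nn.ModuleList([
    nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU()),
    nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.ReLU()),
])
classifiers = nn.ModuleList([
    nn.Sequential(
        nn.Linear(8 * 32 * 32, 100),
        nn.ReLU(True),
        nn.Dropout(),
        nn.Linear(100, 10),
        nn.Sigmoid(),
    )
    for _ in range(2)
])

# The optimizer only receives the classifier parameters, so optimizer.step()
# never modifies the convolutional weights.
optimizer = torch.optim.Adam(classifiers.parameters(), lr=1e-3)
criterion = nn.BCELoss()

inputs = torch.randn(4, 3, 32, 32)
targets = nn.functional.one_hot(torch.randint(0, 10, (4,)), 10).float()

x = inputs
loss = torch.tensor(0.0)
for conv, clf in zip(conv_layers, classifiers):
    x = conv(x)
    loss = loss + criterion(clf(x.flatten(1)), targets)

loss.backward()        # gradients flow through everything...
optimizer.step()       # ...but only the classifiers are updated
optimizer.zero_grad()  # clears the gradients of the classifier parameters
conv_layers.zero_grad()  # the conv layers are not in the optimizer, but they
                         # still accumulated .grad; clear them as well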
EDIT: After the OP's comment, I am adding the following for more generic use cases.
A more generic scenario
class Model(nn.Module):
    def __init__(self, modules):
        super().__init__()
        # Supposing you have multiple modules declared like below.
        # You can also keep them in a list or dict; for that,
        # see nn.ModuleList or nn.ModuleDict in PyTorch.
        self.module0 = modules[0]
        self.module1 = modules[1]
        # ..... and so on

    def forward(self, x):
        # implement the forward pass, e.g.:
        x = self.module0(x)
        return self.module1(x)
# model and optimizer declarations
model = Model(modules)

# assuming we want to update module0 and module1
optimizer = torch.optim.Adam([
    {'params': model.module0.parameters()},
    {'params': model.module1.parameters()},
], lr=lr)
# You can also provide a different learning rate for each module.
# See the documentation: https://pytorch.org/docs/stable/optim.html
# when training
loss.backward()
optimizer.step()
model.zero_grad()
# Use model.zero_grad() to remove the gradients computed for all the modules;
# optimizer.zero_grad() only removes the gradients of the parameters that were passed to it.
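As the comment above mentions, each parameter group can carry its own learning rate, which maps nicely onto the per-classifier alpha scaling in the question. A small sketch, continuing the snippet above (base_lr and alpha are placeholder names standing in for self.learning_rate and self.alpha):

base_lr = 1e-3        # placeholder base learning rate
alpha = [1.0, 0.5]    # placeholder per-module scaling factors

# Each group gets its own 'lr'; groups without one fall back to the default.
optimizer = torch.optim.Adam([
    {'params': model.module0.parameters(), 'lr': base_lr * alpha[0]},
    {'params': model.module1.parameters(), 'lr': base_lr * alpha[1]},
])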
Upvotes: 2
Reputation: 40768
If you are using a built-in (or custom) torch.optim.Optimizer, then you don't need to perform the parameter update by hand. You can simply freeze the layer(s) you don't want to update (by deactivating their requires_grad flag). Calling .backward() on your loss and then optimizer.step() will only update the classifier.
Depending on your torch.nn.Module model architecture, you can do something like this:
for param in model.feature_extractor.parameters():
param.requires_grad = False
Where model.feature_extractor would be the part of your model containing the convolutional layers (i.e. the feature extractor). You can loop over any module this way; .parameters() iterates over all of that module's parameters, including those of its children, so each one has its requires_grad flag deactivated.
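Applied to the question's setup, a minimal sketch might look like the following; model.feature_extractor and loss_per_layer are assumed to exist as in the question and the loop above, and Adam is just an example choice of optimizer:

# Freeze the convolutional part so its parameters receive no gradients.
for param in model.feature_extractor.parameters():
    param.requires_grad = False

# Passing only the still-trainable parameters is a common idiom; parameters
# whose .grad stays None would be skipped by the optimizer anyway.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

loss = sum(loss_per_layer)  # combine the per-classifier losses from the question
loss.backward()
optimizer.step()
optimizer.zero_grad()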
Upvotes: 0