Reputation: 109
I have a CNN Model which has the following architecture:
class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(4, 32, (8, 8), 4)
        self.conv2 = nn.Conv2d(32, 64, (4, 4), 2)
        self.conv3 = nn.Conv2d(64, 64, (3, 3), 1)
        self.dense = nn.Linear(4*4*64, 512)
        self.out = nn.Linear(512, 18)
I am training it using a certain optimizer. I then want to use the learned parameters from this first model to initialize a second model of the exact same architecture (as opposed to using, say, Xavier initialization). I understand that initialization is normally done with model_object.apply(initialization_function), but what would be the most efficient way to initialize a new model with the parameters learned by another model?
Upvotes: 1
Views: 565
Reputation: 4037
If you want to load model1's parameters into model2, I believe this would work:
model2.load_state_dict(model1.state_dict())
See an example of something similar in the official PyTorch transfer learning tutorial.
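For completeness, here is a minimal sketch of the whole flow, reusing the Model class from the question. The names same_values and shares_memory are only illustrative, and model1 here is untrained and merely stands in for your trained model. As far as I know, load_state_dict copies the values into model2's existing tensors, so the two models should not share storage afterwards.

```python
import torch
import torch.nn as nn


class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(4, 32, (8, 8), 4)
        self.conv2 = nn.Conv2d(32, 64, (4, 4), 2)
        self.conv3 = nn.Conv2d(64, 64, (3, 3), 1)
        self.dense = nn.Linear(4 * 4 * 64, 512)
        self.out = nn.Linear(512, 18)


# Assumption: model1 stands in for the already-trained model.
model1 = Model()
# model2 starts with PyTorch's default initialization.
model2 = Model()

# Copy every parameter and buffer from model1 into model2.
model2.load_state_dict(model1.state_dict())

# Check that all parameter values now match.
same_values = all(
    torch.equal(p1, p2)
    for p1, p2 in zip(model1.parameters(), model2.parameters())
)

# Check whether the two models point at the same underlying memory.
shares_memory = (
    model1.conv1.weight.data_ptr() == model2.conv1.weight.data_ptr()
)

print(same_values, shares_memory)
```

Because the values are copied rather than shared, continuing to train model2 should leave model1 untouched, so no apply(...) call or custom initialization function is needed for this use case.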
Upvotes: 1