Reputation: 3224
The example given here uses two separate optimizers, one for the encoder
and one for the decoder. Why? And when should one do it like that?
Upvotes: 1
Views: 3295
Reputation: 5255
If you have multiple networks (in the sense of multiple objects that inherit from nn.Module
), you have to do this for a simple reason: when constructing a torch.optim.Optimizer
object, it takes the parameters that should be optimized as an argument. In your case:
encoder_optimizer = optim.Adam(encoder.parameters(), lr=learning_rate)
decoder_optimizer = optim.Adam(decoder.parameters(), lr=learning_rate)
This also gives you the freedom to vary settings such as the learning rate independently per network. If you don't need that, you could create a new class inheriting from nn.Module
that contains both networks, encoder and decoder, or create a combined set of parameters to give to a single optimizer, as explained here:
nets = [encoder, decoder]
parameters = set()
for net in nets:
    parameters |= set(net.parameters())
where |
is the union operator for sets in this context.
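To make the single-optimizer route concrete, here is a minimal sketch. The encoder and decoder are stand-in nn.Linear layers (assumed shapes, not from the original example), and itertools.chain is used instead of a set so the parameter order stays deterministic. Parameter groups are shown as a middle ground that keeps per-network learning rates in one optimizer.

```python
import torch
from torch import nn, optim
from itertools import chain

# Placeholder networks standing in for the encoder and decoder
# (hypothetical shapes, chosen only for illustration).
encoder = nn.Linear(10, 5)
decoder = nn.Linear(5, 10)

# Option 1: one optimizer over the combined parameters.
joint_optimizer = optim.Adam(
    chain(encoder.parameters(), decoder.parameters()), lr=1e-3
)

# Option 2: one optimizer, but separate learning rates via
# parameter groups -- keeps the per-network flexibility.
grouped_optimizer = optim.Adam([
    {"params": encoder.parameters(), "lr": 1e-3},
    {"params": decoder.parameters(), "lr": 1e-4},
])

# A single training step now updates both networks at once.
x = torch.randn(4, 10)
loss = ((decoder(encoder(x)) - x) ** 2).mean()
joint_optimizer.zero_grad()
loss.backward()
joint_optimizer.step()
```

With two optimizers you would instead call zero_grad() and step() on each one; functionally the results are equivalent as long as every parameter is covered exactly once.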
Upvotes: 5