Reputation: 41
I know that it is possible to freeze layers in a network, for example to train only the last layers of a pre-trained model. However, I want to know whether there is any way to assign different learning rates to different layers. For example, in PyTorch it would be:
import torch

# paras is a dict mapping each layer group to its list of parameters
optimizer = torch.optim.Adam([
    {'params': paras['conv1'], 'lr': learning_rate / 10},
    {'params': paras['middle'], 'lr': learning_rate / 3},
    {'params': paras['fc'], 'lr': learning_rate}
], lr=learning_rate)
The interfaces of Gluon and PyTorch are pretty much the same. Any idea how I can do this in Gluon?
Upvotes: 2
Views: 182
Reputation: 131
You can adjust the learning rate of each layer by setting its lr_mult attribute: the effective learning rate of a parameter is the trainer's learning rate multiplied by that parameter's lr_mult. You can inspect the current multipliers like this:
for key, value in model.collect_params().items():
    print(key, value.lr_mult)
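A minimal sketch of setting per-layer multipliers before creating the trainer. The network, the layer name prefixes (conv0, dense0, ...) and the multiplier values are only assumptions for illustration; the prefixes depend on how your own blocks are named.
from mxnet import gluon, init
from mxnet.gluon import nn

# Hypothetical network; parameter names will look like conv0_weight, dense0_bias, ...
net = nn.Sequential()
net.add(nn.Conv2D(16, kernel_size=3, activation='relu'),
        nn.Dense(64, activation='relu'),
        nn.Dense(10))
net.initialize(init.Xavier())

base_lr = 0.001

# Effective learning rate = trainer's learning rate * parameter's lr_mult
for name, param in net.collect_params().items():
    if 'conv' in name:
        param.lr_mult = 0.1   # conv layers train 10x slower
    elif 'dense0' in name:
        param.lr_mult = 0.3   # middle layer
    else:
        param.lr_mult = 1.0   # last layer uses the base rate

trainer = gluon.Trainer(net.collect_params(), 'adam',
                        {'learning_rate': base_lr})
You can also set a whole group at once by selecting parameters with a regex, e.g. net.collect_params('.*conv.*').setattr('lr_mult', 0.1) (MXNet 1.x Gluon).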
Upvotes: 3