Reputation: 12266
I am asking this question because VGG19 (unlike VGG16, for example) contains batch-normalization layers.
I am trying to train a Faster-RCNN network with a VGG19 backbone in Caffe.
I did not change anything regarding the lr_mult values of the convolutional layers. In the prototxt file, the convolutional layers (conv1_1 and so on) have non-zero lr_mult values, while the batch-normalization layers (named like conv1_1/bn) have their lr_mult values set to 0.
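For illustration, the layer definitions in question look roughly like this in the prototxt (names follow the conv1_1 / conv1_1/bn convention; the exact lr_mult values and convolution parameters shown here are illustrative):
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param { lr_mult: 1 }   # weights: non-zero, so they are still updated
  param { lr_mult: 2 }   # bias: non-zero, so it is still updated
  convolution_param { num_output: 64 kernel_size: 3 pad: 1 }
}
layer {
  name: "conv1_1/bn"
  type: "BatchNorm"
  bottom: "conv1_1"
  top: "conv1_1"
  param { lr_mult: 0 }   # all three BatchNorm blobs are frozen
  param { lr_mult: 0 }
  param { lr_mult: 0 }
}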
Does the fact that the batch-normalization layers are frozen mean that the convolutional layers are frozen as well? Or should I also set lr_mult to 0 in the layers named convX_X?
Update: After running another training process with the lr_mult of all the convolutional layers set to 0, the training time dropped dramatically, which implies that the answer is that lr_mult also needs to be set to 0 in the convX_X layers.
Upvotes: 1
Views: 1120
Reputation: 5084
To properly freeze convolutional layers with batchnorm in Caffe, you should:
1. Set the convolution layer's lr_mult param to 0.
2. Set the BatchNorm layer's lr_mult params to 0 and set use_global_stats to true:
layer {
  name: "bn1"
  type: "BatchNorm"
  bottom: "pool1"
  top: "bn1"
  batch_norm_param {
    # use the accumulated mean/variance instead of mini-batch statistics
    use_global_stats: true
  }
  # BatchNorm stores three internal blobs (mean, variance, moving-average
  # factor); set lr_mult to 0 for all of them
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
  param {
    lr_mult: 0
  }
}
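The convolution layer itself is frozen the same way; a minimal sketch with illustrative layer names and convolution parameters (zeroing decay_mult as well is optional, since lr_mult: 0 already disables the update):
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 0     # freeze the weights
    decay_mult: 0
  }
  param {
    lr_mult: 0     # freeze the bias
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
  }
}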
Does the fact that the batch-normalization layers are frozen mean that the convolutional layers are frozen as well?
Of course not. However, by using the propagate_down param you can achieve this effect: How do I prevent backward computation in specific layers in caffe.
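A minimal sketch of that approach, assuming you want to stop backward computation below a hypothetical conv3_1 layer: Caffe's LayerParameter takes one propagate_down entry per bottom blob, and setting it to false prevents gradients from flowing into that bottom.
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  # one entry per bottom blob: false means no gradient is propagated
  # back into "pool2"; combined with lr_mult: 0 on the layers below,
  # their backward pass is skipped entirely
  propagate_down: false
  param { lr_mult: 1 }
  param { lr_mult: 2 }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
  }
}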
Upvotes: 2