Reputation: 10463
I am looking to train a large face identification network (ResNet or VGG-16/19) with TensorFlow 1.14.
My question is: if I run out of GPU memory, is it a valid strategy to train sets of layers one at a time?
For example, train two convolutional layers and a max-pooling layer as one set, then freeze those weights somehow and train the next set, and so on.
I know I can train on multiple GPUs in TensorFlow, but what if I want to stick to just one GPU?
Upvotes: 0
Views: 71
Reputation: 1051
I may be wrong, but even if you freeze your weights, they still need to be loaded into memory, because you need to run the whole forward pass in order to compute the loss.
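To put rough numbers on this, here is a back-of-the-envelope sketch in plain Python. It assumes the standard VGG-16 layout (224x224 input, 1000-class top), float32 values, and an Adam-style optimizer that keeps two extra buffers per trainable parameter; none of that comes from the question, so adjust it for your own network. It only counts parameter-related memory. Activation memory, which grows with batch size and is often the larger share, is not modelled here.

```python
# Assumed standard VGG-16 layout: thirteen 3x3 conv layers, then three dense layers.
CONV_CHANNELS = [(3, 64), (64, 64), (64, 128), (128, 128),
                 (128, 256), (256, 256), (256, 256),
                 (256, 512), (512, 512), (512, 512),
                 (512, 512), (512, 512), (512, 512)]
DENSE_SIZES = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]
BYTES_PER_FLOAT32 = 4


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


conv_total = sum(conv_params(i, o) for i, o in CONV_CHANNELS)
dense_total = sum(dense_params(i, o) for i, o in DENSE_SIZES)
total = conv_total + dense_total


def param_memory_mib(total_params, trainable_params):
    """Rough parameter-side memory in MiB.

    Every weight is stored once, frozen or not. Each trainable weight
    additionally needs a gradient and (assumed, Adam-style) two moment buffers.
    """
    values = total_params + 3 * trainable_params
    return values * BYTES_PER_FLOAT32 / 2**20


everything_trainable = param_memory_mib(total, total)
conv_frozen = param_memory_mib(total, dense_total)
all_frozen = param_memory_mib(total, 0)

print("parameters:", total)
print("all layers trainable: %.0f MiB" % everything_trainable)
print("conv layers frozen:   %.0f MiB" % conv_frozen)
print("weights only:         %.0f MiB" % all_frozen)
```

Under these assumptions, freezing removes the gradient and optimizer buffers for the frozen layers, but the weights themselves stay resident, so the saving is partial rather than total.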
Comments on this are appreciated.
Upvotes: 0
Reputation: 2074
The usual approach is transfer learning: take a pretrained model and fine-tune it for your task.
For fine-tuning in computer vision, a common approach is to retrain only the last few layers. See for example:
https://www.learnopencv.com/keras-tutorial-fine-tuning-using-pre-trained-models/
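A minimal sketch of that idea with `tf.keras`, not taken from the linked tutorial: it freezes all but the last few layers of a VGG16 base and adds a small classifier on top. The names `NUM_IDENTITIES`, `build_finetune_model` and `trainable_tail`, as well as the 256-unit dense layer, are made up for illustration. `weights=None` keeps the sketch offline; for real fine-tuning you would pass `weights="imagenet"` (or load face-specific weights) so that there is something pretrained to start from.

```python
import tensorflow as tf

NUM_IDENTITIES = 100  # hypothetical number of people to recognise


def build_finetune_model(num_classes, weights=None, trainable_tail=4):
    """Build a VGG16-based classifier where only the tail of the base is trainable."""
    base = tf.keras.applications.VGG16(
        weights=weights,            # use "imagenet" for actual transfer learning
        include_top=False,          # drop the original 1000-class classifier
        input_shape=(224, 224, 3),
    )
    # Freeze every base layer except the last `trainable_tail` ones.
    for layer in base.layers[:-trainable_tail]:
        layer.trainable = False

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation="relu"),   # illustrative size
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    # Compile after setting `trainable`, so the setting is picked up.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return base, model


base, model = build_finetune_model(NUM_IDENTITIES)
trainable_names = [layer.name for layer in base.layers if layer.trainable]
print("trainable base layers:", trainable_names)
```

Training then proceeds with `model.fit(...)` as usual. Only the unfrozen layers and the new dense layers receive gradient updates, which reduces gradient and optimizer memory, although the frozen weights are still held on the GPU.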
Upvotes: 1