How do I copy specific layer weights from pretrained models using Tensorflow Keras api?

Question

I'm trying to train a conv net that takes a 4-channel input, and I want to use a pretrained model like VGG16. Since the initial conv blocks of VGG16 were trained on 3-channel inputs, it makes sense not to reuse them and to redefine the initial conv blocks instead.

However, I want to use block3 onwards from VGG16. How do I achieve this using Tensorflow Keras api?

In short, how do I copy weights from specific layers of a pretrained model? I'm using the TensorFlow 2.0 alpha version.

Gabriel Ibagon · Accepted Answer

A quick way to do this is to make a new model that combines your custom input layers with the later layers of VGG16. Find the index of the first VGG16 layer you'd like to keep, and connect it to the output of your newly created layers. Then connect each following VGG16 layer manually, in order, to recreate the VGG16 segment. You can freeze the VGG16 layers along the way.

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D

# Load VGG16 with its pretrained ImageNet weights (the default)
vgg16 = VGG16()

# Find the index of the first block3 layer (raises StopIteration if no layer matches)
index = next(i for i, layer in enumerate(vgg16.layers) if 'block3' in layer.name)

# Add your own input
model_input = Input(shape=(224, 224, 4), name='new_input')
x = Conv2D(...)(model_input)
...

# Connect your last custom layer to the VGG16 model, starting at the first "block3" layer.
# Each remaining VGG16 layer is connected in order and frozen along the way.
for layer in vgg16.layers[index:]:
    # freeze the VGG16 layer
    layer.trainable = False

    # connect the layer
    x = layer(x)

model_output = x
newModel = Model(model_input, model_output)
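
If you would rather copy the weights into a freshly defined layer instead of reusing the VGG16 layer objects, `get_weights()` and `set_weights()` can do that. The following is only a sketch of that alternative, not part of the approach above: the target layer `new_conv` and its name are hypothetical, its configuration is assumed to mirror `block3_conv1`, and `weights=None` is used only so the snippet runs without downloading anything (use `VGG16()` to get the pretrained weights).

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Conv2D, Input
from tensorflow.keras.models import Model

# weights=None keeps this sketch offline; in practice use VGG16() for ImageNet weights
source = VGG16(weights=None)
src_layer = source.get_layer('block3_conv1')

# Hypothetical target layer with the same configuration as block3_conv1
inputs = Input(shape=(56, 56, 128))
new_conv = Conv2D(256, (3, 3), padding='same', activation='relu',
                  name='my_block3_conv1')
target = Model(inputs, new_conv(inputs))  # calling the layer builds its variables

# get_weights() returns a list of NumPy arrays (kernel, bias);
# set_weights() requires the same number of arrays with the same shapes
new_conv.set_weights(src_layer.get_weights())
new_conv.trainable = False  # optional: freeze the copied weights

copied_ok = all(
    np.array_equal(a, b)
    for a, b in zip(src_layer.get_weights(), new_conv.get_weights())
)
```

Unlike reusing the layer object, this gives the new model its own copy of the weights, so later changes to one model do not affect the other.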

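Also make sure that the output of your custom layers matches the shape that the first block3 layer expects as input. In the stock VGG16, block3_conv1 receives the output of block2_pool, which has shape (56, 56, 128) for a 224x224 input.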
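
One way to check the shape requirement is to read the expected input shape from the first block3 layer and compare it with the output of your own layers before connecting them. The sketch below assumes a hypothetical two-convolution stem (the `new_*` layers are placeholders, not a recommendation) and loads VGG16 with `weights=None` only so it runs without a download; use `VGG16()` for the pretrained weights.

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.layers import Conv2D, Input, MaxPooling2D
from tensorflow.keras.models import Model

# weights=None keeps this sketch offline; in practice use VGG16() for ImageNet weights
vgg16 = VGG16(weights=None)

# Index of the first block3 layer and the input shape it expects
index = next(i for i, layer in enumerate(vgg16.layers) if 'block3' in layer.name)
expected_shape = tuple(vgg16.layers[index].input.shape)

# Hypothetical replacement for block1 and block2, accepting 4 channels
model_input = Input(shape=(224, 224, 4), name='new_input')
x = Conv2D(64, (3, 3), padding='same', activation='relu', name='new_conv1')(model_input)
x = MaxPooling2D((2, 2), name='new_pool1')(x)  # 224 -> 112
x = Conv2D(128, (3, 3), padding='same', activation='relu', name='new_conv2')(x)
x = MaxPooling2D((2, 2), name='new_pool2')(x)  # 112 -> 56

# Fail early if the custom stem does not produce what block3 expects
stem_shape = tuple(x.shape)
if stem_shape != expected_shape:
    raise ValueError(f'stem output {stem_shape} != expected {expected_shape}')

# Attach and freeze block3 onwards, as in the answer
for layer in vgg16.layers[index:]:
    layer.trainable = False
    x = layer(x)

new_model = Model(model_input, x)
```

Because the default VGG16 includes the flatten and dense layers, both the channel count and the spatial size have to match; with `include_top=False` only the channel count is fixed by the convolution weights.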
