Reputation: 339
I have a custom model that was initially trained on top of VGG16 using transfer learning. However, it was trained on images with a smaller input size, and now I am using images of a bigger size. I'd therefore like to take the first model and reuse what it has learned, but on a new dataset.
More specifically:
Layer (type) Output Shape Param #
=================================================================
block1_conv1 (Conv2D) (None, 128, 160, 64) 1792
block1_conv2 (Conv2D) (None, 128, 160, 64) 36928
block1_pool (MaxPooling2D) (None, 64, 80, 64) 0
block2_conv1 (Conv2D) (None, 64, 80, 128) 73856
block2_conv2 (Conv2D) (None, 64, 80, 128) 147584
block2_pool (MaxPooling2D) (None, 32, 40, 128) 0
block3_conv1 (Conv2D) (None, 32, 40, 256) 295168
block3_conv2 (Conv2D) (None, 32, 40, 256) 590080
block3_conv3 (Conv2D) (None, 32, 40, 256) 590080
block3_pool (MaxPooling2D) (None, 16, 20, 256) 0
block4_conv1 (Conv2D) (None, 16, 20, 512) 1180160
block4_conv2 (Conv2D) (None, 16, 20, 512) 2359808
block4_conv3 (Conv2D) (None, 16, 20, 512) 2359808
block4_pool (MaxPooling2D) (None, 8, 10, 512) 0
block5_conv1 (Conv2D) (None, 8, 10, 512) 2359808
block5_conv2 (Conv2D) (None, 8, 10, 512) 2359808
block5_conv3 (Conv2D) (None, 8, 10, 512) 2359808
block5_pool (MaxPooling2D) (None, 4, 5, 512) 0
flatten (Flatten) (None, 10240) 0
dense (Dense) (None, 16) 163856
output (Dense) (None, 1) 17
The problem is that this model already includes an input layer of 128x160, and I'd like to change it to 384x288 for transfer learning.
The above is my first model. I would now like to do transfer learning again, but with a different dataset whose images have an input size of 384x288, and I'd like to use a softmax over two classes instead.
So what I want to do is transfer learning from the custom model onto a different dataset: I need to change the input size and retrain the new model with my own data.
How can I do transfer learning on the model above, but with a new dataset and a different classification layer at the output?
Upvotes: 3
Views: 2415
Reputation: 339
I found a very simple solution to my problem, and I am now able to train it with different data and different classification layers:
import tensorflow as tf
from tensorflow.keras.models import load_model, Model, Sequential

old_model = load_model("/content/drive/MyDrive/old_model.h5")
# Remove the classification, dense and flatten layers, keeping only the convolutional blocks
old_model = Model(old_model.input, old_model.layers[-4].output)

# Create a new model from the 2nd layer onwards, i.e. all the convolutional blocks
# without the old 128x160 input layer
base_model = Sequential()
for layer in old_model.layers[1:]:
    base_model.add(layer)

for layer_number, layer in enumerate(base_model.layers):
    print(layer_number, layer.name, layer.trainable)

# Perform transfer learning: new 384x288 input, the old convolutional base,
# and a new head ending in a two-class softmax
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(384, 288, 3)),
    base_model,
    tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(units=2, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
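One thing the snippet above does not do is freeze the pretrained convolutional blocks while the new head is trained. A minimal sketch of that optional step, continuing from the code above and using random placeholder data (an assumption, since the real data pipeline is not shown):
import numpy as np

# Optionally freeze the pretrained convolutional base so only the new head is trained first
base_model.trainable = False

# Recompile after changing `trainable` so the change takes effect
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Placeholder data: replace with your real 384x288 images and one-hot two-class labels
x_train = np.random.random((8, 384, 288, 3)).astype('float32')
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 2, 8), num_classes=2)

model.fit(x_train, y_train, epochs=1, batch_size=4)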
Upvotes: 0
Reputation: 16916
There are many possible solutions for this.
The simplest one, as suggested by many, is to resize your new images down to the model's existing input size. However, with that approach you are not able to take advantage of the higher-resolution images you have.
Another approach is to use the pretrained model purely as a feature extractor and train a separate model on the high-resolution images. Finally, use the features from both the pretrained model and your newly trained model to make the final predictions. The high-level idea is as follows:
import numpy as np
import tensorflow as tf
from tensorflow import keras

low_res_image_size = (150, 150, 3)
high_res_image_size = (320, 240, 3)
n_classes = 4

# Load your pretrained model, trained on low-resolution images
base_model = tf.keras.applications.VGG16(
    include_top=False, weights='imagenet', input_shape=low_res_image_size)
# Freeze the pretrained model
base_model.trainable = False

# Unfrozen model to be trained on high-resolution images
high_res_model = tf.keras.applications.VGG19(
    include_top=False, weights='imagenet', input_shape=high_res_image_size)
high_res_model.trainable = True

# Downscale images to the pretrained model's input size
downscale_layer = tf.keras.layers.Resizing(
    low_res_image_size[0], low_res_image_size[1],
    interpolation='bilinear', crop_to_aspect_ratio=False)

# Create the combined model
inputs = keras.Input(shape=high_res_image_size)
downscaled_inputs = downscale_layer(inputs)
features = base_model(downscaled_inputs, training=False)
features = keras.layers.GlobalAveragePooling2D()(features)
x = high_res_model(inputs, training=True)
x = keras.layers.GlobalAveragePooling2D()(x)
concatted = tf.keras.layers.Concatenate()([features, x])
outputs = keras.layers.Dense(n_classes)(concatted)  # logits, no activation
model = keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Train on some random data
model.fit(
    np.random.random((100, *high_res_image_size)),
    np.random.randint(0, n_classes, 100), epochs=3)
Output:
Epoch 1/3
4/4 [==============================] - 4s 553ms/step - loss: 8.7033
Epoch 2/3
4/4 [==============================] - 2s 554ms/step - loss: 9.0746
Epoch 3/3
4/4 [==============================] - 2s 553ms/step - loss: 9.0746
<keras.callbacks.History at 0x7f559a104650>
As an added step, after the model converges you can also unfreeze all the layers and train the full model again with a very low learning rate. Just keep an eye on overfitting.
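A minimal sketch of that fine-tuning step, continuing from the code above (the 1e-5 learning rate and the reuse of random placeholder data are illustrative choices, not from the answer):
# Unfreeze everything for fine-tuning
base_model.trainable = True
high_res_model.trainable = True

# Recompile with a very low learning rate so the pretrained weights change slowly
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-5),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))

# Continue training (again shown with random placeholder data)
model.fit(
    np.random.random((100, *high_res_image_size)),
    np.random.randint(0, n_classes, 100), epochs=3)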
Upvotes: 1
Reputation: 3074
You can follow these steps: create a new model with the same architecture but the larger 384x288 input, then copy the weights from the old model into it layer by layer, skipping the last few layers so that a new classification head can be added, and freezing the copied layers:
for new_layer, layer in zip(new_model.layers[0:-4], model.layers[0:-4]):
    new_layer.set_weights(layer.get_weights())
    new_layer.trainable = False
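Here `model` is the old 128x160 network and `new_model` is assumed to be a freshly built network with the 384x288 input whose layer list lines up index by index with the old one (otherwise the zip above pairs the wrong layers). A sketch of one way such a `new_model` could be built (the 16-unit dense layer mirrors the summary in the question; the rest of the head is an assumption):
import tensorflow as tf

# Same VGG16 convolutional base, but built for the new 384x288 input
# (weights=None: the weights get copied over from the old model afterwards)
vgg = tf.keras.applications.VGG16(
    include_top=False, weights=None, input_shape=(384, 288, 3))

# New head: flatten + dense, ending in a two-class softmax
x = tf.keras.layers.Flatten()(vgg.output)
x = tf.keras.layers.Dense(16, activation='relu')(x)
outputs = tf.keras.layers.Dense(2, activation='softmax')(x)
new_model = tf.keras.Model(vgg.input, outputs)
After copying and freezing the weights, compile and train `new_model` on the new dataset as usual.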
Upvotes: 2
Reputation: 59
Copy your model to another model (transfer learning), and then update the new model in whatever way you want to use it: change the input size, change activation functions, whatever you need, as sketched below.
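A minimal sketch of that idea, assuming only the convolutional part is copied (the flatten/dense layers are tied to the old spatial size, so they cannot be reused with a 384x288 input); the file path and the new head are placeholders:
import tensorflow as tf

# Load the original 128x160 model (placeholder path)
old_model = tf.keras.models.load_model("old_model.h5")

# Keep only the convolutional part, up to block5_pool (name taken from the summary above)
old_conv_base = tf.keras.Model(old_model.input,
                               old_model.get_layer("block5_pool").output)

# Copy the convolutional base onto a new, larger input and transfer its weights
new_input = tf.keras.Input(shape=(384, 288, 3))
new_conv_base = tf.keras.models.clone_model(old_conv_base, input_tensors=new_input)
new_conv_base.set_weights(old_conv_base.get_weights())

# New head with the activation you want (softmax over two classes here)
x = tf.keras.layers.GlobalAveragePooling2D()(new_conv_base.output)
outputs = tf.keras.layers.Dense(2, activation="softmax")(x)
new_model = tf.keras.Model(new_input, outputs)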
Upvotes: -1