Reputation: 894
There are several examples of taking a pre-trained model in Keras and replacing the top classification layer to retrain the network on a new task, but they use a Sequential model. A Sequential model has the methods model.pop() and model.add(), which make this fairly easy.
However, how is this achieved when using a functional model? That API has no model.add() method.
How can I load a pretrained functional model in Keras, crop the last layer, and replace it with a new one?
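For context, the Sequential workflow referred to above might look like this (a minimal sketch; the layer sizes are illustrative, not from any particular pretrained model):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Stand-in for a pretrained Sequential model (sizes are illustrative).
model = Sequential([
    Input(shape=(64,)),
    Dense(32, activation='relu'),
    Dense(10, activation='softmax'),
])

model.pop()                                            # drop the old head
model.add(Dense(2, activation='softmax', name='fc2'))  # attach a new 2-class head
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
              metrics=['accuracy'])
```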
Current approach so far:
model.load_weights('/my_model_weights.h5')

def pop_layer(model):
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')
    model.layers.pop()
    if not model.layers:
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    model.built = False

# Remove last layer with custom function (from another post)
pop_layer(model)
# Now add a new layer to the model ???
model.add(Dense(2, activation='softmax', name='fc2'))
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
              metrics=['accuracy'])
This fails with:
AttributeError: 'Model' object has no attribute 'add'
Upvotes: 2
Views: 1229
Reputation: 6176
You can use a pretrained functional model, with its last layer removed, as a layer in a new model: think of the model as a "bigger layer". Then define a new model that wraps this "bigger layer" together with a new output layer.
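The "bigger layer" idea works because a Model instance is callable on a tensor, just like any layer. A minimal sketch (the small base model here stands in for a pretrained one):

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# Stand-in for a pretrained model with its last layer already removed.
inp = Input(shape=(64,))
base = Model(inp, Dense(32, activation='relu')(inp))

# Call the whole model on a tensor, as if it were one "bigger layer",
# then stack a new classification layer on top.
new_in = Input(shape=(64,))
out = Dense(2, activation='softmax', name='fc2')(base(new_in))
wrapped = Model(new_in, out)
```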
An example:
import tensorflow as tf
from keras.layers import Dense, Input
from keras.models import Sequential, Model

input_tensor = Input(shape=(64,))
x = Dense(32, activation='relu')(input_tensor)
x = Dense(32, activation='relu')(x)
output_tensor = Dense(10, activation=tf.nn.softmax)(x)
model = Model(input_tensor, output_tensor)
model.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
              metrics=['accuracy'])
print(model.summary())
model.save_weights('my_model_weights.h5')

#
model.load_weights('my_model_weights.h5')

def pop_layer(model):
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')
    model.layers.pop()
    if not model.layers:
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    return model

# Remove last layer with custom function (from another post)
model_old = pop_layer(model)
# Now add a new layer to the model
model_new = Sequential()
model_new.add(model_old)
model_new.add(Dense(2, activation=tf.nn.softmax, name='fc2'))
model_new.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
                  metrics=['accuracy'])
print(model_new.summary())
As a result, you can see that the parameters of the last layer of the pretrained functional model are gone: the wrapped model_1 contributes only 3,136 parameters (2,080 + 1,056) instead of the original 3,466.
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) (None, 64) 0
_________________________________________________________________
dense_1 (Dense) (None, 32) 2080
_________________________________________________________________
dense_2 (Dense) (None, 32) 1056
_________________________________________________________________
dense_3 (Dense) (None, 10) 330
=================================================================
Total params: 3,466
Trainable params: 3,466
Non-trainable params: 0
_________________________________________________________________
None
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
model_1 (Model) multiple 3136
_________________________________________________________________
fc2 (Dense) (None, 2) 66
=================================================================
Total params: 3,202
Trainable params: 3,202
Non-trainable params: 0
_________________________________________________________________
None
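An alternative worth noting: instead of mutating model.layers with pop_layer, the functional API lets you take the output tensor of the second-to-last layer and define a new model from it directly. A sketch (the toy model here stands in for your pretrained one):

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

# Stand-in for a pretrained functional model.
inputs = Input(shape=(64,))
x = Dense(32, activation='relu')(inputs)
x = Dense(32, activation='relu')(x)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs, outputs)

# Crop the last layer by reusing the second-to-last layer's output tensor,
# then attach a fresh 2-class head.
penultimate = model.layers[-2].output
new_out = Dense(2, activation='softmax', name='fc2')(penultimate)
model_new = Model(model.input, new_out)
model_new.compile(loss='sparse_categorical_crossentropy', optimizer='sgd',
                  metrics=['accuracy'])
```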
Upvotes: 2