Reputation: 960
I am interested in updating existing layer parameters in Keras (not removing a layer and inserting a new one instead, rather just modifying existing parameters).
I will give an example of a function I'm writing:
import random

def add_filters(self, model):
    # Indices (into model.layers) of all convolutional layers
    conv_indices = [i for i, layer in enumerate(model.layers) if 'convolution' in layer.get_config()['name']]
    # Pick one of those indices; a randint over len(conv_indices) would index the wrong list
    random_conv_index = random.choice(conv_indices)
    factor = 2
    conv_layer = model.layers[random_conv_index]
    conv_layer.filters = conv_layer.filters * factor
    print('new conv layer filters after transform is:', conv_layer.filters)
    print('just to make sure, its:', model.layers[random_conv_index].filters)
    return model
So what's basically happening here is that I take a random convolutional layer from my network (all my conv layers have 'convolution' in their name) and try to double its number of filters. As far as I know, this shouldn't cause any 'compilation issues' with input/output size compatibility in any case.
The thing is, my model doesn't change at all. The two print-outs I added at the end print the correct number (double the previous number of filters). But when I compile the model and print model.summary(), I still see the previous number of filters.
BTW, I'm not restricted to Keras. If anyone has an idea how to pull this off with PyTorch, for example, I'll also buy it :D
Upvotes: 5
Views: 11189
Reputation: 213
Another solution is to set the layer's attributes again. For instance, if you want to change the kernel initializer of the convolutional layers, here is a small example:
import tensorflow as tf

img_input = tf.keras.Input(shape=(256, 256, 1))
x = tf.keras.layers.Conv2D(64, (7, 7), padding='same', use_bias=False, kernel_initializer=None, name='conv')(img_input)
model = tf.keras.Model(inputs=[img_input], outputs=[x], name='resnext')
for layer in model.layers:
    print(layer.get_config())
Output:
{'batch_input_shape': (None, 256, 256, 1), 'dtype': 'float32', 'sparse': False, 'name': 'input_1'}
{'name': 'conv2d', 'trainable': True, 'dtype': 'float32', 'filters': 64, 'kernel_size': (7, 7), 'strides': (1, 1), 'padding': 'same', 'data_format': 'channels_last', 'dilation_rate': (1, 1), 'activation': 'linear', 'use_bias': False, 'kernel_initializer': None, 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
After setting:
init1 = tf.keras.initializers.TruncatedNormal()
for layer in model.layers:
    if hasattr(layer, 'kernel_initializer'):
        setattr(layer, 'kernel_initializer', init1)
for layer in model.layers:
    print(layer.get_config())
Output:
{'batch_input_shape': (None, 256, 256, 1), 'dtype': 'float32', 'sparse': False, 'name': 'input_1'}
{'name': 'conv2d', 'trainable': True, 'dtype': 'float32', 'filters': 64, 'kernel_size': (7, 7), 'strides': (1, 1), 'padding': 'same', 'data_format': 'channels_last', 'dilation_rate': (1, 1), 'activation': 'linear', 'use_bias': False, 'kernel_initializer': {'class_name': 'TruncatedNormal', 'config': {'mean': 0.0, 'stddev': 0.05, 'seed': None, 'dtype': 'float32'}}, 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}}, 'kernel_regularizer': None, 'bias_regularizer': None, 'activity_regularizer': None, 'kernel_constraint': None, 'bias_constraint': None}
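The kernel initializer in the layer config has been set. Note that the layer's weights were already created when the layer was built, so this only changes the config (for example, for a model rebuilt from it); it does not re-initialize the existing weights.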
Upvotes: 2
Reputation: 33410
Well, if you would like to create the architecture of a new model based on an existing model, though with some modifications, you can use the to_json() and model_from_json() functions. Here is an example:
from keras.models import Sequential
from keras.layers import Conv2D

model = Sequential()
model.add(Conv2D(10, (3, 3), input_shape=(100, 100, 3)))
model.add(Conv2D(40, (3, 3)))
model.summary()
Model summary:
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_12 (Conv2D)           (None, 98, 98, 10)        280
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 96, 96, 40)        3640
=================================================================
Total params: 3,920
Trainable params: 3,920
Non-trainable params: 0
_________________________________________________________________
Now we modify the number of filters of the first layer and create a new model based on the modified architecture:
from keras.models import model_from_json
model.layers[0].filters *= 2
new_model = model_from_json(model.to_json())
new_model.summary()
New model summary:
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_12 (Conv2D)           (None, 98, 98, 20)        560
_________________________________________________________________
conv2d_13 (Conv2D)           (None, 96, 96, 40)        7240
=================================================================
Total params: 7,800
Trainable params: 7,800
Non-trainable params: 0
_________________________________________________________________
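As a sanity check, the parameter counts in the two summaries can be reproduced by hand. The sketch below assumes the usual Conv2D count of kernel_height * kernel_width * input_channels * filters, plus one bias per filter; the helper name conv2d_params is made up for this illustration and is not part of Keras. It also shows why the second layer's count changes even though only the first layer was edited: its input now has 20 channels instead of 10.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Number of trainable parameters of a 2D convolution layer (hypothetical helper)."""
    kh, kw = kernel_size
    weights = kh * kw * in_channels * filters
    return weights + (filters if use_bias else 0)

# Original model: 3 input channels -> 10 filters -> 40 filters
before = [conv2d_params((3, 3), 3, 10), conv2d_params((3, 3), 10, 40)]
# Modified model: 3 input channels -> 20 filters -> 40 filters
after = [conv2d_params((3, 3), 3, 20), conv2d_params((3, 3), 20, 40)]

print(before, sum(before))  # [280, 3640] 3920
print(after, sum(after))    # [560, 7240] 7800
```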
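You can also modify the output of model.to_json() directly without modifying the model instance. Either way, the new model only shares the architecture with the old one: its weights are freshly initialized, and it has to be compiled again.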
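Editing the JSON directly could look roughly like the sketch below. It uses only the standard json module on a hand-written stand-in for model.to_json(), trimmed to the relevant fields, so it runs without Keras. The helper name scale_filters is made up for this example, and the exact JSON layout is an assumption: it differs between Keras versions (some also list an input layer first), so inspect your own model.to_json() output before relying on the indices.

```python
import json

def scale_filters(arch_json, layer_index, factor):
    """Return a new architecture JSON string with one layer's `filters` scaled."""
    arch = json.loads(arch_json)
    config = arch["config"]
    # Assumption: newer Keras versions nest the layer list under config["layers"],
    # while some older versions store the list directly under "config".
    layers = config["layers"] if isinstance(config, dict) else config
    layer_config = layers[layer_index]["config"]
    layer_config["filters"] = int(layer_config["filters"] * factor)
    return json.dumps(arch)

# Hand-written stand-in for model.to_json(); a real one has many more fields
arch_json = json.dumps({
    "class_name": "Sequential",
    "config": {"name": "sequential", "layers": [
        {"class_name": "Conv2D",
         "config": {"name": "conv2d_12", "filters": 10, "kernel_size": [3, 3]}},
        {"class_name": "Conv2D",
         "config": {"name": "conv2d_13", "filters": 40, "kernel_size": [3, 3]}},
    ]},
})

new_arch_json = scale_filters(arch_json, layer_index=0, factor=2)
new_filters = json.loads(new_arch_json)["config"]["layers"][0]["config"]["filters"]
old_filters = json.loads(arch_json)["config"]["layers"][0]["config"]["filters"]
print(old_filters, new_filters)
```

With a real model you would pass model.to_json() as arch_json and hand the result to model_from_json() to build the new model; the original model instance is left untouched.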
If you want to change the weights of a convolution layer rather than its number of filters, you can use the get_weights() method to get the layer's current weights. It returns a list of two NumPy arrays (when the layer uses a bias): the first holds the filter weights and the second holds the bias parameters. Then you can use the set_weights() method to set the new weights:
conv_layer = model.layers[random_conv_index]
weights = conv_layer.get_weights()
weights[0] *= factor # multiply filter weights by `factor`
conv_layer.set_weights(weights)
As a side note, the filters attribute of a convolution layer, which you have used in your code, corresponds to the number of filters in this layer and not their weights.
Upvotes: 10