SteC

Reputation: 891

Sparse training of convolutional layers in Keras

I want to train a CNN in Keras using convolutional layers like

x_1 = Conv2D(16, (kernel_size, kernel_size))(x_in)

Here x_in has 3 input feature channels, so 3*16 = 48 kernels of size kernel_size*kernel_size are trained. Suppose I want 5 of these 48 kernels to be entirely zero (every element). How can I train this efficiently?

Thanks in advance.

My full model configuration looks like this:

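from tensorflow.keras.layers import Input, Conv2D, ReLU
from tensorflow.keras.models import Model

def build_model(kernel_size=3):  # wrapper name and default value are illustrative only
    x_in = Input(shape=(None, None, 3))

    x_1 = Conv2D(16, (kernel_size, kernel_size))(x_in)
    x_1 = ReLU()(x_1)
    x_2 = Conv2D(16, (kernel_size, kernel_size))(x_1)
    x_2 = ReLU()(x_2)
    x_3 = Conv2D(16, (kernel_size, kernel_size))(x_2)
    return Model(inputs=x_in, outputs=x_3)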

Upvotes: 2

Views: 955

Answers (1)

Vivek Mehta

Reputation: 2642

In that case you'll have to implement a custom convolution layer: a class that subclasses Keras's Layer class and implements the call method, which performs the forward pass.

Something like this might be what you are looking for:

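import tensorflow as tf
from tensorflow.keras.layers import Layer

class CustomConv2D(Layer):
    def __init__(self, k=3):
        super(CustomConv2D, self).__init__()

        # Channel 1: 5 frozen all-zero filters and 11 trainable filters.
        self.c1_frozen = self.add_weight(shape=(k, k, 1, 5), initializer='zeros', dtype=tf.float32, trainable=False)
        self.c1_trained = self.add_weight(shape=(k, k, 1, 11), initializer='zeros', dtype=tf.float32, trainable=True)

        # Channels 2 and 3: 16 trainable filters each.
        self.c2 = self.add_weight(shape=(k, k, 1, 16), initializer='zeros', dtype=tf.float32, trainable=True)
        self.c3 = self.add_weight(shape=(k, k, 1, 16), initializer='zeros', dtype=tf.float32, trainable=True)

    def call(self, inputs):
        # Concatenate inside call (not __init__) so gradients reach c1_trained.
        c1 = tf.concat([self.c1_frozen, self.c1_trained], axis=-1)

        # Convolve each input channel separately with its own filter bank.
        x_1_c1 = tf.nn.conv2d(tf.expand_dims(inputs[:, :, :, 0], -1), c1, strides=1, padding='VALID')
        x_1_c2 = tf.nn.conv2d(tf.expand_dims(inputs[:, :, :, 1], -1), self.c2, strides=1, padding='VALID')
        x_1_c3 = tf.nn.conv2d(tf.expand_dims(inputs[:, :, :, 2], -1), self.c3, strides=1, padding='VALID')

        # Stack the three per-channel results along the feature axis (48 maps).
        x_1 = tf.concat([x_1_c1, x_1_c2, x_1_c3], -1)
        return x_1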

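Here we have three sets of filters, 16 per input channel. For the first channel, 5 filters are non-trainable and stay at zero while the remaining 11 are trainable; the 32 filters for channels 2 and 3 are all trainable. Note that the per-channel outputs are concatenated rather than summed, so the layer outputs 48 feature maps instead of the 16 that Conv2D(16, ...) produces.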

This is a subclass of Keras's Layer class and can be used just like any built-in layer.

model = tf.keras.models.Sequential()
model.add(CustomConv2D(3))

model.build(input_shape=(None,None,None,3))
I = tf.keras.Input((None,None,3))
model.call(I)
model.summary()
'''
Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
custom_conv2d_2 (CustomConv2 (None, None, None, 48)    432       
=================================================================
Total params: 432
Trainable params: 387
Non-trainable params: 45
_________________________________________________________________
'''


As you can see, 5 filters are not trained, which gives 45 (3x3x5) non-trainable parameters.
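
The counts in the summary can be reproduced by hand. This is only a small arithmetic sketch; the variable names are chosen here for illustration and do not come from the layer code.

```python
k = 3                     # kernel size passed to CustomConv2D(3)
in_channels = 3           # input feature channels
filters_per_channel = 16  # filters applied to each input channel
frozen_filters = 5        # filters kept at zero (non-trainable)

total = k * k * in_channels * filters_per_channel  # 3*3*3*16 = 432
non_trainable = k * k * frozen_filters             # 3*3*5   = 45
trainable = total - non_trainable                  # 432-45  = 387

print(total, trainable, non_trainable)  # prints: 432 387 45
```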

I have not added a bias term here; you can add one and customize the layer further. You can also change which filters are frozen: set trainable=False on the weights that should stay at zero, and give the weights you want to train a non-zero initializer (for example 'glorot_uniform').
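
If you would rather keep a standard Conv2D with 16 output maps, one possible alternative (a different technique from the custom layer above) is to multiply the kernel by a fixed 0/1 mask through kernel_constraint. The sketch below rests on assumptions: the KernelMask class is a hypothetical helper, the choice of which 5 of the 48 kernels to zero is arbitrary, and it relies on Keras applying constraints after each optimizer update during fit().

```python
import numpy as np
import tensorflow as tf


class KernelMask(tf.keras.constraints.Constraint):
    """Hypothetical helper: re-applies a fixed 0/1 mask to the kernel."""

    def __init__(self, mask):
        self.mask = tf.constant(mask, dtype=tf.float32)

    def __call__(self, w):
        # Called by Keras after each optimizer update of the kernel.
        return w * self.mask


k = 3
# A Conv2D kernel has shape (k, k, in_channels, out_channels) = (3, 3, 3, 16).
mask = np.ones((k, k, 3, 16), dtype="float32")
# Assumed choice: zero the kernels linking input channel 0 to output filters 0..4.
mask[:, :, 0, :5] = 0.0

conv = tf.keras.layers.Conv2D(16, (k, k), kernel_constraint=KernelMask(mask))
conv.build((None, None, None, 3))

# The constraint only runs after updates, so mask the initial weights once by hand.
kernel, bias = conv.get_weights()
conv.set_weights([kernel * mask, bias])

# One short training run on random data to exercise the constraint.
model = tf.keras.Sequential([conv])
model.compile(optimizer="sgd", loss="mse")
x = np.random.rand(4, 8, 8, 3).astype("float32")
y = np.random.rand(4, 6, 6, 16).astype("float32")  # 'valid' padding: 8 - 3 + 1 = 6
model.fit(x, y, epochs=1, verbose=0)

trained_kernel = conv.get_weights()[0]
print(bool(np.all(trained_kernel[:, :, 0, :5] == 0.0)))
```

Unlike the trainable=False approach, the masked entries still count as trainable parameters in model.summary(); they are simply reset to zero after every update.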

Upvotes: 2
