JupyterBoi

Reputation: 39

When building a CNN in TensorFlow, how do I specify which convolutional filter to use?

I know that there are different kinds of convolutional filters depending on the job you want to do, e.g., sharpening, blurring, etc. Is there a specific kind we need to use for image classification?

An example CNN used for image classification is provided on the TensorFlow website:

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))

I realize the convolutional layer is using a 3x3 filter, but how do I know what type of matrix it is? A sharpening one, a blurring one, etc.? Can any 3x3 matrix be used for image classification?

Upvotes: 1

Views: 728

Answers (2)

ashish_2001

Reputation: 71

You cannot specify the type of filter (e.g., a Sobel filter or a Gaussian blur) when initializing a TensorFlow/Keras model. The filter values are weights that are learned as training progresses, so they end up specific to the dataset you use.

However, you can specify how the initial random weight values are sampled using the kernel_initializer parameter of the Conv2D layer. Depending on the data and other factors, this can help the model converge faster or find a better local minimum.
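As a rough sketch of what that looks like, the snippet below builds a single Conv2D layer with the "he_normal" initializer. The choice of initializer and the layer sizes are only assumptions for illustration (they mirror the first layer of the model in the question), not a recommendation for any particular dataset.

```python
import tensorflow as tf

# Illustrative only: "he_normal" is one of several built-in Keras initializers.
# If kernel_initializer is omitted, Conv2D falls back to "glorot_uniform".
conv = tf.keras.layers.Conv2D(
    filters=32,
    kernel_size=(3, 3),
    activation="relu",
    kernel_initializer="he_normal",
)

# Building the layer creates the weights with the chosen initializer.
conv.build(input_shape=(None, 32, 32, 3))

kernel, bias = conv.weights
kernel_shape = tuple(kernel.shape)

# Shape is (kernel_height, kernel_width, input_channels, filters).
print(kernel_shape)
```

The values are still random at this point; the initializer only changes the distribution they are drawn from before training starts.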

Upvotes: 1

Nicolas Gervais

Reputation: 36604

It's not a "filter" in the Instagram sense. It's a small matrix that slides over the input image; at each position, it multiplies the overlapping values element-wise and sums the products. The weights are trainable, so during training they become feature detectors, i.e., they settle on whatever patterns are most useful for the task.
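To make the slide, multiply and sum idea concrete, here is a minimal NumPy sketch for a single-channel image with no padding and a stride of 1. The function name conv2d_valid and the toy 5x5 input are made up for illustration; Keras does this far more efficiently and across many channels and filters at once.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image`, multiplying and summing at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            # Patch of the image currently covered by the kernel.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Toy input: a 5x5 "image" holding the values 0..24.
image = np.arange(25, dtype=np.float32).reshape(5, 5)
# A fixed 3x3 kernel that sums the main diagonal of each patch.
kernel = np.eye(3, dtype=np.float32)

result = conv2d_valid(image, kernel)
print(result.shape)
print(result)
```

In a trained network the kernel values are not chosen by hand like this; they are whatever gradient descent ends up with.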

When you first initialize a convolutional layer, its weights are just random values:

import tensorflow as tf

conv = tf.keras.layers.Conv2D(filters=1, kernel_size=3)
conv.build(input_shape=(None, 28, 28, 1))

weights, biases = conv.weights

print(tf.squeeze(weights))
tf.Tensor(
[[ 0.4269786  -0.07291955  0.31100047]
 [ 0.05227929 -0.19417122 -0.12701556]
 [-0.27614036 -0.12557104 -0.12314937]], shape=(3, 3), dtype=float32)

As I said, the filters are matrices of trainable values, so if your task is to detect diagonal lines, you may end up with weights like this:

tf.eye(3)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]], dtype=float32)>

The input to the next layer will be a distorted version of the original image, because it has been convolved with random float values.

import tensorflow as tf
from skimage import data
import numpy as np
import matplotlib.pyplot as plt

conv = tf.keras.layers.Conv2D(filters=3, kernel_size=3)
image = data.chelsea()

conv.build(input_shape=(None, *image.shape))

plt.imshow(image)
plt.show()

image = np.expand_dims(image, axis=0).astype(np.float32)
for i in range(5):
    image = conv(image)

# Take absolute values first, then clip into the displayable 0-255 range.
plt.imshow(np.clip(np.abs(np.squeeze(image)), 0, 255).astype(np.uint8))
plt.show()


The input to the second layer will also be slightly smaller: without padding (and with the default stride of 1), each convolution shrinks the height and width by (filter_size - 1).
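The arithmetic can be sketched with a small helper. The function conv_output_size below is hypothetical (not part of TensorFlow); it just applies the usual "valid" and "same" size formulas, and the trace walks the spatial size through the model from the question.

```python
def conv_output_size(input_size, kernel_size, padding="valid", stride=1):
    """Spatial output size of a convolution or pooling window (hypothetical helper)."""
    if padding == "same":
        # "same" padding: output is ceil(input_size / stride).
        return -(-input_size // stride)
    # "valid" (no padding): floor((input_size - kernel_size) / stride) + 1.
    return (input_size - kernel_size) // stride + 1

# Trace the height/width through the model in the question (32x32 input).
size = 32
size = conv_output_size(size, 3)            # Conv2D 3x3      -> 30
size = conv_output_size(size, 2, stride=2)  # MaxPooling2D 2x2 -> 15
size = conv_output_size(size, 3)            # Conv2D 3x3      -> 13
size = conv_output_size(size, 2, stride=2)  # MaxPooling2D 2x2 -> 6
size = conv_output_size(size, 3)            # Conv2D 3x3      -> 4
final_size = size
print(final_size)
```

Passing padding="same" to Conv2D keeps the height and width unchanged at a stride of 1.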

Upvotes: 7
