Reputation: 39
I know that there are different kinds of convolutional filters depending on the job you want to do, such as sharpening or blurring. Is there a specific kind we need to use for image classification?
An example CNN used for image classification is provided on the TensorFlow website:
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
I realize the convolutional layer is using a 3x3 filter, but how do I know what type of matrix it is: a sharpening one, a blurring one, etc.? Can any 3x3 matrix be used for image classification?
Upvotes: 1
Views: 728
Reputation: 71
You do not choose the type of filter (a Sobel filter, a Gaussian blur, etc.) when you define a TensorFlow/Keras model. The filter values are weights: they start out random, are learned as training progresses, and end up specific to the dataset you train on.
You can, however, specify how the initial random weights are sampled with the kernel_initializer parameter of the Conv2D layer. Depending on the data and other factors, this can help the model converge faster or reach a better local minimum.
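As a minimal sketch of that parameter: the layer sizes below are copied from the question's first layer, and "he_normal" is picked purely for illustration. Any built-in initializer name or tf.keras.initializers object could be passed instead.

```python
import tensorflow as tf

# Assumption: "he_normal" is only an illustrative choice of initializer.
conv = tf.keras.layers.Conv2D(
    filters=32,
    kernel_size=(3, 3),
    activation="relu",
    kernel_initializer="he_normal",
)

# Building the layer creates the (still untrained) weights.
conv.build(input_shape=(None, 32, 32, 3))

# Kernel layout is (height, width, input_channels, filters).
kernel_shape = tuple(conv.kernel.shape)
print(kernel_shape)
```

The initializer only decides the starting values; training still changes them afterwards.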
Upvotes: 1
Reputation: 36604
It's not a "filter" in the Instagram sense. It's a matrix that slides over the input image, multiplies its values with the corresponding pixel values, and then sums up the products. The weights are trainable, so during training they turn into feature detectors, i.e., they converge to whatever values extract the most useful information from the input.
When you initialize a convolutional layer, its weights are just random values:
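To make the slide, multiply, and sum step concrete, here is a rough pure-Python sketch. The helper conv2d_valid and the toy 4x4 image are made up for illustration; a real Conv2D layer additionally sums over input channels and adds a bias. The sharpen and blur kernels from the question are simply fixed choices of the same nine weights.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1),
    multiplying overlapping values and summing the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical toy image, chosen so the arithmetic is easy to follow.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

# Hand-designed kernels: fixed weights instead of learned ones.
box_sum = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # unnormalized box blur
sharpen = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]  # common sharpening kernel

blurred = conv2d_valid(image, box_sum)
sharpened = conv2d_valid(image, sharpen)
print(blurred)    # [[54, 63], [90, 99]]
print(sharpened)  # [[6, 7], [10, 11]]
```

A trained network is free to end up with weights that resemble neither of these; it keeps whatever values reduce the loss.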
import tensorflow as tf
conv = tf.keras.layers.Conv2D(filters=1, kernel_size=3)
conv.build(input_shape=(None, 28, 28, 1))
weights, biases = conv.weights
print(tf.squeeze(weights))
tf.Tensor(
[[ 0.4269786 -0.07291955 0.31100047]
[ 0.05227929 -0.19417122 -0.12701556]
[-0.27614036 -0.12557104 -0.12314937]], shape=(3, 3), dtype=float32)
As I said, the filters are matrices of trainable values, so if your task is to detect diagonal lines, for example, you may end up with weights like this:
tf.eye(3)
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]], dtype=float32)>
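A small sketch of why such weights act as a line detector, assuming binary 3x3 patches made up for this example: the multiply-and-sum response is largest when the patch matches the pattern in the kernel. The helper name response is hypothetical.

```python
def response(patch, kernel):
    """Multiply a patch with a kernel element-wise and sum the products."""
    return sum(
        p * k
        for patch_row, kernel_row in zip(patch, kernel)
        for p, k in zip(patch_row, kernel_row)
    )


# Same values as tf.eye(3): ones on the main diagonal.
diagonal_kernel = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Hypothetical image patches containing differently oriented lines.
diagonal_line = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
horizontal_line = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
anti_diagonal_line = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]

print(response(diagonal_line, diagonal_kernel))       # 3
print(response(horizontal_line, diagonal_kernel))     # 1
print(response(anti_diagonal_line, diagonal_kernel))  # 1
```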
The input to the next layer will be a distorted version of the original image, because it was multiplied with random float values:
import tensorflow as tf
from skimage import data
import numpy as np
import matplotlib.pyplot as plt
conv = tf.keras.layers.Conv2D(filters=3, kernel_size=3)
image = data.chelsea()
conv.build(input_shape=(None, *image.shape))
plt.imshow(image)
plt.show()
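# Add a batch dimension and convert to float32, as Conv2D expects.
image = np.expand_dims(image, axis=0).astype(np.float32)
# Apply the same untrained layer repeatedly and show each result.
for i in range(5):
    image = conv(image)
    # Take absolute values, then clip to the displayable 0-255 range.
    displayable = np.clip(np.abs(image.numpy()), 0, 255)
    plt.imshow(np.squeeze(displayable).astype(int))
    plt.show()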
The input to the second layer will also be slightly smaller, because each spatial dimension shrinks by (filter_size - 1) unless you use padding (e.g., padding='same').
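As a quick check of that arithmetic, here is a sketch that applies the usual output-size rule to the model in the question. The helper conv_output_size is made up for this example, and the list of layers is read off the question's code (3x3 convolutions with stride 1, 2x2 max pooling with stride 2, no padding).

```python
import math


def conv_output_size(size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension of a conv or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    # "valid": no padding, so the window must fit entirely inside the input.
    return (size - kernel_size) // stride + 1


size = 32  # the question's input is 32x32
sizes = []
# (kernel_size, stride) for Conv2D, MaxPooling2D, Conv2D, MaxPooling2D, Conv2D
for kernel_size, stride in [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]:
    size = conv_output_size(size, kernel_size, stride)
    sizes.append(size)

print(sizes)  # [30, 15, 13, 6, 4]
```

With padding="same" and stride 1 the size stays unchanged, which is why padding avoids the shrinkage.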
Upvotes: 7