Reputation: 453
I have images with shape (100, 100, 3), and I want to use Keras 1D convolution to classify them. Is this possible, and if so, what input shape do I need to use?
PS: I use tf.data.Dataset, and my dataset is batched: (20, 100, 100, 3).
Upvotes: 2
Views: 3995
Reputation: 7591
TLDR; Not by itself, but maybe if composed.
The correlation between pixels in an image (2D, or 3D with multiple channels) is spatial in nature: the value of a given pixel is strongly influenced by its neighbors, both vertically and horizontally. The advantage of 2D/3D convolution (Conv2D or Conv3D) is that it captures this influence in both spatial directions: vertical and horizontal.
In comparison, 1D convolution (Conv1D) captures only one of the two correlations (either vertical or horizontal), yielding much more limited information. By itself, a single Conv1D leaves out substantial information.
Nonetheless, since a Conv2D can be 'decomposed' into two Conv1D blocks (similar in spirit to the Pointwise & Depthwise convolutions in the MobileNet architecture), concatenating a vertical Conv1D and a horizontal Conv1D captures the spatial correlation on both axes. This is a valid approach to image classification as an alternative to Conv2D.
Yes, we can.
You should not reshape the data to reduce dimensions: if you did, you would be taping one end of the image (say the top, if the Conv1D is applied vertically) to the other end (say the bottom), which breaks spatial coherence.
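To make the warning concrete, here is a minimal sketch (using the batch shape from the question) of what such a reshape does: the pixel that follows the last pixel of one row comes from the opposite edge of the image, yet a Conv1D sliding over the flattened axis would treat the two as neighbors.

```python
import tensorflow as tf

x = tf.random.normal(shape=(20, 100, 100, 3))  # batch from the question

# Flattening the two spatial axes stitches each row's end
# to the next row's start:
flat = tf.reshape(x, (20, 100 * 100, 3))  # (20, 10000, 3)

# flat[0, 99]  is the right-most pixel of row 0,
# flat[0, 100] is the left-most pixel of row 1 -- pixels from opposite
# image edges, now adjacent along the convolved axis.
```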
Here is a possible example of how to implement the concatenation explained above:
import tensorflow as tf

x = tf.random.normal(shape=(20, 100, 100, 3))  # your input batch

# Horizontal Conv1D: convolves along the width axis
y_h = tf.keras.layers.Conv1D(
    filters=32, kernel_size=3, padding='same', activation='relu')(x)

# Vertical Conv1D: transpose so the height axis is the one convolved
y_v = tf.transpose(x, perm=[0, 2, 1, 3])  # Image rows to columns
y_v = tf.keras.layers.Conv1D(
    filters=32, kernel_size=3, padding='same', activation='relu')(y_v)
y_v = tf.transpose(y_v, perm=[0, 2, 1, 3])  # Undo transpose so both branches align

# Concatenate results on the feature-map axis
y = tf.keras.layers.Concatenate(axis=3)([y_h, y_v])  # shape (20, 100, 100, 64)
Note that you need multiple operations to obtain a result (convolution over the vertical and horizontal axes) that would be easier and faster to get by applying Conv2D directly.
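For comparison, a minimal sketch of the direct Conv2D route (the filter count of 64 is chosen here just to match the 32 + 32 feature maps of the concatenated branches):

```python
import tensorflow as tf

x = tf.random.normal(shape=(20, 100, 100, 3))

# One Conv2D captures both spatial directions in a single operation
y = tf.keras.layers.Conv2D(
    filters=64, kernel_size=3, padding='same', activation='relu')(x)

print(y.shape)  # (20, 100, 100, 64)
```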
If your image data is particularly uninformative along one spatial axis while being particularly interesting along the other, this might be an idea worth exploring. Otherwise it is better to resort to the standard Conv2D (which covers most cases out there, including almost all public image datasets).
Upvotes: 2
Reputation: 4903
I assume you mean 1x1 convolutions, which convolve images across channels. In your case the layer code would be:
tf.keras.layers.Conv2D(filters=NUM_FILTERS, kernel_size=1, strides=1)
Conv1D is indeed for 1D data processing (like sound), as @MatusDubrava pointed out.
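To illustrate that point, a small sketch of Conv1D on the kind of sequential data it is designed for (the audio shape here is just an assumed example):

```python
import tensorflow as tf

# e.g. a batch of 8 mono audio clips, 16000 samples each
audio = tf.random.normal(shape=(8, 16000, 1))  # (batch, timesteps, channels)

# Conv1D slides along the single time axis
features = tf.keras.layers.Conv1D(
    filters=16, kernel_size=9, strides=4, activation='relu')(audio)

print(features.shape)  # (8, 3998, 16)
```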
Upvotes: 1