Reputation: 584
I am new to TensorFlow. I have NumPy input data of this shape:
x_train_n.shape = (200, 64, 2048)
meaning 200 is the number of training samples, 64 is H, and 2048 is W.
When I want to feed this input to my network, I first have to reshape it:
x_train_n = x_train_n.reshape(x_train_n.shape[0], 1, rows, cols)
Then
inputs = Input(shape=x_train_n.shape[1:])
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same', data_format='channels_first', input_shape=x_train_n.shape[1:])(inputs)
Otherwise I get an error saying that Conv2D expected the input to have 4 dimensions, but it has 3.
Is this the right thing to do? If so, why does it make sense?
Why can I not do the following without the reshape?
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same', data_format='channels_first', input_shape=x_train_n.shape())(inputs)
Upvotes: 2
Views: 1196
Reputation: 2533
Conv2D expects 4 dimensions, that's right; with data_format='channels_first' these are: (BatchSize, Channels, Height, Width).
For colored images you usually have 3 channels (RGB) for the color intensities; for grayscale images, only one.
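A minimal sketch of adding that singleton channel axis, assuming dummy data with the shape from the question; either a reshape or np.expand_dims works:

```python
import numpy as np

# Dummy stand-in for the real training data (assumed shape from the question).
x_train_n = np.zeros((200, 64, 2048), dtype=np.float32)

# channels_first Conv2D expects (batch, channels, height, width),
# so insert a singleton channel axis right after the batch axis.
x_4d = x_train_n.reshape(x_train_n.shape[0], 1, 64, 2048)

# Equivalent, without hard-coding H and W:
x_4d_alt = np.expand_dims(x_train_n, axis=1)

print(x_4d.shape)  # (200, 1, 64, 2048)
```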
Upvotes: 2
Reputation: 8537
Yes, this is the right thing to do.
Conv2D layers are applied to 2D data, and 2D data may have multiple channels. In your case the number of channels equals 1. Because Conv2D is designed to operate on multiple channels, you have to add this extra dimension declaring how many channels your data has (1 channel in your case).
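A short sketch of the whole pipeline, reusing the layer parameters from the question (this is an illustration on dummy data, not the asker's exact model):

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

# Reshape (200, 64, 2048) -> (200, 1, 64, 2048): one explicit channel.
x_train_n = np.zeros((200, 64, 2048), dtype=np.float32)
x_train_n = x_train_n.reshape(x_train_n.shape[0], 1, 64, 2048)

inputs = Input(shape=x_train_n.shape[1:])  # (1, 64, 2048)
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same',
                 data_format='channels_first')(inputs)
model = Model(inputs, output1)

# 'same' padding keeps H, stride 2 halves W; the 32 filters
# become the new channel dimension.
print(model.output_shape)  # (None, 32, 64, 1024)
```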
Upvotes: 2
Reputation: 11132
That makes sense. Convolutional layers require a number of input channels. For an RGB image, that number is 3. For some data (such as grayscale images, or apparently whatever data you have), the number of channels is 1. However, that channel still needs to be there explicitly; it can't simply be implied.
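As a side note, the spatial output size with padding='same' depends only on the strides (output = ceil(input / stride)); a quick sanity check in plain Python, using the sizes from the question:

```python
import math

def same_padding_output_size(size, stride):
    # Keras 'same' padding: output = ceil(input / stride),
    # independent of the kernel size.
    return math.ceil(size / stride)

# Conv2D(..., strides=(1, 2)) on a (64, 2048) input:
h_out = same_padding_output_size(64, 1)    # height unchanged
w_out = same_padding_output_size(2048, 2)  # width halved
print(h_out, w_out)  # 64 1024
```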
Upvotes: 2