Reputation: 584
I am new to TensorFlow. I have NumPy input data of this shape:
x_train_n.shape = (200, 64, 2048)
meaning 200 is the number of training samples, 64 is H, and 2048 is W.
When I want to feed this input to my network, I first have to reshape it:
x_train_n = x_train_n.reshape(x_train_n.shape[0], 1, rows, cols)
Then
inputs = Input(shape=x_train_n.shape[1:])
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same', data_format='channels_first', input_shape=x_train_n.shape[1:])(inputs)
Otherwise I get an error saying that Conv2D expected the input to have 4 dimensions, but it has 3.
Is this the right thing to do? If so, why does it make sense?
Why can I not do the following without the reshape?
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same', data_format='channels_first', input_shape=x_train_n.shape())(inputs)
Upvotes: 2
Views: 1196
Reputation: 2533
Conv2D expects 4 dimensions, that's right; with data_format='channels_first' these are: (BatchSize, Channels, Height, Width).
For colored images you usually have 3 channels (RGB) for the color intensities; for grayscale images, only one.
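A minimal sketch of adding that singleton channel axis, assuming dummy data with the shape from the question; either a reshape or np.expand_dims works:

```python
import numpy as np

# Dummy stand-in for the real training data (assumed shape from the question).
x_train_n = np.zeros((200, 64, 2048), dtype=np.float32)

# channels_first Conv2D expects (batch, channels, height, width),
# so insert a singleton channel axis right after the batch axis.
x_4d = x_train_n.reshape(x_train_n.shape[0], 1, 64, 2048)

# Equivalent, without hard-coding H and W:
x_4d_alt = np.expand_dims(x_train_n, axis=1)

print(x_4d.shape)  # (200, 1, 64, 2048)
```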
Upvotes: 2
Reputation: 8537
Yes, this is the right thing to do.
Conv2D layers are applied to 2D data, and 2D data may have multiple channels. In your case the number of channels equals 1. Because Conv2D is designed to operate on multiple channels, you have to add this extra dimension declaring how many channels your data has (1 channel in your case).
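A short sketch of the whole pipeline, reusing the layer parameters from the question (this is an illustration on dummy data, not the asker's exact model):

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

# Reshape (200, 64, 2048) -> (200, 1, 64, 2048): one explicit channel.
x_train_n = np.zeros((200, 64, 2048), dtype=np.float32)
x_train_n = x_train_n.reshape(x_train_n.shape[0], 1, 64, 2048)

inputs = Input(shape=x_train_n.shape[1:])  # (1, 64, 2048)
output1 = Conv2D(32, (3, 15), strides=(1, 2), padding='same',
                 data_format='channels_first')(inputs)
model = Model(inputs, output1)

# 'same' padding keeps H, stride 2 halves W; the 32 filters
# become the new channel dimension.
print(model.output_shape)  # (None, 32, 64, 1024)
```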
Upvotes: 2
Reputation: 11132
That makes sense. Convolutional layers require a number of input channels. For an RGB image, that number is 3. For some data (such as grayscale images, or apparently whatever data you have), the number of channels is 1. However, that channel still needs to be there explicitly; it can't simply be implied.
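As a side note, the spatial output size with padding='same' depends only on the strides (output = ceil(input / stride)); a quick sanity check in plain Python, using the sizes from the question:

```python
import math

def same_padding_output_size(size, stride):
    # Keras 'same' padding: output = ceil(input / stride),
    # independent of the kernel size.
    return math.ceil(size / stride)

# Conv2D(..., strides=(1, 2)) on a (64, 2048) input:
h_out = same_padding_output_size(64, 1)    # height unchanged
w_out = same_padding_output_size(2048, 2)  # width halved
print(h_out, w_out)  # 64 1024
```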
Upvotes: 2