xjl_peter

Reputation: 63

About input_shape in keras.layers from tensorflow

I am a beginner with TensorFlow. I have just tried to fit a simple LeNet-5 on the MNIST data.

My training and test data start out as NumPy arrays of shape (60000, 28, 28). Then I set up my model as below.

from tensorflow.keras import Sequential, layers

model_LeNet5 = Sequential([
    # input_shape describes a single sample, without the batch dimension
    layers.Conv2D(6, kernel_size=3, strides=1, input_shape=(28, 28, 1)),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.ReLU(),
    layers.Conv2D(16, kernel_size=3, strides=1),
    layers.MaxPooling2D(pool_size=2, strides=2),
    layers.ReLU(),
    layers.Flatten(),
    layers.Dense(120, activation='relu'),
    layers.Dense(84, activation='relu'),
    layers.Dense(10)
])

I can understand why it succeeds when I set input_shape to (28, 28) or train_images.shape[1:], but I cannot understand why input_shape = (28, 28, 1) also works (as shown in the code above).

It seems there is an inconsistency between the shape of the data and the input size setting (i.e., [60000, 28, 28] vs. [28, 28, 1]), and broadcasting rules do not seem to connect [60000, 28, 28] with [28, 28, 1] either. Thanks to anyone who can explain the mechanism of input_shape.
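For reference, the shapes involved can be inspected directly. Below is a minimal sketch, assuming the standard keras.datasets MNIST arrays; the expand_dims step is only an illustration of how the data could be given an explicit channel axis so that each sample matches input_shape=(28, 28, 1), it is not something the original code does.

import numpy as np
from tensorflow.keras.datasets import mnist

(train_images, _), _ = mnist.load_data()
print(train_images.shape)        # (60000, 28, 28) - a batch of 2D grayscale images

# Add an explicit channel axis so each sample becomes (28, 28, 1)
train_images = np.expand_dims(train_images, -1)
print(train_images.shape)        # (60000, 28, 28, 1)
print(train_images.shape[1:])    # (28, 28, 1) - the per-sample shape that input_shape refers to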

Upvotes: 2

Views: 264

Answers (1)

Amirhossein Rezaei

Reputation: 364

A single grayscale image can be represented using a two-dimensional (2D) NumPy array or tensor. Since there is only one channel in a grayscale image, we don't need an extra dimension to represent the color channel. The two dimensions represent the height and width of the image.

A batch of 3 grayscale images can be represented using a three-dimensional (3D) NumPy array or tensor. Here, we need an extra dimension to represent the number of images.
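As a rough illustration of that distinction, here is a small sketch with made-up zero arrays (not the answerer's code); the last line adds an explicit channel axis, which is what input_shape=(28, 28, 1) describes for one sample:

import numpy as np

single_image = np.zeros((28, 28))        # one grayscale image: (height, width)
batch = np.zeros((3, 28, 28))            # batch of 3 grayscale images: (batch, height, width)
batch_channels = batch[..., np.newaxis]  # explicit channel axis: (3, 28, 28, 1)

print(single_image.ndim, batch.ndim, batch_channels.ndim)  # 2 3 4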

For more information, check out this article on towardsdatascience.

Upvotes: 1
