Reputation: 301
Please add at least a brief comment with your thoughts, so that I can improve my question. Thanks. :-)
I am working on the MNIST dataset and have written some CNN code. However, I am confused about a few points. How do you determine the number of layers in a neural network? With my current understanding, I think this model has 6 layers, 4 of which are hidden. Is that right? And what if I need to extend it to 10 layers? How would I do that?
import tensorflow as tf  # needed below for tf.nn.relu and tf.nn.softmax
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D
model = Sequential()
model.add(Conv2D(28, kernel_size=(3,3),
input_shape = ...))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))
Upvotes: 1
Views: 8556
Reputation: 123
When counting the number of layers in a neural network, we usually count only the convolutional layers and the fully connected layers. A pooling layer is taken together with its convolutional layer and counted as one layer, and dropout is a regularization technique, so it is not counted as a separate layer either.
For reference, the VGG16 model is defined as a 16-layer model. Those 16 layers are only the convolutional layers and the fully connected dense layers. If you counted all the pooling and activation layers as well, it would become a 41-layer model, which it is not. Reference: VGG16, VGG16 Paper
So, as per your code, you have 3 layers (1 convolutional layer with 28 filters, 1 fully connected layer with 128 neurons, and 1 fully connected layer with 10 neurons).
As for making it a 10-layer network, you can add more convolutional or dense layers before the output layer, but that won't be necessary for the MNIST dataset.
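By that convention, the count can be checked in code. A small sketch, assuming (28, 28, 1) for the input_shape the question leaves out:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D

# Rebuild the question's model with an assumed input shape of (28, 28, 1).
model = Sequential([
    Conv2D(28, kernel_size=(3, 3), input_shape=(28, 28, 1)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation="relu"),
    Dropout(0.2),
    Dense(10, activation="softmax"),
])

# Count only the Conv2D and Dense layers, per the convention above.
counted = [layer for layer in model.layers if isinstance(layer, (Conv2D, Dense))]
print(len(counted))  # 3
```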
Upvotes: 5
Reputation: 17219
If you print .summary() of your model, you will get:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 26, 26, 28) 280
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 28) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 4732) 0
_________________________________________________________________
dense (Dense) (None, 128) 605824
_________________________________________________________________
dropout (Dropout) (None, 128) 0
_________________________________________________________________
dense_1 (Dense) (None, 10) 1290
=================================================================
Total params: 607,394
Trainable params: 607,394
Non-trainable params: 0
print(len(model.layers)) # 6
As you can see, you built a deep neural network with 6 layers - some of them trainable and some of them non-trainable. So, if anyone asks about the number of layers in your model, it is simply 6.
And how do you extend this or add more layers? That's very simple - like this:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Conv2D, Dropout,
Flatten, MaxPooling2D, BatchNormalization)
model = Sequential()
model.add(Conv2D(16, kernel_size=(3,3),
input_shape = (28,28,1)))
model.add(Conv2D(32, kernel_size=(3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))
model.summary()
print(len(model.layers)) # 10
Now, take note of this: although with tf.keras (or another framework like PyTorch) we can do such things very easily, we should consider what we are doing and why. I won't describe this much here because it's out of the scope of the question, but I would highly recommend checking the official tf.keras code examples.
The term Hidden Layer is just a naming convention that was used frequently in the early days (AFAIK), mainly for fully connected layers (before CNNs). That's why, for simplicity, I would say just forget about this term. It makes more sense to refer to a layer as trainable or non-trainable.
In your model: 1st Conv2D (trainable), 2nd MaxPooling2D (non-trainable), 3rd Flatten (non-trainable), 4th Dense (trainable), 5th Dropout (non-trainable), and lastly the 6th Dense (trainable). You can also see the Param # column in model.summary(): for the non-trainable layers the parameter count is zero - there is no trainable variable in those layers. Let's say the first layer of your model is stated as:
model.add(Conv2D(28, kernel_size=(3,3),
input_shape = ...))
Whatever the expected input_shape is, this layer passes 28 filters of size 3 x 3 over the input, performs convolution, and produces feature maps. So, at the end of this layer, we get a total of 28 feature maps. The next layer,
model.add(MaxPooling2D(pool_size=(2,2)))
simply pools the maximum value over 2 x 2 windows of those 28 feature maps, nothing else. No learnable computation is involved - that's why it has zero trainable parameters.
A hidden layer simply refers to any layer placed between the input layer and the output layer of a deep neural network. In your model, the first layer, which is Conv2D, is a hidden layer; it is not the input layer. Here, the input layer is implicitly present when we pass the input_shape argument to the first Conv2D layer. So, if we take the hidden-layer naming convention literally, we can say your model has 5 hidden layers (from the first Conv2D to the Dropout); the input layer is implicit, and the output layer is the last Dense layer.
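To see that the input layer really is implicit, here is a small sketch using an explicit Input (assuming the MNIST shape (28, 28, 1)); it still does not show up in model.layers:

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential([
    Input(shape=(28, 28, 1)),        # explicit input; not a counted layer
    Conv2D(28, kernel_size=(3, 3)),  # the first "real" (hidden) layer
])
print(len(model.layers))  # 1
```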
Upvotes: 4