Antony Joy

Reputation: 301

How to check the number of layers in a neural network in python and when should we increase the layers?

Please leave a brief comment with your thoughts so that I can improve my question. Thanks. :-)


I am working on the MNIST dataset and have written some CNN code. However, I am confused about a few points in it. How do I know the number of layers in a neural network? With my current understanding, I think this model has 6 layers, with 4 hidden layers. Is that right? And what if I need to extend it to 10 layers? How would I do that?

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D

model = Sequential()
model.add(Conv2D(28, kernel_size=(3,3), 
                    input_shape = ...))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))

Upvotes: 1

Views: 8556

Answers (2)

Pradyut

Reputation: 123

When counting the number of layers in a neural network, we usually count only the convolutional layers and the fully connected layers. A pooling layer is taken together with its convolutional layer and counted as one layer, and Dropout is a regularization technique, so it is not counted as a separate layer either.

For reference, the VGG16 model is defined as a 16-layer model. Those 16 layers are only the convolutional layers and the fully connected dense layers. If you counted all the pooling and activation layers, it would become a 41-layer model, which it is not. Reference: VGG16, VGG16 Paper

So, as per your code, you have 3 layers (1 convolutional layer with 28 filters, 1 fully connected layer with 128 neurons, and 1 fully connected layer with 10 neurons).

As for making it a 10-layer network, you can add more convolutional layers or dense layers before the output layer, but that won't be necessary for the MNIST dataset.
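By this convention, the count for the model in the question can be sketched in plain Python (a minimal illustration; the layer-type list and the `count_weight_layers` helper are my own naming, not Keras API):

```python
# The layer types in the question's model, in order.
layer_types = ["Conv2D", "MaxPooling2D", "Flatten", "Dense", "Dropout", "Dense"]

def count_weight_layers(layer_types):
    """Count only the layers that carry learnable weights (Conv2D and Dense)."""
    return sum(1 for t in layer_types if t in ("Conv2D", "Dense"))

print(count_weight_layers(layer_types))  # 3
```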

Upvotes: 5

Innat

Reputation: 17219

If you print model.summary() for your model, you would get:

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 28)        280       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 28)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 4732)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               605824    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 607,394
Trainable params: 607,394
Non-trainable params: 0

print(len(model.layers)) # 6

As you can see, you built a deep neural network of 6 layers - some of them are trainable layers and some of them are non-trainable. So, if anyone asks you about the number of your model's layers, it is simply 6.
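That trainable/non-trainable split shows up in the Param # column above, and those numbers can be reproduced by hand. Here is a sketch in plain Python (the two helper functions are my own, using the standard parameter-count formulas for Conv2D and Dense layers):

```python
def conv2d_params(filters, kh, kw, in_channels):
    # one kh x kw kernel per input channel, per filter, plus one bias per filter
    return filters * (kh * kw * in_channels + 1)

def dense_params(inputs, units):
    # one weight per input per unit, plus one bias per unit
    return (inputs + 1) * units

print(conv2d_params(28, 3, 3, 1))       # 280    (conv2d_1)
print(dense_params(13 * 13 * 28, 128))  # 605824 (dense: Flatten gives 4732 inputs)
print(dense_params(128, 10))            # 1290   (dense_1)
```

The non-trainable layers (MaxPooling2D, Flatten, Dropout) contribute nothing to these formulas, which is exactly why their Param # is 0.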


And how do you extend this, or add more layers? Well, that's very simple, like pouring water into a glass. Like this:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Conv2D, Dropout,
                                     Flatten, MaxPooling2D, BatchNormalization)

model = Sequential()
model.add(Conv2D(16, kernel_size=(3,3), 
                    input_shape = (28,28,1)))
model.add(Conv2D(32, kernel_size=(3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, kernel_size=(3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(BatchNormalization())
model.add(Dropout(0.2))
model.add(Dense(10, activation=tf.nn.softmax))

model.summary()
print(len(model.layers)) # 10

Now, take note of this: although with tf.keras (or another framework like PyTorch) we can do such things very easily, we should consider what we are doing and what for. I won't describe this much here because it's out of the scope of this question, but I would highly recommend that you check the official tf.keras code examples.


The term Hidden Layer is just a naming convention that was used frequently in the early days (AFAIK), mainly for fully connected layers (before CNNs). That's why, for simplicity, I would say just forget about this term. It makes more sense to refer to a layer as trainable or non-trainable.

In your model: 1st Conv2D (trainable layer), 2nd MaxPooling2D (non-trainable), 3rd Flatten (non-trainable), 4th Dense (trainable), 5th Dropout (non-trainable), and lastly the 6th Dense (trainable). You can also see this in the Param # column of model.summary(): for the non-trainable layers, the parameter count is zero - there is no trainable variable in those layers. For example, in your model, the first layer is stated as:

model.add(Conv2D(28, kernel_size=(3,3), 
                    input_shape = ...))

Whatever the expected input_shape is, this layer passes 28 filters of size 3 x 3 over the input, performs convolution, and produces feature maps. So, at the end of this layer, we get a total of 28 feature maps. The next layer,

model.add(MaxPooling2D(pool_size=(2,2)))

simply pools the maximum value from each window of those 28 feature maps, nothing else. There is no learnable computation involved - that's why it has no trainable parameters.
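To see why, a 2 x 2 max pooling step can be sketched in plain Python (my own toy implementation, not the Keras one): it just takes the maximum of each non-overlapping 2 x 2 window, with no weights anywhere.

```python
def max_pool_2x2(fmap):
    """Max-pool a 2D feature map (list of lists) with a 2x2 window, stride 2."""
    rows, cols = len(fmap), len(fmap[0])
    return [[max(fmap[r][c], fmap[r][c + 1],
                 fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, cols - 1, 2)]
            for r in range(0, rows - 1, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 5],
        [7, 0, 3, 1],
        [2, 6, 0, 2]]
print(max_pool_2x2(fmap))  # [[4, 5], [7, 3]]
```

This is also why the spatial size in the summary halves from 26 x 26 to 13 x 13: each 2 x 2 window collapses to a single value.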


Hidden Layer simply refers to a layer that is placed between the input layer and the output layer of a deep neural network. In your model, the first layer, which is Conv2D, is a hidden layer; it's not the input layer. Here, the input layer is implicitly present when we pass the input_shape argument to the first Conv2D layer. So, if we take the Hidden Layer naming convention literally, we can say that in your model there are 5 hidden layers (from the first Conv2D to Dropout). The input layer is implicit, and the output layer is the last Dense layer.
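That count can be sketched in plain Python (the layer-type list is just an illustration of the question's Sequential model, not Keras API):

```python
# The question's model, as a list of layer types. The input layer is implicit,
# and the final Dense layer is the output layer; everything else is "hidden".
layers = ["Conv2D", "MaxPooling2D", "Flatten", "Dense", "Dropout", "Dense"]
hidden = layers[:-1]  # drop the output layer

print(len(hidden))  # 5
```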

Upvotes: 4
