EhsanYaghoubi

Reputation: 145

Several FC layers in a row

I have a question about the role of the fully connected (FC) layers at the end of a CNN.

1- Does the FC layer act as a learned classifier?

2- Why do we first use a linear activation function followed by a non-linear one (e.g. softmax)?

3- What is the reason for adding several FC layers in a row on top of the network, like this?

# KL refers to keras.layers; M_L is the incoming feature tensor from the convolutional base
M_L = KL.Dense(512, activation='relu')(M_L)
M_L = KL.Dropout(DROPOUT_PROB)(M_L)
M_L = KL.Dense(256, activation='relu')(M_L)
M_L = KL.Dropout(DROPOUT_PROB)(M_L)
M_L = KL.Dense(128, activation='relu')(M_L)
M_L = KL.Dropout(DROPOUT_PROB)(M_L)
M_L = KL.Dense(64, activation='relu')(M_L)
M_L = KL.Dropout(DROPOUT_PROB)(M_L)
M_L = KL.Dense(1, activation='sigmoid')(M_L)

4- What would be the difference if we only did this:

M_L = KL.Dense(512, activation='relu')(M_L)
M_L = KL.Dropout(DROPOUT_PROB)(M_L)
M_L = KL.Dense(1, activation='sigmoid')(M_L)

Or even:

M_L = KL.Dense(1, activation='sigmoid')(M_L)

My intuition is that by adding more FC layers we get more trainable parameters, so in a multi-task network it helps to give each task some parameters of its own. Am I right?
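For example (a rough sketch with made-up layer sizes and task names, assuming KL = keras.layers and a shared feature vector), what I have in mind is something like:

from tensorflow import keras
KL = keras.layers

# Hypothetical shared feature vector coming out of the convolutional base
inputs = KL.Input(shape=(2048,))
shared = KL.Dense(512, activation='relu')(inputs)          # parameters shared by all tasks

# Task-specific FC branches: each task gets its own trainable parameters
task_a = KL.Dense(64, activation='relu')(shared)
task_a = KL.Dense(1, activation='sigmoid', name='task_a')(task_a)

task_b = KL.Dense(64, activation='relu')(shared)
task_b = KL.Dense(1, activation='sigmoid', name='task_b')(task_b)

model = keras.Model(inputs, [task_a, task_b])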

5- Is there any other reason for adding several consecutive FC layers? Does reducing the number of features gradually help when training a classifier?

Upvotes: 0

Views: 308

Answers (1)

John Ladasky

Reputation: 1064

The Universal Approximation Theorem states that a neural network needs only a single hidden layer with non-linear activation functions to approximate any (continuous) function. That single layer might need an infinite number of units to model the function perfectly -- but we can get an approximation of arbitrary accuracy by choosing a sufficiently large number of units.
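As a toy illustration of that idea (a throwaway sketch, not part of your model), a single, sufficiently wide hidden layer can fit a 1-D function like a sine wave:

import numpy as np
from tensorflow import keras

# Toy data: approximate sin(x) on [-pi, pi] with one wide hidden layer
x = np.linspace(-np.pi, np.pi, 1000).reshape(-1, 1)
y = np.sin(x)

model = keras.Sequential([
    keras.Input(shape=(1,)),
    keras.layers.Dense(256, activation='relu'),   # the single hidden layer
    keras.layers.Dense(1)                         # linear output
])
model.compile(optimizer='adam', loss='mse')
model.fit(x, y, epochs=200, verbose=0)            # widen the layer / train longer for a closer fit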

So you are right: in principle, your second architecture will also be able to approximate your function. It won't do it as well as the first, but it will do something.

The third architecture is extremely weak, because you don't have a hidden layer at all. You only have a single unit with a sigmoid activation function. Presumably the function you want to model is constrained to the range 0 to 1. That's why there's a sigmoid output layer in all your architectures. Presumably, you have many inputs. All that will happen in your third architecture is that you will take a weighted, linear sum of your inputs, add one scalar (the bias), and then take the sigmoid of the result. That's not very expressive. You can't get arbitrarily close to an arbitrary function with this architecture.
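In other words, KL.Dense(1, activation='sigmoid') applied directly to the features is just logistic regression. A minimal sketch of what it computes (w and b stand for the weights and bias the layer would learn; the numbers are made up):

import numpy as np

def dense_1_sigmoid(x, w, b):
    """What Dense(1, activation='sigmoid') computes: a weighted linear sum
    of the inputs plus a bias, squashed through a sigmoid."""
    z = np.dot(x, w) + b                 # linear combination of the input features
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid -> a single value in (0, 1)

x = np.array([0.2, -1.3, 0.7])           # input features
w = np.array([0.5, 0.1, -0.4])           # learned weights
b = 0.05                                 # learned bias
print(dense_1_sigmoid(x, w, b))          # one probability-like score, no hidden representation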

Now, what's special about the first, "deep" architecture? The Universal Approximation Theorem says we only need one hidden layer, and the second architecture has that. So we could just make that single hidden layer wider, right? Well, the Universal Approximation Theorem doesn't say that a single hidden layer is the BEST way to model a function. Frequently, we find that multiple layers with progressively smaller numbers of units produce better results. To achieve results with the second architecture comparable to the first, you might need 10,000 units in your hidden layer.
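To put rough numbers on that (assuming, just for illustration, a 2048-dimensional feature vector feeding the head; your convolutional base determines the real size), you can compare the trainable parameter counts of the two heads:

from tensorflow import keras
KL = keras.layers

def deep_head(x):
    # Your first architecture (Dropout omitted: it adds no trainable parameters)
    for units in (512, 256, 128, 64):
        x = KL.Dense(units, activation='relu')(x)
    return KL.Dense(1, activation='sigmoid')(x)

def wide_head(x, width):
    # A single wide hidden layer, as in the second architecture
    x = KL.Dense(width, activation='relu')(x)
    return KL.Dense(1, activation='sigmoid')(x)

inp = KL.Input(shape=(2048,))                                   # assumed feature size
print(keras.Model(inp, deep_head(inp)).count_params())          # ~1.2 million parameters
print(keras.Model(inp, wide_head(inp, 10000)).count_params())   # ~20.5 million parameters

The deep, tapering head packs comparable capacity into far fewer parameters than a single very wide hidden layer.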

Before the introduction of ReLU, deep architectures trained very slowly or got stuck, largely because saturating activations like sigmoid and tanh shrink the gradients as they propagate back through many layers (the "vanishing gradient" problem). That's not much of an issue now.
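A quick back-of-the-envelope illustration of why that happened (again just a sketch, not part of your model):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid's slope is at most 0.25 (at z = 0); ReLU's slope is 1 for positive inputs.
sig_grad = sigmoid(0.0) * (1.0 - sigmoid(0.0))   # 0.25
print(sig_grad ** 10)                            # ~1e-6: the gradient all but vanishes after 10 sigmoid layers
print(1.0 ** 10)                                 # 1.0: ReLU preserves the gradient's magnitude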

Upvotes: 1
