Reputation: 45
Is it conceptually wrong to put a layer of multiple neurons after a single-neuron layer? If so, how do I use this single-neuron layer within a multilayer network?
model = Sequential()
model.add(Input(shape=(10,)))
model.add(Dense(1, activation='relu'))
model.add(Dense(5, activation='relu'))
Do I have to use a special layer? How?
In my application, the single neuron layer is a sum layer, which is as follows:
class SumLayer(Layer):
    def __init__(self, **kwargs):
        super(SumLayer, self).__init__(**kwargs)

    def call(self, x_inputs):
        # Sum over the last axis, then restore a trailing dimension of 1
        xc = K.sum(x_inputs, axis=-1, keepdims=False)
        return tf.reshape(xc, (tf.shape(x_inputs)[0], 1))
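For reference, the layer's `call` amounts to a sum over the last axis with the trailing dimension kept. A minimal NumPy sketch of the same computation (the function name here is illustrative, not a Keras API):

```python
import numpy as np

def sum_layer(x_inputs):
    """NumPy equivalent of the custom layer's call():
    sum over the last axis, keeping a trailing dimension of 1."""
    return np.sum(x_inputs, axis=-1, keepdims=True)

batch = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
out = sum_layer(batch)  # shape (2, 1): one scalar per sample
```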
Upvotes: 0
Views: 130
Reputation: 66815
It is definitely non-standard to have any point in a deep neural network where the representation is collapsed to a single number. This creates an extreme information bottleneck: while in theory every complex decision can still be encoded, it becomes extremely hard for the next layer to learn to reason about this one-dimensional space if it needs to react in any way that is not a very simple "thresholding" of your signal on the line. So, is it conceptually "wrong"? No, it does not really affect representational power. Is it risky? Yes, it is something that experienced practitioners would never do unless there is a well-understood reason to go this way.
To gain some intuition: one of the core reasons for the effectiveness of neural networks is that they operate in very high-dimensional spaces, where their simple, affine transformations (neurons) can achieve a surprising degree of data separation and, more importantly, can be trained with extremely naive optimisation methods (gradient descent). These properties are completely gone once dimensionality is reduced. In particular, with an extreme representational bottleneck you are much more likely to be affected by the local minima that the early neural network community struggled with (an issue that somewhat "magically" went away as scale and dimensionality went up; there is a whole field of research, e.g. Neural Tangent Kernels, that provides the mathematical foundations for understanding why this is the case in high dimensions).
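To make the bottleneck concrete: any two inputs with the same sum become indistinguishable after such a layer, so no downstream layer can ever learn to treat them differently. A minimal NumPy sketch (the inputs are made up purely for illustration):

```python
import numpy as np

# Two clearly different 10-dimensional inputs...
x1 = np.zeros(10); x1[0] = 1.0   # [1, 0, 0, ..., 0]
x2 = np.zeros(10); x2[9] = 1.0   # [0, 0, 0, ..., 1]

# ...collapse to the same scalar under a sum layer.
s1 = np.sum(x1, keepdims=True)
s2 = np.sum(x2, keepdims=True)

# Every layer after the bottleneck sees identical activations,
# so it cannot react differently to x1 and x2.
indistinguishable = np.array_equal(s1, s2)
```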
Upvotes: 0