data_person

Reputation: 4500

Dimension of output in Dense layer Keras

I have the following sample model:

import numpy as np
from tensorflow.keras import models
from tensorflow.keras import layers

sample_model = models.Sequential()
sample_model.add(layers.Dense(32, input_shape=(4,)))
sample_model.add(layers.Dense(16, input_shape = (44,)))

sample_model.compile(loss="binary_crossentropy",
                     optimizer="adam", metrics = ["accuracy"])

Input for the model:

sam_x = np.random.rand(10,4)
sam_y = np.array([0,1,1,0,1,0,0,1,0,1,])
sample_model.fit(sam_x,sam_y)

What confuses me is that fit should have thrown a shape-mismatch error: the expected input shape for the 2nd Dense layer is declared as (None, 44), but the output of the 1st Dense layer (which is the input of the 2nd Dense layer) has shape (None, 32). Yet it ran successfully.

I don't understand why there was no error. Any clarification would be helpful.
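For reference, printing each layer's actual input/output shape after fitting (a quick sanity check in TF 2.x) suggests the 2nd layer's input was derived from the 1st layer's output and the declared (44,) was silently ignored:

for layer in sample_model.layers:
    print(layer.name, layer.input_shape, "->", layer.output_shape)
# dense   (None, 4)  -> (None, 32)
# dense_1 (None, 32) -> (None, 16)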

Upvotes: 1

Views: 2333

Answers (3)

Innat

Reputation: 17239

The thing is, once Keras has taken the input shape of the model from the first layer, it won't check or deal with any other input shape declared inside that same model. For example, if you write your model the following way

sample_model.add(layers.Dense(32, input_shape=(4,)))   # honored: model input is (None, 4)
sample_model.add(layers.Dense(16, input_shape=(44,)))  # ignored
sample_model.add(layers.Dense(8, input_shape=(32,)))   # ignored

The program always uses the first declared input shape and discards the rest. So, if you start your first layer with input_shape=(44,), you need to pass exactly that many features to your model as input, such as:

sam_x = np.random.rand(10,44)
sam_y = np.array([0,1,1,0,1,0,0,1,0,1,])
sample_model.fit(sam_x,sam_y)

Additionally, with the Functional API, unlike the Sequential model, you must create and define a standalone Input layer that specifies the shape of the input data. It's not learnable but simply a spec layer; a kind of gateway for the input data into the model. That means even if we define input_shape inside the other layers, they will all be discarded. For example:

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(4,))

dense = layers.Dense(64, input_shape=(8,))  # input_shape discarded
x = dense(inputs)

x = layers.Dense(64, input_shape=(16,))(x)  # input_shape discarded
outputs = layers.Dense(10)(x)

model = keras.Model(inputs=inputs, outputs=outputs, name="mnist_model")

Here is a more complex example with Conv2D and MNIST.

encoder_input = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(16, 3, activation="relu", input_shape=[32,32,3])(encoder_input)  # input_shape discarded
x = layers.Conv2D(32, 3, activation="relu", input_shape=[64,64,3])(x)              # input_shape discarded
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu", input_shape=[224,321,3])(x)            # input_shape discarded
x = layers.Conv2D(16, 3, activation="relu", input_shape=[420,32,3])(x)             # input_shape discarded
x = layers.GlobalMaxPooling2D()(x)
out = layers.Dense(10, activation='softmax')(x)

encoder = keras.Model(encoder_input, out, name="encoder")
encoder.summary()

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_15 (InputLayer)        [(None, 28, 28, 1)]       0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 26, 26, 16)        160       
_________________________________________________________________
conv2d_9 (Conv2D)            (None, 24, 24, 32)        4640      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 32)          0         
_________________________________________________________________
conv2d_10 (Conv2D)           (None, 6, 6, 32)          9248      
_________________________________________________________________
conv2d_11 (Conv2D)           (None, 4, 4, 16)          4624      
_________________________________________________________________
global_max_pooling2d_2 (Glob (None, 16)                0         
_________________________________________________________________
dense_56 (Dense)             (None, 10)                170       
=================================================================
Total params: 18,842
Trainable params: 18,842
Non-trainable params: 0

def pre_process(image, label):
    return ((image / 256)[..., None].astype('float32'),
            tf.keras.utils.to_categorical(label, num_classes=10))

(x, y), (_, _) = tf.keras.datasets.mnist.load_data('mnist')
x, y = pre_process(x, y)  # scale, add a channel axis, one-hot the labels

encoder.compile(
          loss      = tf.keras.losses.CategoricalCrossentropy(),
          metrics   = [tf.keras.metrics.CategoricalAccuracy()],
          optimizer = tf.keras.optimizers.Adam())

encoder.fit(x, y, batch_size=256)
4s 14ms/step - loss: 1.4303 - categorical_accuracy: 0.5279

Upvotes: 2

Lescurel

Reputation: 11651

The input_shape keyword argument has an effect only on the first layer of a Sequential model. The input shape of every other layer is derived from the output of its previous layer.
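You can see this in the model from the question (a minimal sketch, assuming TF 2.x; summary output abbreviated): the second layer's parameter count is 32*16+16 = 528, which proves it was built against the 32 outputs of the first layer, not against the declared 44 features:

sample_model.summary()
# dense (Dense)      (None, 32)    160
# dense_1 (Dense)    (None, 16)    528   <- 32*16+16, not 44*16+16 = 720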

That behaviour is hinted at in the doc of tf.keras.layers.InputLayer:

When using InputLayer with Keras Sequential model, it can be skipped by moving the input_shape parameter to the first layer after the InputLayer.

And in the Sequential Model guide.
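Concretely, the quoted passage means these two Sequential models are equivalent (a sketch, assuming TF 2.x):

from tensorflow.keras import layers, models

# Explicit InputLayer...
m1 = models.Sequential([
    layers.InputLayer(input_shape=(4,)),
    layers.Dense(32),
])

# ...or skip it and move input_shape to the first layer.
m2 = models.Sequential([
    layers.Dense(32, input_shape=(4,)),
])

assert m1.layers[-1].input_shape == m2.layers[-1].input_shape  # (None, 4)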

The behaviour can be confirmed by looking at the source of the Sequential.add method:

if not self._layers:
  if isinstance(layer, input_layer.InputLayer):
    # Case where the user passes an Input or InputLayer layer via `add`.
    set_inputs = True
  else:
    batch_shape, dtype = training_utils.get_input_shape_and_dtype(layer)
    if batch_shape:
      # Instantiate an input layer.
      x = input_layer.Input(
          batch_shape=batch_shape, dtype=dtype, name=layer.name + '_input')
      # This will build the current layer
      # and create the node connecting the current layer
      # to the input layer we just created.
      layer(x)
      set_inputs = True 

If there are no layers in the model yet, an Input layer is added to the model, with its shape derived from the first layer being added. This is done only while the model is still empty.

That shape is either fully known (if input_shape has been passed to the first layer of the model) or will be fully known once the model is built (for example, with a call to model.build(input_shape)).
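A short sketch of the deferred case (assuming TF 2.x): with no input_shape given anywhere, nothing is built until model.build() or the first call provides a shape:

from tensorflow.keras import layers, models

m = models.Sequential([layers.Dense(32), layers.Dense(16)])
print(m.built)                   # False: no input shape known yet
m.build(input_shape=(None, 4))   # now the shapes are fully known
print(m.layers[0].kernel.shape)  # (4, 32)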

Upvotes: 3

snowdd1

Reputation: 156

I think Keras creates (or prepares to create) an additional Input layer, but since the second Dense layer is added via model.add(), it is automatically connected to the layer before it; the extra input layer therefore stays unconnected and is not part of the model. (I agree it would be nice if Keras hinted at unconnected layers. I have sometimes created unconnected layers when using the functional API and then changed the inputs; Keras didn't remind me that I had skipped several layers, and I just wondered why summary() was so short...)
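For illustration, here is the kind of functional-API slip described above (a hypothetical sketch): a layer is created and called, but its output is never used, so it silently drops out of the traced graph:

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(4,))
x = layers.Dense(64)(inputs)
skipped = layers.Dense(128)(x)  # never used below...
outputs = layers.Dense(10)(x)   # ...because this reads from x, not skipped

model = keras.Model(inputs, outputs)
model.summary()  # only Dense(64) and Dense(10) appear; Dense(128) is absent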

Upvotes: 1
