Tensorflow Keras output layer shape weird error

Question

I am fairly new to TF, Keras and ML in general. I am trying to implement a very simple MLP with an input shape of (batch_size,3,2) and an output shape of (batch_size,3), that is (if I got it right): for every 3x2 feature, there is a corresponding 3 value array label.

Here is how I create the model:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=(3,2)),
    tf.keras.layers.Dense(3)
])

and these are the X and y shapes:

X_train.shape,y_train.shape

TensorShape([64,3,2]),TensorShape([64,3])

On model.fit I am facing a weird error I cannot understand:

ValueError: Dimensions must be equal, but are 3 and 32 for ... with input shapes: [32,3,3] and [32,3]

I have no clue what's going on. I understand the batch size is 32, but where does that [32,3,3] come from?

Moreover, if I reduce the number of samples in X_train and y_train from the original 64, so that their shapes are, say, (19,3,2) and (19,3), I get the following error instead:

InvalidArgumentError: required broadcastable shapes at loc(unknown)

What's even weirder to me is that if I specify a single unit for the output (last) layer instead of 3, like this:

model = tf.keras.Sequential([
    tf.keras.layers.Dense(50,tf.keras.activations.relu,input_shape=(3,2)),
    tf.keras.layers.Dense(1)
])

model.fit works, but the predictions have shape (1,3,1) instead of my expected (3,).

I am very confused.

Kaveh · Accepted Answer

Whenever you are unsure how the data flows through your model, call model.summary() to see what happens to the shape of the data in each layer.
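For the model in the question, that check might look like the sketch below. It is only an illustration: it uses tf.keras.Input in place of the input_shape argument (my substitution, so the snippet runs on both older and newer Keras versions) and otherwise keeps the same two Dense layers.

```python
import tensorflow as tf

# Same two Dense layers as in the question; tf.keras.Input replaces
# the input_shape argument (an assumption made for portability).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(3, 2)),
    tf.keras.layers.Dense(50, activation="relu"),
    tf.keras.layers.Dense(3),
])

model.summary()            # prints the output shape of every layer
print(model.output_shape)  # (None, 3, 3)
```

The final shape is (None, 3, 3), not (None, 3), which is already enough to see that the predictions cannot line up with labels of shape (batch_size, 3).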

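In this case, each input sample is a 2D array (3x2) and each label is a 1D array (3 values), but the model contains only Dense layers. A Dense layer acts only on the last axis of its input and leaves every other axis untouched, so Dense(50) turns (batch, 3, 2) into (batch, 3, 50) and Dense(3) turns that into (batch, 3, 3). That is where the [32,3,3] in your error comes from: the loss is trying to compare predictions of shape [32,3,3] with labels of shape [32,3]. The same applies to an image as input: you would not normally feed it directly to a Dense layer. Instead, use other layers such as Conv2D, or Flatten your input (make it 1D) before feeding it to the Dense layer. Otherwise the extra dimension carries through to the output.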
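To see where the extra dimension comes from without Keras in the way, here is a rough NumPy sketch of the arithmetic a stack of Dense layers performs. The weights are random placeholders and the variable names are mine; only the shapes matter.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(64, 3, 2))        # same shape as X_train in the question

# A Dense layer computes x @ W + b, contracting only the last axis.
w1, b1 = rng.normal(size=(2, 50)), np.zeros(50)
w2, b2 = rng.normal(size=(50, 3)), np.zeros(3)

h = np.maximum(x @ w1 + b1, 0.0)       # like Dense(50, relu)
out = h @ w2 + b2                      # like Dense(3)
print(h.shape, out.shape)              # (64, 3, 50) (64, 3, 3)

# Flattening first gives each sample a single feature axis of length 6.
flat = x.reshape(len(x), -1)           # (64, 6)
w1f = rng.normal(size=(6, 50))         # first-layer weights now take 6 inputs
out_flat = np.maximum(flat @ w1f + b1, 0.0) @ w2 + b2
print(out_flat.shape)                  # (64, 3)
```

The middle axis of length 3 is never touched by the matrix products, so it survives to the output unless something reshapes the data first.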

Takeaway: if the number of dimensions of your input and your output differ, the shape has to change somewhere inside your model. The most common ways to do that are a Flatten layer, a GlobalAveragePooling layer, and so on.
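Applied to the model in the question, the Flatten option could look like the sketch below. The random training data, the adam optimizer and the mse loss are assumptions made only so the snippet runs end to end; the question does not say which loss or optimizer was used.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3, 2)),
    tf.keras.layers.Flatten(),                     # (batch, 3, 2) -> (batch, 6)
    tf.keras.layers.Dense(50, activation="relu"),  # -> (batch, 50)
    tf.keras.layers.Dense(3),                      # -> (batch, 3)
])
model.compile(optimizer="adam", loss="mse")  # assumed optimizer and loss

# Placeholder data with the shapes from the question.
X_train = np.random.rand(64, 3, 2).astype("float32")
y_train = np.random.rand(64, 3).astype("float32")

model.fit(X_train, y_train, epochs=1, verbose=0)
pred = model.predict(X_train[:1], verbose=0)
print(pred.shape)     # (1, 3)
print(pred[0].shape)  # (3,)
```

Note that predict always keeps the batch axis, so a single sample comes back with shape (1, 3); indexing with pred[0] gives the (3,) array the question expects.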
