Reputation: 21
Let's consider a simple CNN, trained on 1000 images of fixed input_shape = (28,28,1)
Surprisingly, it will happily predict on images of a different shape, such as (28,30,1).
Shouldn't it fail instead of silently predicting? If not, why? (Reproducible code below.)
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models
X_train = tf.ones((1000,28,28,1))
y_train = tf.ones((1000,10))
model = models.Sequential()
model.add(layers.Conv2D(8, (4,4), input_shape=(28, 28, 1), activation='relu', padding='same'))
model.add(layers.MaxPool2D(pool_size=(2,2)))
model.add(layers.Conv2D(16, (3,3), activation='relu', padding='same'))
model.add(layers.MaxPool2D(pool_size=(2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(10, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X_train,y_train)
model.layers[0]._build_input_shape # >>>TensorShape([None, 28, 28, 1])
Once fitted, it can also (surprisingly) predict on images of the wrong shape. Wouldn't it be better for it to fail?
X_wrong = tf.ones((1000,28,30,1))
model.predict(X_wrong).shape # works and returns (1000, 10)
model.layers[0](X_wrong).shape # also works: (1000, 28, 30, 8)
I understand that prediction does work in principle, because the number of weights still matches thanks to max pooling:
x = X_wrong
for layer in model.layers[:6]:  # apply each layer up to the first Dense
    x = layer(x)
    print(x.shape)
>>>(1000, 28, 30, 8)
>>>(1000, 14, 15, 8)
>>>(1000, 14, 15, 16)
>>>(1000, 7, 7, 16)
>>>(1000, 784)
>>>(1000, 10)
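To make the arithmetic explicit, here is a small pure-Python sketch (no TensorFlow needed, and `flatten_size` is just a hypothetical helper) of how the spatial shape propagates: a 'same' convolution keeps height and width, and a 2x2 max pool with the default 'valid' padding floor-divides them by 2. Both 28x28 and 28x30 end up at 7x7x16 = 784 after the second pool, while 28x32 does not:

```python
def flatten_size(h, w, channels=16):
    # 'same' convs keep h and w; each 2x2 pool ('valid' padding) floor-divides by 2
    h, w = h // 2, w // 2   # first MaxPool2D
    h, w = h // 2, w // 2   # second MaxPool2D
    return h * w * channels

print(flatten_size(28, 28))  # 784
print(flatten_size(28, 30))  # 784 -> same size, so the Dense layer accepts it
print(flatten_size(28, 32))  # 896 -> mismatch, the Dense layer raises
```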
Increasing the width a bit more does fail, as expected, because 896 != 784:
X_wrong_fail = tf.ones((1000,28,32,1))  # running the layers one by one as above gives:
>>>(1000, 28, 32, 8)
>>>(1000, 14, 16, 8)
>>>(1000, 14, 16, 16)
>>>(1000, 7, 8, 16)
>>>(1000, 896)
>>> ValueError: Input 0 of layer dense_2 is incompatible with the layer: expected axis -1 of input shape to have value 784 but received input with shape (1000, 896)
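If you want the failure in both cases, one workaround is to check the incoming batch shape yourself before calling predict. The helper below is hypothetical (Keras does not provide this strict check out of the box):

```python
def checked_predict(model, x, expected=(28, 28, 1)):
    # Hypothetical guard: raise on ANY per-sample shape mismatch, instead of
    # letting 'same' padding plus pooling silently absorb small differences.
    got = tuple(x.shape[1:])
    if got != expected:
        raise ValueError(f"expected per-sample shape {expected}, got {got}")
    return model.predict(x)
```

With this wrapper, `checked_predict(model, X_wrong)` raises a ValueError even though `model.predict(X_wrong)` would silently succeed.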
But it would have been better to fail in both cases imo. Don't you agree?
Upvotes: 2
Views: 114
Reputation: 21
I fully agree that there should at least be a warning. Sorry, I didn't fully read your post and had already written up the weight-matching explanation; I'll still post it in case someone wants more insight.
Edit: Did you consider changing the padding to "valid"? I believe that would prevent your problem.
Edit2: Apparently it doesn't.
Edit3: Setting padding="same" on both the max pooling and conv layers resolves your problem.
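For reference, the reason the "same"-padded pooling from Edit3 helps: with padding='same', a 2x2 MaxPool2D ceil-divides odd dimensions instead of floor-dividing them, so a 28x30 input no longer collapses to the same flattened size as 28x28. A quick pure-Python sketch of that arithmetic (a hypothetical helper, assuming 2x2 pools and 'same' convs throughout):

```python
import math

def flatten_size_same_pool(h, w, channels=16):
    # with padding='same', a 2x2 pool ceil-divides odd dims instead of flooring
    h, w = math.ceil(h / 2), math.ceil(w / 2)  # first MaxPool2D
    h, w = math.ceil(h / 2), math.ceil(w / 2)  # second MaxPool2D
    return h * w * channels

print(flatten_size_same_pool(28, 28))  # 784
print(flatten_size_same_pool(28, 30))  # 896 -> now the Dense layer fails, as desired
```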
You encounter this behaviour because of how convolution kernels and max pooling layers work:
1. Convolutional layers don't care about the spatial shape: their weights are the filters, and the number of weights depends only on the kernel size and the number of channels, so the same filters can be slid over any height and width.
2. Max pooling layers don't care about the spatial shape either; they have no weights and just reduce the dimensions.
The Dense layer has 10 neurons, and in your example each neuron has 784 weights. So as long as Flatten outputs 784 values, the network is fine and can process the input. You happened to find that 28x28 and 30x30 produce the same output shape after Flatten; 28x32, however, does not: there are not enough weights on the neurons, and the Dense layer crashes.
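The weight counts can be checked by hand. This little sketch (hypothetical helper names) reproduces the Param # column of the summaries below: kernel weights plus biases for a conv layer, input weights plus biases for a dense layer:

```python
def conv_params(kh, kw, in_ch, filters):
    # one kh x kw x in_ch kernel per filter, plus one bias per filter
    return kh * kw * in_ch * filters + filters

def dense_params(inputs, units):
    # one weight per input per unit, plus one bias per unit
    return inputs * units + units

print(conv_params(4, 4, 1, 8))   # 136
print(conv_params(3, 3, 8, 16))  # 1168
print(dense_params(784, 10))     # 7850 (flatten size for 28x28 or 28x30)
print(dense_params(896, 10))     # 8970 (flatten size for 28x32)
print(dense_params(10, 10))      # 110
```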
As a fellow ML dev, I really recommend print(model.summary()) when in doubt about shapes :)
Model for input (28, 28, 1):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 28, 8) 136
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 8) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 16) 1168
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 16) 0
_________________________________________________________________
flatten (Flatten) (None, 784) 0 <--- the shape here matters
_________________________________________________________________
dense (Dense) (None, 10) 7850
_________________________________________________________________
dense_1 (Dense) (None, 10) 110
=================================================================
Total params: 9,264
Trainable params: 9,264
Non-trainable params: 0
_________________________________________________________________
Model for input (30, 30, 1):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 30, 30, 8) 136
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 15, 15, 8) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 15, 15, 16) 1168
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 16) 0
_________________________________________________________________
flatten (Flatten) (None, 784) 0 <--- the shape here matters
_________________________________________________________________
dense (Dense) (None, 10) 7850
_________________________________________________________________
dense_1 (Dense) (None, 10) 110
=================================================================
Total params: 9,264
Trainable params: 9,264
Non-trainable params: 0
_________________________________________________________________
Model for input (28, 32, 1):
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 32, 8) 136
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 16, 8) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 16, 16) 1168
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 8, 16) 0
_________________________________________________________________
flatten (Flatten) (None, 896) 0 <--- different shape now
_________________________________________________________________
dense (Dense) (None, 10) 8970
_________________________________________________________________
dense_1 (Dense) (None, 10) 110
=================================================================
Total params: 10,384
Trainable params: 10,384
Non-trainable params: 0
_________________________________________________________________
Upvotes: 2