MiniQuark

Reputation: 48456

Why does the Keras API require the input shape in the first layer, since it actually works well without it?

I am using tf.keras from TensorFlow 1.9.0. It seems that everything works fine without specifying the input_shape in the first layer when building a Sequential model:

import tensorflow as tf
from tensorflow import keras
import numpy as np

X_train = np.random.randn(1000, 10)
y_train = np.random.randn(1000)

model = keras.models.Sequential([
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])
optimizer = tf.train.MomentumOptimizer(0.01, 0.9)
model.compile(loss="mse", optimizer=optimizer)
model.fit(X_train, y_train, epochs=10)

This makes sense to me because Keras can easily get the input shape from the inputs (X_train). I also tried using PyPI Keras, and it works fine too (with the TensorFlow 1.9.0 backend).

So I have two questions:

  1. Why does the official Keras API require the input shape to be specified in the first layer? Is it because it is necessary for backends other than TensorFlow?
  2. Is this behavior officially supported by tf.keras, even though it is not officially supported by the Keras API? In other words, if I do not care about other backends, is it legal to omit the input shape, or is there a risk that it will break in future versions?

Thanks!

Upvotes: 4

Views: 1811

Answers (1)

P-Gn

Reputation: 24591

I think the choice of words in the guide (emphasis mine),

The model needs to know what input shape it should expect.

may be a bit unfortunate and, being a guide, should not be taken as a specification.

Keras can accept input_shapes that are partially (None, n) or completely (None, None) unknown. The latter is actually needed to create fully convolutional networks (FCNs) that scale with the size of the input.

So if you know that you don't know the input shape, I would suggest explicitly providing such a partially or completely undefined input_shape. This makes the code self-documenting: it states that the input shape is unknown, rather than leaving the reader wondering whether the argument was forgotten.
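
As a rough illustration of that suggestion, here is a minimal sketch with two hypothetical models (the layer sizes and the names dense_model and fcn_model are made up for the example). It assumes the shapes quoted above include the batch dimension; in Keras, the input_shape argument itself excludes the batch dimension, which is always left as None. Newer Keras releases may warn that an Input object is preferred over input_shape on a layer, but the idea is the same.

```python
import numpy as np
from tensorflow import keras

# Partially unknown input: batch size is unknown, feature count is fixed at 10.
# input_shape=(10,) corresponds to a full shape of (None, 10).
dense_model = keras.models.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(10,)),
    keras.layers.Dense(1),
])

# Unknown spatial size: height and width are explicitly declared as None,
# only the channel count (3) is fixed. This is what lets a fully
# convolutional model accept images of any size.
fcn_model = keras.models.Sequential([
    keras.layers.Conv2D(8, 3, padding="same", activation="relu",
                        input_shape=(None, None, 3)),
    keras.layers.Conv2D(1, 1),
])

# The same convolutional model handles two different image sizes.
small = fcn_model.predict(np.zeros((2, 16, 16, 3), dtype="float32"))
large = fcn_model.predict(np.zeros((2, 32, 48, 3), dtype="float32"))
print(small.shape, large.shape)  # (2, 16, 16, 1) (2, 32, 48, 1)

# The dense model accepts any batch size, but always 10 features.
dense_out = dense_model.predict(np.zeros((4, 10), dtype="float32"))
print(dense_out.shape)  # (4, 1)
```

Writing the None entries out makes it clear to a reader that the variable dimensions are intentional, which is the self-documenting effect described above.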

Upvotes: 4
