questionmark

Reputation: 345

Dimension/Shape Error when running model.fit

I am trying to use TensorFlow and Keras to build a prediction model.

I first read in my dataset that has shape (7709, 58), then normalize it:

normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(np.array(dataset))

Then I split the data into training and testing data:

train_dataset = dataset[:5000]
test_dataset = dataset[5000:]

I prepare those datasets:

train_dataset.describe().transpose()
test_dataset.describe().transpose()

train_features = train_dataset.copy()
test_features = test_dataset.copy()

train_labels = train_features.pop('outcome')
test_labels = test_features.pop('outcome')

Then I build the model:

def build_and_compile_model(norm):
  model = keras.Sequential([
      norm,
      layers.Dense(64, activation='relu'),
      layers.Dense(64, activation='relu'),
      layers.Dense(1)
  ])

  model.compile(loss='mean_squared_error', metrics=['mean_squared_error'],
                optimizer=tf.keras.optimizers.Adam(0.001))
  return model

dnn_model = build_and_compile_model(normalizer)

When I then try to fit the model, it fails:

history = dnn_model.fit(
    test_features,
    test_labels, 
    validation_split=0.2, epochs=50)

This gives the following error:

ValueError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 859, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None

    ValueError: Exception encountered when calling layer "normalization_7" (type Normalization).
    
    Dimensions must be equal, but are 57 and 58 for '{{node sequential_7/normalization_7/sub}} = Sub[T=DT_FLOAT](sequential_7/Cast, sequential_7/normalization_7/sub/y)' with input shapes: [?,57], [1,58].

What is the issue and how can I address it?

Upvotes: -1

Views: 295

Answers (2)

PlzBePython

Reputation: 406

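bui is correct that pop is the problem: it removes the outcome column, so the features have 57 columns while the normalizer was adapted on all 58. However, I would keep pop and call normalizer.adapt after it, on the features only. That way you don't fit the normalizer to your labels (which does not make sense) and you don't use the labels as a feature (which could be terrible, since the model would see the target as an input).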
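A minimal sketch of that ordering, assuming a synthetic 58-column DataFrame in place of the real data (the column names f0 to f56, the 200-row size, and the 150/50 split are made up for illustration). It adapts the normalizer only after pop, and on the training features:

```python
import numpy as np
import pandas as pd
import tensorflow as tf

# Synthetic stand-in for the real data: 57 features plus an 'outcome' column.
rng = np.random.default_rng(0)
columns = [f"f{i}" for i in range(57)] + ["outcome"]
dataset = pd.DataFrame(rng.normal(size=(200, 58)), columns=columns)

train_features = dataset[:150].copy()
test_features = dataset[150:].copy()

# pop removes 'outcome' in place, leaving 57 feature columns.
train_labels = train_features.pop("outcome")
test_labels = test_features.pop("outcome")

# Adapt after pop, so the stored mean/variance have 57 entries, not 58.
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(np.array(train_features))

model = tf.keras.Sequential([
    normalizer,
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=tf.keras.optimizers.Adam(0.001))

# Fit on the training split; one epoch is enough to show the shapes agree.
history = model.fit(np.array(train_features), np.array(train_labels),
                    validation_split=0.2, epochs=1, verbose=0)
predictions = model.predict(np.array(test_features), verbose=0)
```

Adapting on the training split alone also keeps test-set statistics out of the normalization, which is a design choice rather than something the error requires.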

Upvotes: 0

bui

Reputation: 1651

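pop removes the outcome column from the DataFrame in place, so train_features and test_features end up with 57 columns while the normalizer was adapted on 58. Try extracting that column by indexing instead, which leaves the DataFrame unchanged: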

train_labels = train_features['outcome']
test_labels = test_features['outcome']
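To see the difference between the two ways of extracting the column, here is a small sketch on a toy DataFrame (the columns a and b and their values are invented for illustration):

```python
import pandas as pd

df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "outcome": [0.0, 1.0]})

# pop returns the column and deletes it from the DataFrame in place.
popped = df.copy()
labels_pop = popped.pop("outcome")

# Indexing returns the column and leaves the DataFrame untouched.
indexed = df.copy()
labels_idx = indexed["outcome"]

print(popped.shape)   # (2, 2)
print(indexed.shape)  # (2, 3)
```

Note that with indexing the outcome column stays among the features, so the column count matches the normalizer but the model also receives the label as an input.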

Upvotes: 1
