Henrik

Reputation: 89

Numpy array loses a dimension converting to tf dataset

I am trying to convert numpy arrays into a tf.data.Dataset by using the following code:

train_dataset = tf.data.Dataset.from_tensor_slices((traininput, train[:, :, :, 1:4]))

However, my dataset is now missing its first dimension. The numpy arrays both have the shape (1000, 128, 128, 3), but the resulting dataset elements have the shape (128, 128, 3). This then leads to an error when I try to train my model: Error when checking input: expected input_2 to have 4 dimensions, but got array with shape (128, 128, 3). I have tried to follow the TensorFlow tutorial on loading numpy data. Why is this happening, and how can I fix it?
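Printing the dataset confirms that each element has the per-example shape. A minimal check, using random stand-in arrays with the same shapes as my real data (assumes eager execution, TF 2.x):

import numpy as np
import tensorflow as tf

# Stand-ins with the same shapes as my real arrays
traininput = np.random.rand(1000, 128, 128, 3)
train = np.random.rand(1000, 128, 128, 5)

ds = tf.data.Dataset.from_tensor_slices((traininput, train[:, :, :, 1:4]))
print(ds)
# shapes: ((128, 128, 3), (128, 128, 3)); the leading 1000 is gone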

As suggested, I am providing an MCVE below:

import tensorflow as tf
import numpy as np
inp = np.random.rand(100, 128, 128, 3)
out = np.random.rand(100, 126, 126, 3)
model = tf.keras.Sequential(
    [
        tf.keras.layers.InputLayer(input_shape=(128, 128, 3)),
        tf.keras.layers.Conv2D(
              filters=32, kernel_size=3, strides=(2, 2), activation='relu'),
        tf.keras.layers.Conv2DTranspose(
                      filters=3,
                      kernel_size=3,
                      strides=(2, 2),
                      padding="SAME",
                      activation='relu'),
    ]
)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
train_dataset = tf.data.Dataset.from_tensor_slices((inp, out))
model_history = model.fit(train_dataset, epochs=10)

It terminates with: Error when checking input: expected input_1 to have 4 dimensions, but got array with shape (128, 128, 3)

Upvotes: 1

Views: 676

Answers (2)

404pio

Reputation: 1032

I assume that you have 100 images of size 128x128, which are 3-channel (RGB), and that your network cannot take all of your images at once. It should get one image per step. So you have 2 options:

  1. Use a for loop to iterate over your dataset, getting one input image and one output image at a time (see the sketch after this list).
  2. Use batches. Tell your network that it will receive batches: tf.keras.layers.InputLayer(input_shape=(None, 128, 128, 3)) - see Tensorflow batch size in input placeholder
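A minimal sketch of option 1, assuming the model and train_dataset from the question and eager execution (TF 2.x); expand_dims adds the batch dimension the model expects:

import tensorflow as tf

for image_in, image_out in train_dataset:
    # Each element is one example of shape (128, 128, 3);
    # add a leading batch dimension so it becomes (1, 128, 128, 3)
    batch_in = tf.expand_dims(image_in, axis=0)
    batch_out = tf.expand_dims(image_out, axis=0)
    model.train_on_batch(batch_in, batch_out)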

Upvotes: 0

pcarter

Reputation: 1618

from_tensor_slices slices the arrays along their first dimension, so the dataset yields individual examples. You need to set a batch size on the dataset so that it returns batches of examples instead of single ones; this also restores the 4th (batch) dimension that the model expects.

import tensorflow as tf
import numpy as np
inp = np.random.rand(100, 128, 128, 3)
# *** Had to set the last dim below to 1 to avoid another error with the accuracy
out = np.random.rand(100, 126, 126, 1)
model = tf.keras.Sequential(
    [
        tf.keras.layers.InputLayer(input_shape=(128, 128, 3)),
        tf.keras.layers.Conv2D(
              filters=32, kernel_size=3, strides=(2, 2), activation='relu'),
        tf.keras.layers.Conv2DTranspose(
                      filters=3,
                      kernel_size=3,
                      strides=(2, 2),
                      padding="SAME",
                      activation='relu'),
    ]
)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
train_dataset = tf.data.Dataset.from_tensor_slices((inp, out))
# *** Setting batch size of 10 below
train_dataset = train_dataset.batch(10)
model_history = model.fit(train_dataset, epochs=10)

Note: I had to change the last dimension of the out tensor to avoid a different error, since sparse_categorical_crossentropy and its accuracy metric expect integer class labels with a final dimension of 1:

ValueError: Can not squeeze dim[3], expected a dimension of 1, got 3 for 'metrics/accuracy/Squeeze' (op: 'Squeeze') with input shapes: [?,126,126,3]
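Alternatively, if you want to keep the 3-channel out tensor from the question, you could treat the task as regression and use a loss that accepts continuous multi-channel targets, e.g.:

# Alternative: keep out = np.random.rand(100, 126, 126, 3) and use a
# regression loss; 'mse' does not require single-channel integer labels.
model.compile(optimizer='adam', loss='mse', metrics=['mae'])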

Upvotes: 2
