Code7737

Reputation: 41

Can't convert Python list to Tensorflow Dataset (InvalidArgumentError: Shapes of all inputs must match...)

I'm trying to build a neural network (following a YouTube guide, though I had to change the data input code), and I need a batched dataset for the train function to work properly (I don't know why, and I'm not even sure about that). But when I try to convert my training data list to a Dataset using tensorflow.data.Dataset.from_tensor_slices(train_data), I receive an error message:

InvalidArgumentError
{{function_node __wrapped__Pack_N_3_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [105,105,3] != values[2].shape = [1] [Op:Pack] name: 0

The train_data list consists of 560 lists, each with 3 elements inside:

<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["105x105 3-dimensional image with my face"]]], dtype=float32)>
<tf.Tensor: shape=(105, 105, 3), dtype=float32, numpy=array([[["different image with the same properties"]]], dtype=float32)>
<tf.Tensor: shape=(1,), dtype=float32, numpy=array(["1. or 0. (float), a label, showing if these pictures are actually the pictures of the same person"], dtype=float32)>

I am pretty sure that all of the shapes in the train_data list are exactly as described.

Here is some shape information obtained with the .shape attribute:

train_data.shape #"AttributeError: 'list' object has no attribute 'shape'" - main list
train_data[0].shape #"AttributeError: 'list' object has no attribute 'shape'" - sublist, with 3 elements
train_data[0][0].shape #"TensorShape([105, 105, 3])" - first image
train_data[0][0][0].shape #"TensorShape([105, 3])" - first row of image pixels, ig
train_data[0][0][0][0].shape #"TensorShape([3])" - pixel in the left upper corner

Here's what I tried: the label of each image pair (1. or 0.) was previously just an integer. Then I received an error saying that everything should be of the same type, float32. I then converted the label to a tensor, but that changed nothing except the last part of the current error message, which used to say "values[2].shape = []".

I really have no idea what could cause this error. I have no experience with TensorFlow.

Sorry if my English is bad.

Edit: here is the code that loads the images from a directory. It may cause eye bleeding.

for i in os.listdir("t"):
    for ii in os.listdir(os.path.join("t", i)):
        td.append([
                   [
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\" + os.listdir(os.path.join("t", i, ii))[0])) / 255, 0), 
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\2.jpeg")) / 255, 0)],
                    tensorflow.convert_to_tensor(
                     float(
                      os.listdir(os.path.join("t", i, ii))[0][0]
                     )
                    )
                  ])

I added some spaces in order to make it a bit more readable. td = train_data. Yea, I could've messed something up there.

Edit 2: Answering Mohammad's question, here are the output shapes from the code they gave me:

td.shape #AttributeError: 'list' object has no attribute 'shape' - main list
td[0].shape #AttributeError: 'list' object has no attribute 'shape' - sublist, with a list and a label
td[0][0].shape #AttributeError: 'list' object has no attribute 'shape' - subsublist, with 2 images
td[0][1].shape #TensorShape([]) - label
td[0][0][0].shape #TensorShape([1, 105, 105, 3]) - first image
td[0][0][1].shape #TensorShape([1, 105, 105, 3]) - second image

It can be shown as:

train_data = [  [[x1, x2], y],  [[x1, x2], y], ... ]

Upvotes: 1

Views: 369

Answers (2)

Mohammad Ahmed

Reputation: 1634

Your data is in this shape right now...

import tensorflow as tf

x1 = tf.random.normal((105, 105, 3))
x2 = tf.random.normal((105, 105, 3))
y = tf.random.normal((1,))

train_list = [[[x1, x2], y], [[x1, x2], y], [[x1, x2], y], [[x1, x2], y]]

# Split the nested list into three parallel lists.
# Indexing with [1] (not the slice [1:]) keeps each label at shape (1,).
x1 = [sample[0][0] for sample in train_list]
x2 = [sample[0][1] for sample in train_list]
y = [sample[1] for sample in train_list]

tf.data.Dataset.from_tensor_slices(((x1, x2), y))
<TensorSliceDataset element_spec=((TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None), TensorSpec(shape=(105, 105, 3), dtype=tf.float32, name=None)), TensorSpec(shape=(1,), dtype=tf.float32, name=None))>
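
The same split can also be written with zip, which transposes the list of samples into parallel sequences. A short sketch, assuming each sample has the [[x1, x2], y] layout from the question; the names samples, pairs, and labels are made up for illustration:

```python
import tensorflow as tf

# Four dummy samples in the [[x1, x2], y] layout.
samples = [
    [[tf.zeros((105, 105, 3)), tf.ones((105, 105, 3))], tf.constant([1.0])]
    for _ in range(4)
]

# Transpose: all image pairs together, then all labels together.
pairs, labels = zip(*samples)
first_images, second_images = zip(*pairs)

# Tuples mark structure; each component is stacked on its own.
ds = tf.data.Dataset.from_tensor_slices(
    ((tf.stack(first_images), tf.stack(second_images)), tf.stack(labels))
)
print(ds.element_spec)
```

Stacking each component explicitly makes it obvious which tensors must share a shape: the first images with each other, the second images with each other, and the labels with each other.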

Or change the code where you load the images and labels from disk, which will save time:

x1 = []
x2 = []
y = []
for i in os.listdir("t"):
    for ii in os.listdir(os.path.join("t", i)):
        x1.append(
                    tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\" + os.listdir(os.path.join("t", i, ii))[0])) / 255, 0))
        x2.append(tensorflow.expand_dims(
                     tensorflow.io.decode_jpeg(
                      tensorflow.io.read_file(os.path.join("t", i, ii) + "\\2.jpeg")) / 255, 0)
                 )
        y.append(tensorflow.convert_to_tensor(
                     float(
                      os.listdir(os.path.join("t", i, ii))[0][0]
                     )
                    ))
tensorflow.data.Dataset.from_tensor_slices(((x1, x2), y))
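
Loading every image into a Python list up front keeps all of them in memory. An alternative sketch, not taken from the answer above, builds the dataset from file paths and decodes inside Dataset.map. It assumes JPEG files and that the labels are already available as a list; load_image, load_pair, and the dummy files written to a temporary directory are hypothetical stand-ins for the asker's "t" folder:

```python
import os
import tempfile

import tensorflow as tf


def load_image(path):
    """Read one JPEG, force 3 channels, resize to 105x105, scale to [0, 1]."""
    image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.resize(image, (105, 105))  # returns float32
    return image / 255.0


def load_pair(paths, label):
    """Map two file paths plus a label to ((image, image), label)."""
    path_a, path_b = paths
    return (load_image(path_a), load_image(path_b)), label


# Demo input (assumption): two small dummy JPEGs so the sketch runs anywhere.
demo_dir = tempfile.mkdtemp()
demo_paths = []
for name in ("a.jpeg", "b.jpeg"):
    path = os.path.join(demo_dir, name)  # os.path.join avoids a hard-coded "\\"
    pixels = tf.zeros((120, 120, 3), dtype=tf.uint8)
    tf.io.write_file(path, tf.io.encode_jpeg(pixels))
    demo_paths.append(path)

paths_a = [demo_paths[0], demo_paths[1]]
paths_b = [demo_paths[1], demo_paths[0]]
labels = [[1.0], [0.0]]

ds = tf.data.Dataset.from_tensor_slices(((paths_a, paths_b), labels))
ds = ds.map(load_pair, num_parallel_calls=tf.data.AUTOTUNE)
```

Here the images are decoded only when the dataset is iterated, and the resize step guarantees that every element has the same shape regardless of the source file.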

Upvotes: 0

Vijay Mariappan

Reputation: 17191

Replicating the problem:

x1 = tf.random.normal((105,105,3))
x2 = tf.random.normal((105,105,3))
y = tf.random.normal((1,))

array_list = [[x1, x2, y]] * 560
tf.data.Dataset.from_tensor_slices(array_list)
#InvalidArgumentError ... values[0].shape = [105,105,3] != values[2].shape = [1]
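
The error comes from how tf.data interprets Python containers: tuples are treated as structure, while lists are converted into a single tensor, so every item in a list must have the same shape. The Pack op named in the message is the op behind tf.stack. A minimal sketch that triggers the same kind of failure directly, assuming TensorFlow 2.x in eager mode (the variable names are illustrative):

```python
import tensorflow as tf

x1 = tf.zeros((105, 105, 3))
x2 = tf.zeros((105, 105, 3))
y = tf.zeros((1,))

# Two same-shaped images stack without trouble.
images = tf.stack([x1, x2])
print(images.shape)

# Adding the label to the same list cannot work: the shapes differ.
try:
    tf.stack([x1, x2, y])
    stack_failed = False
except (tf.errors.InvalidArgumentError, ValueError) as err:
    stack_failed = True
    print(type(err).__name__)
```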

Fix:

#flatten to a single list
flatten_list = sum(array_list, [])

#Separate the two image inputs and the labels
#(every third item starting at 0 is x1, at 1 is x2, at 2 is y)
X1 = tf.stack(flatten_list[0::3])
X2 = tf.stack(flatten_list[1::3])
y = tf.squeeze(tf.stack(flatten_list[2::3]))

#construct dataset iterator
ds = tf.data.Dataset.from_tensor_slices(((X1, X2), y))
for data in ds.take(1):
    print(data)
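
Since the question asks for a batched dataset, a dataset like this can then be shuffled and batched. A minimal sketch with dummy data that keeps both images of each pair; the batch size of 16 and the sample count of 64 are arbitrary choices for illustration, not values from the question:

```python
import tensorflow as tf

num_samples = 64
first_images = tf.zeros((num_samples, 105, 105, 3))
second_images = tf.zeros((num_samples, 105, 105, 3))
labels = tf.zeros((num_samples,))

ds = tf.data.Dataset.from_tensor_slices(((first_images, second_images), labels))
batched = ds.shuffle(buffer_size=num_samples).batch(16)

# Each batch gains a leading batch dimension on every component.
for (batch_a, batch_b), batch_y in batched.take(1):
    print(batch_a.shape, batch_b.shape, batch_y.shape)
```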

Upvotes: 1
