IVR

Reputation: 43

Wrong shape Dataset Tensorflow

I'm new to TensorFlow and I'm trying to feed some data with tf.data.Dataset. I'm using the Cityscapes dataset with 8 different classes. Here is my code:

import os
import cv2
import numpy as np
import tensorflow as tf

H = 256
W = 256
id2cat = np.array([0,0,0,0,0,0,0, 1,1,1,1, 2,2,2,2,2,2, 3,3,3,3, 4,4, 5, 6,6, 7,7,7,7,7,7,7,7,7])

def readImage(x):
    x = cv2.imread(x, cv2.IMREAD_COLOR)
    x = cv2.resize(x, (W, H))
    x = x / 255.0
    x = x.astype(np.float32)
    return x
    
def readMask(path):
    mask = cv2.imread(path, 0)
    mask = cv2.resize(mask, (W, H))
    mask = id2cat[mask]
    return mask.astype(np.int32)

def preprocess(x, y):
    def f(x, y):
        image = readImage(x)
        mask = readMask(y)
            
        return image, mask
        
    image, mask = tf.numpy_function(f, [x, y], [tf.float32, tf.int32])
    mask = tf.one_hot(mask, 3, dtype=tf.int32)
    image.set_shape([H, W, 3])
    mask.set_shape([H, W, 3])
    
    return image, mask
        

def tf_dataset(x, y, batch=8):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=5000)
    dataset = dataset.map(preprocess)
    dataset = dataset.batch(batch)
    dataset = dataset.repeat()
    dataset = dataset.prefetch(2)
    return dataset

def loadCityscape():
    trainPath = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'datasets\\Cityscape\\train')
    imagesPath = os.path.join(trainPath, 'images')
    maskPath = os.path.join(trainPath, 'masks')
    
    images = []
    masks = []
     
    print('Loading images and masks for Cityscape dataset...')
    for image in os.listdir(imagesPath):
        images.append(readImage(os.path.join(imagesPath, image)))
    for mask in os.listdir(maskPath):
        if 'label' in mask:
            masks.append(readMask(os.path.join(maskPath, mask)))
    print('Loaded {} images\n'.format(len(images)))
    
    return images, masks

images, masks = loadCityscape()

dataset = tf_dataset(images, masks, batch=8) 

print(dataset)

That last print(dataset) shows:

<PrefetchDataset shapes: ((None, 256, 256, 3), (None, 256, 256, 3)), types: (tf.float32, tf.int32)>

Why am I obtaining (None, 256, 256, 3) instead of (8, 256, 256, 3)? I also have some doubts about how to iterate over this dataset.

Thanks a lot.

Upvotes: 0

Views: 951

Answers (1)

DDomen

Reputation: 1878

TensorFlow is a graph-based numerical framework that abstracts away the vector and matrix operations you face, particularly in machine learning.

The developers decided that it would be inconvenient to specify every single time how many input samples you pass to your model during training, so they abstracted that away.

You should not care whether your model is fed a single sample or thousands, as long as the output dimensions match the input dimensions (and every internal operation agrees on its dimensions as well).

So the None size is a placeholder for a dimension whose size may change, which is usually the batch size of the input.

We need a placeholder because (None, 2) is a different shape from just (2,): in the first case we know the tensor has 2 dimensions.

Even if the None dimension is unknown when you "compile" your model, it is resolved only when strictly needed, in other words when you run it. This way your model will happily run on a batch of 64 samples as well as on a batch of 128.
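As a rough illustration of a None dimension being resolved only at run time, here is a minimal sketch. The function name row_sums and the (None, 2) signature are made up for this example and are not part of the question's code.

```python
import tensorflow as tf

# Hypothetical function: accepts any number of rows, each with 2 values.
@tf.function(input_signature=[tf.TensorSpec(shape=[None, 2], dtype=tf.float32)])
def row_sums(batch):
    # While tracing, batch.shape is the static shape (None, 2).
    # tf.shape(batch) is evaluated at run time and holds the real batch size.
    return tf.reduce_sum(batch, axis=1), tf.shape(batch)[0]

# The same traced function handles different batch sizes.
sums_3, n_3 = row_sums(tf.ones([3, 2]))
sums_5, n_5 = row_sums(tf.ones([5, 2]))
print(int(n_3), int(n_5))
```

The static shape only promises "two dimensions, the second of size 2"; the first dimension is filled in by whatever you actually pass.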

Otherwise, a (non-scalar) Tensor behaves much like a normal NumPy array:

tensor1 = tf.constant([0, 1, 2, 3])          # shape (4,)
tensor2 = tf.constant([[0], [1], [2], [3]])  # shape (4, 1)
for x in tensor1:
  print(x)  # scalar tensors holding 0, 1, 2, 3
for x in tensor2:
  print(x)  # tensors of shape (1,) holding [0], [1], [2], [3]

The main difference is that it can be allocated in any supported device memory (CPU or CUDA GPU).

Iterating over the dataset is like slicing it into chunks of (usually) constant size, where that constant is your batch size; it fills in the None dimension.

This line of code is responsible for slicing your dataset into "sub-tensors" ("sub-arrays") made up of its samples:

dataset = dataset.batch(N)
# iterating over it:
for batch in dataset: # I'm taking N samples here
  ...
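A slightly fuller sketch of the loop above, assuming a dataset of (image, mask) pairs like the one in the question. The zero-filled tensors are stand-ins for real data, and take(5) is used because a dataset that ends with repeat() never stops on its own.

```python
import tensorflow as tf

# Stand-in data (assumption): 10 fake images and 10 fake one-hot masks.
images = tf.zeros([10, 256, 256, 3], dtype=tf.float32)
masks = tf.zeros([10, 256, 256, 8], dtype=tf.int32)

dataset = (
    tf.data.Dataset.from_tensor_slices((images, masks))
    .batch(4)
    .repeat()  # infinite, as in the question's pipeline
)

seen = []
# take(5) bounds the loop; without it the repeated dataset never ends.
for image_batch, mask_batch in dataset.take(5):
    seen.append((tuple(image_batch.shape.as_list()),
                 tuple(mask_batch.shape.as_list())))

for shapes in seen:
    print(shapes)
```

Each element unpacks into one batch of images and one batch of masks, and every tensor you get inside the loop has a concrete first dimension.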

Your "runtime" shape will be (N, 256, 256, 3), but the shape reported by the dataset itself can still contain None. That is because nothing guarantees that the size of the dataset is exactly divisible by the batch size, so the last batch may hold fewer than N samples. If you need a fixed batch dimension, pass drop_remainder=True to batch(): the incomplete last batch is discarded and the reported shape becomes (N, 256, 256, 3).
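To see how the batch size interacts with the reported shape, here is a small sketch comparing batch() with and without drop_remainder=True. The 20 zero-filled samples are placeholder data chosen so that 20 is not divisible by 8.

```python
import tensorflow as tf

# Placeholder data (assumption): 20 fake images of shape (256, 256, 3).
samples = tf.zeros([20, 256, 256, 3])
ds = tf.data.Dataset.from_tensor_slices(samples)

loose = ds.batch(8)                       # last batch may be smaller
fixed = ds.batch(8, drop_remainder=True)  # incomplete last batch is dropped

loose_spec = loose.element_spec.shape.as_list()
fixed_spec = fixed.element_spec.shape.as_list()
print(loose_spec)  # [None, 256, 256, 3]
print(fixed_spec)  # [8, 256, 256, 3]

# Actual batch sizes seen while iterating.
loose_sizes = [int(b.shape[0]) for b in loose]
fixed_sizes = [int(b.shape[0]) for b in fixed]
print(loose_sizes, fixed_sizes)
```

The trade-off is that drop_remainder=True silently skips the trailing samples in each pass, which is usually acceptable when the data is shuffled before batching.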

If you are still uncomfortable with tensors, the tensor.numpy() method gives you back a NumPy array, possibly at the cost of copying it (usually to your CPU). It only works in eager execution, so it is not available at every step of the process (for example, not inside a function passed to dataset.map).

There are many ways to define a dataset in TensorFlow. I suggest reading the official guide on building input pipelines, because understanding how TensorFlow lifts your code to a higher level of abstraction will make your life easier.
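For instance, a minimal sketch of getting NumPy arrays out of a dataset while iterating eagerly. The toy dataset of the integers 0 to 9 is an assumption made for brevity.

```python
import numpy as np
import tensorflow as tf

# Toy dataset (assumption): the integers 0..9 in batches of 4.
ds = tf.data.Dataset.from_tensor_slices(tf.range(10)).batch(4)

# Option 1: convert each eager tensor yourself.
for batch in ds:
    arr = batch.numpy()
    print(type(arr).__name__, arr.tolist())

# Option 2: let the dataset yield NumPy arrays directly.
arrays = list(ds.as_numpy_iterator())
```

Both options run outside the graph, in ordinary Python code; inside a mapped function you would stay with TensorFlow ops or wrap NumPy code with tf.numpy_function, as the question already does.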

Upvotes: 1
