Jihoon Seo

Reputation: 35

Why is the tensor shape different when I use tf.print?

I made a simple dataset like the one below.

x_data = [[0, 0],
          [0, 1],
          [1, 0],
          [1, 1]]
y_data = [[0],
          [1],
          [1],
          [0]]

Then I sliced it using from_tensor_slices (I don't know the exact role of this function):

dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

When I print the dataset with the print function, it shows this:

<BatchDataset shapes: ((None, 2), (None, 1)), types: (tf.int32, tf.int32)>

And when I print it in a for loop, it shows this:

tf.Tensor(
[[0 0]
 [0 1]
 [1 0]
 [1 1]], shape=(4, 2), dtype=int32) 
tf.Tensor(
[[0]
 [1]
 [1]
 [0]], shape=(4, 1), dtype=int32)

Here is my question:

As I understand it, the tensor shapes should be (4, 2) and (4, 1), because each matrix has 4 rows.

Why does print show (None, 2) and (None, 1) instead?

And how can I print the values of the tensors without a for loop?

Upvotes: 1

Views: 60

Answers (1)

Kaveh

Reputation: 4990

1- What is from_tensor_slices?

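  • from_tensor_slices creates a TensorFlow dataset from your input tensors. It slices them along their first dimension, so each row of x_data paired with the matching row of y_data becomes one element of the dataset.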
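To make the slicing concrete, here is a minimal sketch using the data from the question. It assumes TensorFlow 2.x with eager execution; the names unbatched and elements are only illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

# from_tensor_slices cuts both inputs along their first axis,
# so every dataset element is one (x row, y row) pair.
unbatched = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Convert each element to plain Python lists so it is easy to inspect.
elements = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in unbatched]
for x, y in elements:
    print(x, y)
```

Without .batch(), the dataset yields four separate elements with shapes (2,) and (1,). Calling .batch(4) afterwards stacks them back into one (4, 2) and one (4, 1) tensor.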

2- What are the benefits of using a TensorFlow dataset?

  • It makes the common dataset operations very easy. For example, you can shuffle it, batch it, preprocess the data with map, and feed it directly to your model with model.fit(dataset).
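As a rough illustration of those operations chained together, the sketch below shuffles, casts and batches the data from the question. It assumes TensorFlow 2.1 or later (for as_numpy_iterator); the batch size of 2, the seed and the float cast are arbitrary choices for the example, not something the question requires.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

pipeline = (
    tf.data.Dataset.from_tensor_slices((x_data, y_data))
    # Shuffle with a buffer as large as the dataset (a full shuffle).
    .shuffle(buffer_size=len(x_data), seed=0)
    # Example preprocessing step: cast both tensors to float32.
    .map(lambda x, y: (tf.cast(x, tf.float32), tf.cast(y, tf.float32)))
    # Group the elements into batches of 2 (illustrative batch size).
    .batch(2)
)

# Materialize the batches as NumPy arrays to inspect them.
batches = list(pipeline.as_numpy_iterator())
for x_batch, y_batch in batches:
    print(x_batch.shape, y_batch.shape, x_batch.dtype)
```

The same pipeline object could be passed to model.fit(pipeline) for a model whose input shape matches.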

3- Why does the print function show BatchDataset and not the values?

  • The dataset variable is an object of the BatchDataset class (because you defined it as dataset = from_tensor_slices((x, y)).batch(bs)). It is not a Python list, an eager tensor or a NumPy array, so print shows a description of the object rather than its values.

4- What can I do to see the values stored in a tf dataset?

  • You can access its values with the take() method of this class:
one_batch = dataset.take(1)  # takes 1 batch of data from the dataset

# each batch is a tuple (like what you passed to from_tensor_slices)
# you passed x and y, so it returns a batch of x and a batch of y
for x, y in one_batch:
    print(x.shape)  # (4, 2): (batch_size, num_features)
    print(y.shape)  # (4, 1): (batch_size, labels_dim)

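5- What are (None, 2) and (None, 1) in the BatchDataset object?

  • They are the static shapes of x, (None, 2), and of y, (None, 1). None means that the first dimension (the number of samples in a batch) is not fixed ahead of time and can be anything, while the second dimension is always 2. The batch size is left open because batch() may produce a smaller final batch when the number of samples is not a multiple of the batch size. The same rule applies to y. The tensors you actually get when iterating have concrete shapes, which is why the for loop shows (4, 2) and (4, 1).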
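If you want to check this yourself, a small sketch (assuming TensorFlow 2.x) is to compare element_spec for the same data batched with and without drop_remainder=True. The variable names dynamic and static are only illustrative.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

source = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Default batching: the last batch may be smaller, so the batch
# dimension is unknown (None) in the static shape.
dynamic = source.batch(len(x_data))

# drop_remainder=True guarantees every batch has exactly 4 rows,
# so the batch dimension becomes a fixed 4.
static = source.batch(len(x_data), drop_remainder=True)

dynamic_shape = dynamic.element_spec[0].shape.as_list()
static_shape = static.element_spec[0].shape.as_list()
print(dynamic_shape)  # [None, 2]
print(static_shape)   # [4, 2]
```

Either way, the tensors produced while iterating have the concrete shape (4, 2); only the shape reported before iteration differs.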

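6- How to print values without a for loop?

  • For performance reasons a dataset behaves like a generator: elements are produced lazily, one at a time (batch by batch), so you cannot print all values directly from the dataset object. You have to iterate over it, although any iterator will do, for example next(iter(dataset)) instead of a for loop.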
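Two ways to do that without writing a for loop are sketched below, assuming TensorFlow 2.1 or later with eager execution. Since the dataset in the question holds a single batch of 4 rows, the first batch already contains every value.

```python
import tensorflow as tf

x_data = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_data = [[0], [1], [1], [0]]

dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data)).batch(len(x_data))

# Option 1: pull only the first batch out of a Python iterator.
first_x, first_y = next(iter(dataset))
print(first_x.numpy())
print(first_y.numpy())

# Option 2: materialize every batch as NumPy arrays in a list.
# This loads the whole dataset into memory, so it only suits small data.
all_batches = list(dataset.as_numpy_iterator())
print(len(all_batches))
```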

Upvotes: 1
