Reputation: 1476
I'm using TensorFlow 2.0 and I have a batched dataset which contains 968 images and a label (4 element array) for each:
dataSetSize = allDataSet.reduce(0, lambda x, _: x + 1).numpy()
allDataSet = allDataSet.shuffle(dataSetSize)
allDataSet = allDataSet.map(processPath, num_parallel_calls=tf.data.experimental.AUTOTUNE)
allDataSet = allDataSet.batch(10)
predictions = loadedModel.predict(allDataSet)
onlyImages = # how to create this?
onlyLabels = # how to create this?
# the 'map' function in my dataset returns a batch of images and their corresponding labels
for idx, (imageBatch, labelBatch) in enumerate(allDataSet):
    # how to concatenate batches together?
    onlyImages = # ?
    onlyLabels = # ?
I need to separate this dataset into two numpy arrays. The first array should contain only the 968 images (shape: (968, 299, 299, 3)) and the second the 968 labels (shape: (968, 4)). How can I do that?
I checked a similar question here, but its examples seem to use TensorFlow 1.x and a different input type.
Size of dataset and types:
dataset size: 968
<DatasetV1Adapter shapes: ((None, 299, 299, 3), (None, 4)), types: (tf.float32, tf.float32)>
Upvotes: 1
Views: 1265
Reputation: 14983
If I understand your question correctly, you need to concatenate each batch onto a NumPy array as you iterate through your dataset. Note that calling .numpy() on a batch during iteration converts it from a tf.Tensor to an np.ndarray.
Therefore, the following options are available to you:
np.concatenate. As per the documentation:
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)
Output is:
array([[1, 2],
       [3, 4],
       [5, 6]])
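So, in your code, define an initial empty NumPy array for the images and another for the labels (for example with shapes (0, 299, 299, 3) and (0, 4), so that the trailing dimensions match), and concatenate each imageBatch and labelBatch onto them along axis=0.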
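The concatenation approach described above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it collects the batches in Python lists and concatenates once at the end, rather than growing an empty array step by step, which avoids re-copying the accumulated data on every iteration. The names `dataset_to_numpy` and `fake_batches` are hypothetical, and `fake_batches` only stands in for the real batched dataset (with small images to keep the example light). The sketch assumes eager execution, where `np.asarray` on a batch gives the same array as calling `.numpy()` on it.

```python
import numpy as np


def dataset_to_numpy(batches):
    """Collect (images, labels) batches into two NumPy arrays."""
    image_batches, label_batches = [], []
    for image_batch, label_batch in batches:
        # For an eager tf.Tensor, np.asarray(t) is assumed equivalent to t.numpy().
        image_batches.append(np.asarray(image_batch))
        label_batches.append(np.asarray(label_batch))
    # Join all batches along the batch axis in a single call.
    images = np.concatenate(image_batches, axis=0)
    labels = np.concatenate(label_batches, axis=0)
    return images, labels


def fake_batches(n=968, batch_size=10, image_shape=(8, 8, 3)):
    """Hypothetical stand-in for the batched dataset (last batch is smaller)."""
    for start in range(0, n, batch_size):
        size = min(batch_size, n - start)
        yield (
            np.zeros((size, *image_shape), dtype=np.float32),
            np.zeros((size, 4), dtype=np.float32),
        )


only_images, only_labels = dataset_to_numpy(fake_batches())
print(only_images.shape, only_labels.shape)
```

With the real dataset, the call would presumably be `dataset_to_numpy(allDataSet)`. One caveat to keep in mind: `shuffle` reshuffles on every pass by default, so arrays collected in a separate loop are not guaranteed to be in the same order as the output of an earlier `predict` call.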
np.vstack (which itself calls np.concatenate under the hood) provides the same result.
Upvotes: 1