Mykola Zotko

Reputation: 17892

Order of batch, prefetch, shuffle and cache in a TensorFlow dataset

When creating a dataset from a generator, what would be the correct order of the following dataset methods? Or does the order not matter here?

ds = tf.data.Dataset.from_generator(my_generator)
ds = ds.prefetch(tf.data.AUTOTUNE).shuffle(1000).batch(128).cache()

Here I use prefetch to speed up data generation and cache to avoid recomputing the data on every epoch.

Upvotes: 2

Views: 946

Answers (1)

Khush patel

Reputation: 1

The correct order is cache and then prefetch. cache() stores the dataset's elements in memory (or in a file, if a filename is given) during the first epoch, so later epochs read them back instead of recomputing them. prefetch() then prepares the next batch while the model is training on the current one. This way the model reads data from the cache while the next batch is being prepared in the background.
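To make the ordering concrete, here is a minimal sketch of one commonly recommended arrangement: cache, then shuffle, then batch, with prefetch last. The body of my_generator, the output_signature, and the small buffer and batch sizes are assumptions made up for illustration; they are not taken from the question.

```python
import tensorflow as tf


def my_generator():
    # Hypothetical stand-in for the asker's generator: yields ten scalar integers.
    for i in range(10):
        yield i


ds = tf.data.Dataset.from_generator(
    my_generator,
    # Assumed element spec; adjust it to whatever the real generator yields.
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int32),
)

ds = (
    ds.cache()                     # keep the generator's output after the first full pass
    .shuffle(10)                   # shuffle after cache so each epoch gets a new order
    .batch(4)                      # batch after shuffle so batch contents vary per epoch
    .prefetch(tf.data.AUTOTUNE)    # last step: prepare the next batch during training
)

# Iterating twice simulates two epochs.
epoch_1 = [batch.numpy().tolist() for batch in ds]
epoch_2 = [batch.numpy().tolist() for batch in ds]
```

Placing cache after shuffle or batch would instead freeze the shuffled order or the batch composition from the first epoch, which is usually not what you want. If the cached data does not fit in memory, passing a filename to cache() is one option.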

Upvotes: 0
