Reputation: 17892
When creating a dataset from a generator, what would be the correct order of the following dataset methods? Or does the order not matter here?
ds = tf.data.Dataset.from_generator(my_generator)
ds = ds.prefetch(tf.data.AUTOTUNE).shuffle(1000).batch(128).cache()
Here I use prefetch to speed up data generation and cache to avoid recomputing the data on every epoch.
Upvotes: 2
Views: 946
Reputation: 1
The correct order is cache and then prefetch. The reason is that cache stores the dataset in fast memory, and prefetch then prepares the next batch while the model is training on the current batch. This way the model reads data from the cache, and the next batch is prepared in the background while the current one is being used.
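As a minimal sketch of that ordering applied to the pipeline in the question: the generator body, its output_signature, and the small batch size below are assumptions made only so the example runs on its own. Putting shuffle after cache is also a choice made here, since caching after shuffle would store the first epoch's order and replay it on later epochs.

```python
import tensorflow as tf


def my_generator():
    # Hypothetical stand-in for the question's generator: yields 10 scalar ints.
    for i in range(10):
        yield i


ds = tf.data.Dataset.from_generator(
    my_generator,
    # Assumed signature; adjust to whatever the real generator yields.
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int32),
)

ds = (
    ds.cache()                     # keep generator output after the first full pass
      .shuffle(1000)               # shuffle after the cache so each epoch is reordered
      .batch(4)                    # small batch size, for illustration only
      .prefetch(tf.data.AUTOTUNE)  # last step: prepare upcoming batches during training
)

# One full pass over the dataset; this also fills the cache.
batch_sizes = [int(batch.shape[0]) for batch in ds]
```

Later passes over ds read from the cache instead of calling the generator again, while prefetch, placed last, keeps upcoming batches ready while the current one is consumed.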
Upvotes: 0