christk

Reputation: 872

Preprocess n files concurrently with tf.data API

I want to use tf.data.experimental.parallel_interleave to preprocess n files concurrently. The cycle_length argument controls this, but what is the maximum value it can take? My CPU has 8 cores and 16 threads.

Upvotes: 0

Views: 85

Answers (1)

Sharky

Reputation: 4533

According to the official docs for tf.data.experimental.parallel_interleave:

Unlike tf.data.Dataset.interleave, it gets elements from cycle_length nested datasets in parallel

and

cycle_length: The number of input Datasets to interleave from in parallel.

So a reasonable value is the number of input datasets (here, files) that you want to be read in parallel. The argument itself is not tied to the number of CPU cores or threads.
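To make this concrete, here is a minimal sketch rather than a definitive recipe. It assumes TensorFlow 2.4 or later and uses tf.data.Dataset.interleave with num_parallel_calls, which the deprecation notice on parallel_interleave points to as its replacement. The helper names (write_demo_files, build_dataset) and the tiny text files are made up for illustration, and TextLineDataset stands in for whatever reader your files need.

```python
import os
import tempfile

import tensorflow as tf


def write_demo_files(directory, n_files=4, lines_per_file=3):
    """Create small text files so the sketch is self-contained (hypothetical data)."""
    paths = []
    for i in range(n_files):
        path = os.path.join(directory, f"part-{i}.txt")
        with open(path, "w") as f:
            for j in range(lines_per_file):
                f.write(f"file{i}-line{j}\n")
        paths.append(path)
    return paths


def build_dataset(paths, cycle_length):
    """Read from up to `cycle_length` files at a time and interleave their lines."""
    files = tf.data.Dataset.from_tensor_slices(paths)
    return files.interleave(
        lambda path: tf.data.TextLineDataset(path),
        # Number of input files that are open and being read from at once.
        cycle_length=cycle_length,
        # Let tf.data decide how much parallelism to use for the reads.
        num_parallel_calls=tf.data.AUTOTUNE,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_paths = write_demo_files(tmp)
        # cycle_length is set from the number of files, not from the CPU core count.
        dataset = build_dataset(demo_paths, cycle_length=len(demo_paths))
        for line in dataset:
            print(line.numpy().decode("utf-8"))
```

Here cycle_length is chosen from the number of files to read concurrently. How much of that work actually runs in parallel is left to num_parallel_calls, so it is worth benchmarking a few values on your own data.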

Upvotes: 1
