Reputation: 27592
Is there a distributed data reader for TensorFlow? If not, what is the suggested way to handle large datasets across multiple machines?
The distributed Inception example here pre-segments the data across multiple machines, and each worker then grabs its own subset from the available subsets. Is this the only supported method?
Also, it seems that some of the data readers described here are thread-safe, but I couldn't find any distributed solution.
Upvotes: 1
Views: 235
Reputation: 2878
We don't currently have a general solution for distributed data reading in TensorFlow, and it's a bit of a hard problem since there are many different possible requirements around latency, size, and sharding. I'd be very interested in any proposals or patches though!
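For reference, the pre-segmenting approach mentioned in the question can be sketched in plain Python, with no TensorFlow dependency. This is only an illustration, not how the Inception example is actually implemented: the function name `shard_for_worker`, its parameters, and the file names are hypothetical. It assumes every worker sees the same list of data files and knows its own index and the total worker count.

```python
from typing import List, Sequence


def shard_for_worker(
    filenames: Sequence[str], num_workers: int, worker_index: int
) -> List[str]:
    """Return the subset of files a given worker should read.

    Hypothetical helper: files are sorted so every worker derives the
    same ordering, then assigned round-robin by worker index.
    """
    if num_workers <= 0:
        raise ValueError("num_workers must be positive")
    if not 0 <= worker_index < num_workers:
        raise ValueError("worker_index must be in [0, num_workers)")
    # Sorting makes the assignment deterministic across machines.
    ordered = sorted(filenames)
    # Worker i takes files i, i + num_workers, i + 2 * num_workers, ...
    return ordered[worker_index::num_workers]


if __name__ == "__main__":
    # Made-up shard names, for illustration only.
    files = ["train-%05d-of-00008" % i for i in range(8)]
    for worker in range(3):
        print(worker, shard_for_worker(files, num_workers=3, worker_index=worker))
```

Each worker would then feed only its own file list to whatever reader it uses locally. The shards are disjoint and together cover every file, but this does not address the harder requirements around latency, uneven file sizes, or rebalancing when a worker fails.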
Upvotes: 2