Reputation: 584
When I read this TensorFlow Federated tutorial for image classification, I find `.repeat()` in the preprocessing function. I would like to understand the necessity of this preprocessing step, especially because when I increase the number passed to `.repeat()`, the simulation takes a lot of time. So, if `.repeat()` is necessary, what number of epochs should we choose?
Upvotes: 1
Views: 176
Reputation: 2941
The call to `tf.data.Dataset.repeat` in the tutorial is not strictly necessary. It is a hyperparameter that causes the clients to perform more local optimization (take more gradient steps by repeating their local dataset), which reduces the frequency of communication. In effect, more progress can happen in each federated learning "round" because the clients are doing more work.
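For reference, here is a minimal sketch in the style of the tutorial's preprocess function; the constant values and the EMNIST `'pixels'`/`'label'` keys follow the tutorial, but treat the exact numbers as illustrative placeholders:

```python
import collections
import tensorflow as tf

# Illustrative hyperparameters (tune for your simulation):
NUM_EPOCHS = 5        # the .repeat() count: local epochs per federated round
BATCH_SIZE = 20
SHUFFLE_BUFFER = 100
PREFETCH_BUFFER = 10

def batch_format_fn(element):
  # Flatten each 28x28 EMNIST image into a 784-vector and pair it with
  # its label; the 'pixels'/'label' keys match the EMNIST client data
  # used in the tutorial.
  return collections.OrderedDict(
      x=tf.reshape(element['pixels'], [-1, 784]),
      y=tf.reshape(element['label'], [-1, 1]))

def preprocess(dataset):
  # .repeat(NUM_EPOCHS) is the knob in question: each client iterates
  # over its local data NUM_EPOCHS times within a single round.
  return dataset.repeat(NUM_EPOCHS).shuffle(SHUFFLE_BUFFER).batch(
      BATCH_SIZE).map(batch_format_fn).prefetch(PREFETCH_BUFFER)
```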
Not using any repeats means clients train for only one epoch. In the tutorial, this would likely make model training appear slower (on a per-round basis), since fewer mini-batch steps are completed each round.
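To see the difference, a single-epoch variant simply omits the `.repeat()` call (equivalently, uses `.repeat(1)`), reusing `batch_format_fn` and the constants from the sketch above:

```python
def preprocess_single_epoch(dataset):
  # Without .repeat(), each client makes exactly one pass over its local
  # data per round, so fewer mini-batch steps are taken between rounds
  # of communication.
  return dataset.shuffle(SHUFFLE_BUFFER).batch(BATCH_SIZE).map(
      batch_format_fn).prefetch(PREFETCH_BUFFER)
```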
A high number of epochs can have negative consequences. In later rounds, when the model is mostly converged, it can be detrimental for clients to overfit to their local data if it diverges from the global data distribution.
Upvotes: 1