Reputation: 314
My TensorFlow Federated model is taking too long to converge. When I train the same model without TFF wrapping, using TensorFlow 2.0, the accuracy reaches 0.97 within a few epochs. With TFF, however, the same model reaches an accuracy of only 0.03 after 30 epochs. What could be the reason for such low accuracy during TFF training, and is there a way to improve it? My code is given below:
import tensorflow as tf
import tensorflow_federated as tff

# model_fn and federated_train_data are defined earlier (not shown here)

# Building the Federated Averaging process
iterative_process = tff.learning.build_federated_averaging_process(
    model_fn,
    client_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=0.02),
    server_optimizer_fn=lambda: tf.keras.optimizers.SGD(learning_rate=1.0))

str(iterative_process.initialize.type_signature)

state = iterative_process.initialize()
state, metrics = iterative_process.next(state, federated_train_data)
print('round 1, metrics={}'.format(metrics))

NUM_ROUNDS = 1000
for round_num in range(2, NUM_ROUNDS):
    state, metrics = iterative_process.next(state, federated_train_data)
    print('round {:2d}, metrics={}'.format(round_num, metrics))
Upvotes: 1
Views: 429
Reputation: 2941
There may be a mixing of terminology here: depending on what "epoch" means, this behavior can be expected in federated learning.
If "epoch" is counting rounds (the for-loop in the code above): generally a round in federated learning is much smaller than an epoch in centralized learning. The global model is only updated once per round, and that update is trained on many fewer examples than the entire dataset. If a dataset has M examples divided over K clients, federated learning often selects only a few of those clients to participate in a round, seeing only some multiple of M / K examples that round.
Contrast this with centralized learning, in which an epoch over the same dataset of M examples, using a batch size of N, would advance the model M / N steps and see all M examples.
Generally it takes more rounds to train a model in federated learning than epochs in centralized learning, which can be understood as a consequence of rounds being much smaller.
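To make the comparison concrete, here is a back-of-the-envelope sketch. The numbers (M, K, N, and the clients sampled per round) are illustrative assumptions, not values from the question:

```python
# Illustrative comparison: data seen per federated round vs. per centralized
# epoch. All numbers below are assumed for the sake of the arithmetic.

M = 60_000               # total training examples (assumed)
K = 100                  # total clients (assumed)
clients_per_round = 10   # clients sampled each round (assumed)
N = 32                   # centralized batch size (assumed)

# Federated round: the global model is updated once, after training on
# roughly clients_per_round * (M / K) examples.
examples_per_round = clients_per_round * (M // K)

# Centralized epoch: the model takes M / N gradient steps and sees all
# M examples.
steps_per_epoch = M // N

print(f"examples seen per federated round:    {examples_per_round}")  # 6000
print(f"global model updates per round:       1")
print(f"gradient steps per centralized epoch: {steps_per_epoch}")     # 1875
```

With these assumptions, one round touches only a tenth of the data and applies a single global update, whereas one epoch applies 1875 updates over the full dataset, so needing many more rounds than epochs is unsurprising.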
Upvotes: 2