Reputation: 41
I am working with the TensorFlow Federated framework and have designed a Keras model for a binary classification problem. I defined the iterative process with tff.learning.build_federated_averaging_process and broadcast the model with
state, metrics = iterative_process.next(state, train_data)
After the above steps were executed, I tried to run a prediction:
model_test = create_keras_model()  # function defining the binary classification model
model_test.compile(optimizer='adam',
                   loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                   metrics=['accuracy'])
pred_out = model_test.predict(a[0].take(20))  # a[0] is the dataset constructed with create_tf_dataset_for_client()
classes = (pred_out > 0.5).astype("int32")
classes
array([[0],
[1],
[0],
[0],
[1],
[1],
[1],
[0],
[0],
[1],
[1],
[0],
[1],
[1],
[0],
[0],
[0],
[1],
[1],
[0]], dtype=int32)
But after assigning the model weights from the TFF learning state to the model, the prediction does not work as expected: it returns the same value for every row.
model_test = create_keras_model()  # function defining the binary classification model
state.model.assign_weights_to(model_test)
pred_out = model_test.predict(a[0].take(20))  # a[0] is the dataset constructed with create_tf_dataset_for_client()
print(pred_out)
array([[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368],
[-0.2798368]], dtype=float32)
On further investigation, I understood that the value '-0.2798368' above is stored in the state's ModelWeights (it is the final bias term):
print(state.model)
ModelWeights(trainable=[
  array([[-4.984627  , -5.193449  , -5.790202  , -5.5200233 , -5.5461893 ,
          -4.977145  , -5.4065394 , -5.619186  , -5.3337646 , -5.136057  ],
         [-0.5657665 , -5.8657775 , -5.3425145 , -5.2261133 , -5.330576  ,
          -5.9684296 , -5.4551187 , -5.3567815 , -4.8706098 , -5.7063856 ],
         [-5.6153154 , -5.9375963 , -5.4587545 , -5.689524  , -5.463484  ,
          -4.9066486 , -5.752383  , -0.3759068 , -5.4120364 , -5.8245053 ],
         [-5.2911777 , -5.42058   , -5.932811  , -5.4922986 , -0.41761395,
          -5.432293  , -5.309703  ,  0.31641293, -5.635701  , -5.7644367 ],
         [ 0.07086992, -5.0122833 , -5.2278    , -5.2102866 , -0.03762579,
          -0.43286362, -4.865974  , -0.3707862 , -5.9437294 , -5.1678157 ],
         [-5.6853213 , -5.467271  , -5.7508802 , -5.4324217 , -5.3518825 ,
          -5.033523  , -4.8834076 , -4.8871975 , -5.9014115 , -5.3266053 ],
         [-5.280035  , -5.763103  , -5.828321  , -5.780304  , -5.908666  ,
          -5.6955295 , -5.6714606 , -4.9686913 , -4.898386  , -5.12075   ],
         [-4.8388877 , -5.7745824 , -5.1134114 , -5.779592  , -5.616187  ,
          -4.870717  , -5.131807  , -5.9274936 , -5.345783  , -5.113287  ]],
        dtype=float32),
  array([-5.4049463, -5.4049444, -5.404945 , -5.404946 , -5.404945 ,
         -5.4049444, -5.404945 , -5.404945 , -5.4049454, -5.4049444],
        dtype=float32),
  array([[ 4.972922 ],
         [-4.823935 ],
         [ 4.916144 ],
         [ 5.0096955],
         [-4.9212008],
         [-5.1436653],
         [ 4.8211393],
         [-4.8939514],
         [ 5.1752467],
         [-5.01398  ]], dtype=float32),
  array([-0.2798368], dtype=float32)],  # <-- the value repeated in every prediction
non_trainable=[])
Any guidance/suggestions as to where I am going wrong?
Upvotes: 1
Views: 340
Reputation: 2941
We might need to step back and think about how the system models federated computation to understand what is meant by "server model" at a given point in time. The SERVER and CLIENTS concepts exist in a different layer of abstraction than the Python runtime the script is executing in, meaning the code that constructs a Keras model in Python is "outside" the "federated context" that has those notions of placement.
# TFF doesn't know about this model; it doesn't exist at a "placement",
# i.e. it is neither SERVER nor CLIENTS placed.
model = create_keras_model()

learning_process = tff.learning.build_federated_averaging_process(...)

# During the call to `initialize` a "federated context" exists, which runs
# a `tff.Computation` called `initialize` that creates a value placed at
# SERVER. However, once the function "returns back to Python", the `state`
# variable we have below no longer has any "placement"; it's just "in Python".
state = learning_process.initialize()

# When we pass `state` back into the `next` method, it is given placement again
# based on the type signature of `next`. In this case, it's placed back at
# SERVER and the placement is used _during_ the invocation of `next`. Again,
# once `next` returns, the notion of placement goes away; we're back "in
# Python" without placement.
state, metrics = learning_process.next(state, data)
In the code above, model could be called the "server model": it will initially have the same weights, but it is not the SERVER-placed model referred to in the TFF API documentation. The documentation only refers to values during the invocation of a tff.Computation (e.g. initialize and next).
In other words, model and state are not connected; updating one will not update the other. To use model with newly trained weights (e.g. after a next call), the code must assign the state weights back to the model (as done in the question):
state.model.assign_weights_to(model)
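To make the "not connected" point concrete, here is a minimal sketch in plain Keras (no TFF; the layer sizes and create_keras_model body are hypothetical stand-ins for the model in the question). Two separately constructed models are independent until the weights are explicitly copied, which is exactly the role assign_weights_to plays for the TFF state:

```python
import numpy as np
import tensorflow as tf


def create_keras_model():
    # Hypothetical architecture; stands in for the model in the question.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(1),  # outputs a logit (from_logits=True style)
    ])


model_a = create_keras_model()  # e.g. the trained weights held in the TFF state
model_b = create_keras_model()  # e.g. a fresh model constructed "in Python"

x = np.random.rand(6, 4).astype('float32')

# The two models were initialized independently; nothing ties them together.
# Copying the weights over (analogous to state.model.assign_weights_to(model))
# makes their predictions agree exactly.
model_b.set_weights(model_a.get_weights())
assert np.allclose(model_a.predict(x, verbose=0),
                   model_b.predict(x, verbose=0))
```

The same logic applies after every round of federated averaging: the Keras model only reflects the newly trained state after the weights are copied back in.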
Upvotes: 1