Reputation: 32081
I had 3 models that used the same input but produced 3 distinct outputs (1 classifier, 2 regression). I combined 2 of the 3 into a single model with 2 loss functions and saw a significant improvement in accuracy/RMSE.
I'm now trying to fold the 3rd loss function in as well, so I have 1 model with 3 loss functions that share many parameters.
However, the 3rd loss function only applies to half the data. I tried standardizing the labels to zero mean and unit variance and using 0 as the label wherever loss function C doesn't apply, but that biased the results toward 0 in some cases.
I'm now experimenting with alternating optimization: some steps optimize loss functions A & B together on a batch from the full dataset, and other steps optimize all 3 loss functions A, B, & C on a batch drawn from the data where loss C applies (which also suits A & B). In the context of my problem this alternation is logical.
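The alternation above can be sketched as a simple step schedule. This is a hedged, framework-free illustration; the function name `make_schedule` and the group labels `"AB"`/`"ABC"` are hypothetical, not from the original post:

```python
# Sketch of the alternating-optimization scheme: decide per training step
# whether to run an A+B update (batch from the full dataset) or an A+B+C
# update (batch restricted to samples where loss C applies).

def make_schedule(n_steps, ratio_ab=1, ratio_abc=1):
    """Return the loss group to optimize at each step: 'AB' steps use a
    batch from the full dataset, 'ABC' steps a batch where loss C applies."""
    cycle = ["AB"] * ratio_ab + ["ABC"] * ratio_abc
    return [cycle[i % len(cycle)] for i in range(n_steps)]

# e.g. a 1:1 alternation over 4 steps:
schedule = make_schedule(4)  # -> ['AB', 'ABC', 'AB', 'ABC']
```

The ratio arguments let you weight the two kinds of steps if the loss-C subset is much smaller than the full dataset.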
My Question:
Upvotes: 0
Views: 1075
Reputation: 32081
The dependency was with TensorBoard: I had a summary operation attached to all loss functions, forcing all of them to be executed.
I split my summary operations into groups using tf.add_to_collection() to gather the different summary ops, then used a for loop to add the appropriate group to the list of tensors to fetch.
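The grouping idea can be mirrored framework-free for illustration. The functions below stand in for TensorFlow's `tf.add_to_collection()` / `tf.get_collection()`, and the string "ops" are placeholders for real graph tensors; all names are hypothetical:

```python
# Framework-free mirror of the fix: group summary ops by collection name,
# then gather only the relevant group when building the fetch list for a
# step. All identifiers here are illustrative stand-ins for graph tensors.
collections = {}

def add_to_collection(name, op):
    collections.setdefault(name, []).append(op)

def get_collection(name):
    return collections.get(name, [])

# Register summaries under separate groups, one per loss combination.
add_to_collection("summaries_ab", "loss_A_summary")
add_to_collection("summaries_ab", "loss_B_summary")
add_to_collection("summaries_abc", "loss_C_summary")

# A step that trains only losses A & B fetches just its own summary group,
# so loss C's summary op never forces loss C to be computed.
fetches = ["train_op_ab"]
for op in get_collection("summaries_ab"):
    fetches.append(op)
```

The point is that a summary op is itself a graph node: fetching it pulls in everything it depends on, so summaries for loss C must be kept out of the fetch list on steps that don't train loss C.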
It was that plus one other dependency that was simply a bug I found. @Sygi and @Fake are correct: you shouldn't need to feed a value that isn't used in a particular computation just because it exists in the graph.
Upvotes: 1