Reputation: 43
I am training a model with multiple outputs in PyTorch, and I have four different losses: one each for position (in meters), rotation (in degrees), and velocity, plus one for a boolean value (0 or 1) that the model has to predict.
As far as I know, there are two ways to define the final loss function here:
one - a naive weighted sum of the losses
two - defining a coefficient for each loss in order to optimize the final loss.
So my question is: what is the best way to weight these losses to obtain the final loss?
Upvotes: 2
Views: 1977
Reputation: 41
That's an interesting problem. As @lvan said, this is a multi-objective optimization problem.
The multi-loss/multi-task objective is as follows:
l(\theta) = f(\theta) + g(\theta)
Here l is the total loss, f is the class loss function, and g is the detection loss function.
The different loss functions decrease at different rates. As learning progresses, the rates at which the two losses fall are quite inconsistent: often one decreases very quickly while the other decreases very slowly.
There is a paper devoted to this question:
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
The main idea of the paper is to estimate the uncertainty of each task and then automatically reduce the weight of the losses for tasks with higher uncertainty.
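To make the idea concrete, here is a minimal sketch of uncertainty-based weighting in PyTorch. It assumes a commonly used simplified form, total = sum_i(exp(-s_i) * L_i + s_i) with a learnable s_i = log(sigma_i^2) per task, rather than the exact formulation in the paper. The class name `UncertaintyWeightedLoss` and the example loss values are made up for illustration.

```python
import torch
import torch.nn as nn


class UncertaintyWeightedLoss(nn.Module):
    """Combine task losses with learnable log-variance weights (simplified sketch)."""

    def __init__(self, num_tasks):
        super().__init__()
        # s_i = log(sigma_i^2); starting at 0 gives every task a weight of 1.
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        losses = torch.stack(list(losses))
        # exp(-s_i) shrinks the weight of high-uncertainty tasks;
        # the "+ s_i" term keeps s_i from growing without bound.
        return (torch.exp(-self.log_vars) * losses + self.log_vars).sum()


# Placeholder values standing in for position, rotation, velocity and flag losses.
criterion = UncertaintyWeightedLoss(num_tasks=4)
task_losses = [torch.tensor(2.0), torch.tensor(0.5),
               torch.tensor(1.0), torch.tensor(0.25)]
total = criterion(task_losses)
total.backward()  # gradients also flow into criterion.log_vars
```

For the weights to be learned, the parameters of `criterion` have to be passed to the optimizer together with the model's parameters.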
I am a non-native English speaker. I hope you can understand my answer and that it helps you.
Upvotes: 2
Reputation: 40628
This is not a question about programming but rather about optimization in a multi-objective setup. The two options you've described come down to the same approach, which is a linear combination of the loss terms. However, keep in mind that there are many other approaches out there, such as dynamic loss weighting and uncertainty weighting. In practice, the most commonly used approach is the linear combination, where each objective gets a weight that is determined via grid search or random search.
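As a rough illustration of the linear combination, here is a minimal sketch for the four outputs from the question. The dictionary keys, tensor shapes, choice of MSE for the regression outputs, and the weight values are all assumptions made for the example; in practice the weights would come from a grid or random search.

```python
import torch
import torch.nn.functional as F


def combined_loss(pred, target, weights):
    """Return the weighted sum of the per-output losses and the individual terms."""
    losses = {
        "position": F.mse_loss(pred["position"], target["position"]),
        "rotation": F.mse_loss(pred["rotation"], target["rotation"]),
        "velocity": F.mse_loss(pred["velocity"], target["velocity"]),
        # The boolean output is predicted as a raw logit; the target is a float 0/1.
        "flag": F.binary_cross_entropy_with_logits(pred["flag"], target["flag"]),
    }
    total = sum(weights[name] * value for name, value in losses.items())
    return total, losses


torch.manual_seed(0)
batch = 8
# Random tensors standing in for model outputs and labels.
pred = {
    "position": torch.randn(batch, 3, requires_grad=True),
    "rotation": torch.randn(batch, 3, requires_grad=True),
    "velocity": torch.randn(batch, 3, requires_grad=True),
    "flag": torch.randn(batch, 1, requires_grad=True),
}
target = {
    "position": torch.randn(batch, 3),
    "rotation": torch.randn(batch, 3),
    "velocity": torch.randn(batch, 3),
    "flag": torch.randint(0, 2, (batch, 1)).float(),
}
# Hypothetical weights; tune them with a grid or random search.
weights = {"position": 1.0, "rotation": 0.1, "velocity": 1.0, "flag": 0.5}

total, parts = combined_loss(pred, target, weights)
total.backward()
```

Returning the individual terms alongside the total makes it easy to log each loss and see whether one of them dominates because of its units or scale.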
You can look up this survey on multi-task learning which showcases some approaches: Multi-Task Learning for Dense Prediction Tasks: A Survey, Vandenhende et al., T-PAMI'20.
This is an active line of research, so there is no definitive answer to your question.
Upvotes: 2