Reputation: 33
I am trying determine optimal solution based on two variables(columns) being time and cost. I need to decide which Shipment provider is most optimal based on delivery time and delivery cost.
I would like to know if there is any way to do it in SQL or Python(Pandas). The values are something like:
delivery time | delivery cost |
---|---|
"1 day 24:55:38.765958" | 11.3057446808510638 |
"2 days 13:44:12.084918" | 8.6606336633663366 |
"2 days 21:47:49" | 13.0000000000000000 |
"2 days 28:21:07.42914" | 35.5322866894197952 |
Upvotes: 0
Views: 39
Reputation: 44363
My understanding of pareto-optimal is that you want every row which has no rows which are better than it in both dimensions. That translates pretty readily into SQL:
select * from foo a where not exists (
select 1 from foo b where b.delivery_time<a.delivery_time and b.delivery_cost < a.delivery_cost
);
I'm not sure what you should do in case of ties, though.
Upvotes: 1