Gordan84
Gordan84

Reputation: 33

Getting optimal rows based on two variables (Shipment_time, Shipment cost)

I am trying determine optimal solution based on two variables(columns) being time and cost. I need to decide which Shipment provider is most optimal based on delivery time and delivery cost.

I would like to know if there is any way to do it in SQL or Python(Pandas). The values are something like:

delivery time delivery cost
"1 day 24:55:38.765958" 11.3057446808510638
"2 days 13:44:12.084918" 8.6606336633663366
"2 days 21:47:49" 13.0000000000000000
"2 days 28:21:07.42914" 35.5322866894197952

Upvotes: 0

Views: 39

Answers (1)

jjanes
jjanes

Reputation: 44363

My understanding of pareto-optimal is that you want every row which has no rows which are better than it in both dimensions. That translates pretty readily into SQL:

select * from foo a where not exists (
    select 1 from foo b where b.delivery_time<a.delivery_time and b.delivery_cost < a.delivery_cost
);

I'm not sure what you should do in case of ties, though.

Upvotes: 1

Related Questions