Daniel Porumbel

Reputation: 309

best data structure to record a Pareto front

May I ask if someone has already seen or faced the following problem?

I need to handle a list of cost/profit values c1/p1, c2/p2, c3/p3, ... in which both the costs and the profits are strictly increasing: c1 < c2 < c3 < ... and p1 < p2 < p3 < ...

This is an example: 2/3, 4/5, 9/15, 12/19

If one tries to insert 10/14 into the above list, the operation is rejected because of the existing cost/profit pair 9/15: it is never useful to increase the cost (9->10) and decrease the profit (15->14). Such lists can arise, for instance, in (the states of) dynamic programming algorithms for knapsack problems, where the costs can represent weights.

If one inserts 7/20 into the above list, this should trigger the deletion of 9/15 and 12/19, because 7/20 is both cheaper and more profitable than either of them.
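The two rules above (reject a dominated pair, delete the pairs that the new one dominates) can be sketched with a `std::map` keyed by cost. This is only a minimal sketch, not the code from the question: the name `ParetoFront`, the use of `long` for both values, and the choice to treat ties as dominated are assumptions made for illustration.

```cpp
#include <iterator>
#include <map>

// Hypothetical helper: stores cost -> profit, keeping both strictly increasing.
struct ParetoFront {
    std::map<long, long> items;  // key: cost, value: profit

    // Returns true if (cost, profit) was inserted, false if it was dominated.
    bool insert(long cost, long profit) {
        // First entry whose cost is strictly greater than the new cost.
        auto next = items.upper_bound(cost);
        // The entry before it (if any) has the largest cost <= the new cost.
        // If its profit is at least as high, the new pair is useless.
        if (next != items.begin() && std::prev(next)->second >= profit) {
            return false;
        }
        // An entry with exactly the same cost must have a lower profit: drop it.
        auto first = next;
        if (first != items.begin() && std::prev(first)->first == cost) {
            --first;
        }
        // Costlier entries that are not more profitable are dominated: drop them.
        auto last = next;
        while (last != items.end() && last->second <= profit) {
            ++last;
        }
        items.erase(first, last);
        items.emplace(cost, profit);
        return true;
    }
};
```

With the list 2/3, 4/5, 9/15, 12/19, calling `insert(10, 14)` returns false, and `insert(7, 20)` removes 9/15 and 12/19 before adding 7/20. A single call can still delete many entries, but each entry is deleted at most once after being inserted, so the deletions add only amortized constant work per insertion.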

I have written a solution using the C++ std::set (often implemented with red-black trees), but I needed to provide a comparison function that eventually became overly complex. Also, insertion into such a set normally takes logarithmic time, but it can degrade to linear time (in terms of non-amortized complexity), for example when an insertion triggers the deletion of all other elements.

I wonder if better solutions exist, given that there are countless ways to implement (ordered) sets, e.g., priority queues, heaps, linked lists, hash tables, etc.

This is a Pareto front (obj1: min cost, obj2: max profit), but I still could not find the best structure to record it.

Upvotes: 1

Views: 301

Answers (1)

Lajos Arpad

Reputation: 76510

I did not fully understand the rules you described, so I will say, agnostically, that an attempted insertion might be rejected and that, if it is accepted, subsequent items may need to be removed.

You will need to use a balanced comparison tree, represented as an array. In that case, finding the nodes you need takes O(log N) time, which is the complexity of a search or of a rejected insertion attempt. When items need to be removed, you remove them and insert the new one, which has a complexity of

O(log N + N + N + log N) (that is, searching, removing, rebalancing and inserting; we could get rid of the last logarithm if, while rebalancing, we already know where the new item is to be inserted)

O(log N + N + N + log N) = O(2 log N + 2N) = O(N), which is a linear complexity in the worst case.
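To make the cost breakdown concrete, here is a minimal sketch that uses a plain sorted `std::vector` in place of the array-backed tree described above, which is a simplification. The name `insertSorted` and the `Entry` alias are hypothetical. Both searches are binary searches, O(log N), while erasing and inserting shift elements, O(N).

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

using Entry = std::pair<long, long>;  // (cost, profit)

// Hypothetical helper. Precondition: front is sorted by cost and the
// profits are strictly increasing as well.
bool insertSorted(std::vector<Entry>& front, long cost, long profit) {
    // O(log N): first entry whose cost is strictly greater than the new cost.
    auto next = std::upper_bound(
        front.begin(), front.end(), cost,
        [](long c, const Entry& e) { return c < e.first; });
    // Rejected insertion: a cheaper-or-equal entry is at least as profitable.
    if (next != front.begin() && std::prev(next)->second >= profit) {
        return false;
    }
    // An entry with the same cost has a lower profit and is replaced.
    auto first = next;
    if (first != front.begin() && std::prev(first)->first == cost) {
        --first;
    }
    // O(log N): profits are sorted too, so the end of the dominated run
    // (first entry with a strictly higher profit) is found by binary search.
    auto last = std::upper_bound(
        next, front.end(), profit,
        [](long p, const Entry& e) { return p < e.second; });
    // O(N): erasing and inserting shift the tail of the vector.
    auto pos = front.erase(first, last);
    front.insert(pos, Entry{cost, profit});
    return true;
}
```

A rejected insertion costs only the first binary search, which matches the O(log N) case above; an accepted one pays the linear shifting cost even when nothing is deleted.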

Upvotes: 0
