Reputation: 3579
I am trying to solve a problem which requires storing Pareto-optimal solutions during the calculation. I will call a set of Pareto-optimal solutions a Bag.
So far I had only two criteria, which allowed for a quite efficient solution based on an array in which elements were sorted in descending order according to the first criterion and in ascending order according to the second criterion. An example of such an array would be:
[(100, 0), (50, 1), (-10, 3)]
(about Pareto optimality - wiki)
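To illustrate, an insertion into such an array could be sketched as follows (a minimal sketch with made-up names Pair and insert2d; the vector stays sorted descending by the first criterion and, as a consequence, ascending by the second):

#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;   // (first criterion, second criterion)

// Returns false if p is dominated and therefore not inserted.
bool insert2d(std::vector<Pair>& frontier, Pair p) {
    // First position whose first component is <= p.first;
    // everything before it has a strictly larger first component.
    auto pos = std::lower_bound(frontier.begin(), frontier.end(), p,
        [](const Pair& a, const Pair& b) { return a.first > b.first; });
    // Among the elements before pos, the last one has the largest second
    // component; if that is >= p.second, it dominates p.
    if (pos != frontier.begin() && std::prev(pos)->second >= p.second)
        return false;
    // An element with the same first component and second >= p.second
    // dominates (or equals) p as well.
    if (pos != frontier.end() && pos->first == p.first && pos->second >= p.second)
        return false;
    // Everything dominated by p forms a contiguous run starting at pos.
    auto cut = std::find_if(pos, frontier.end(),
        [&](const Pair& e) { return e.second > p.second; });
    pos = frontier.erase(pos, cut);
    frontier.insert(pos, p);
    return true;
}

The point is that both the dominance check and the removal of dominated elements only touch a contiguous part of the array.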
Recently, however, I found out that I need to add a third criterion, and for such an extension the above approach doesn't seem applicable. I tried to google whether someone has already solved this but found nothing satisfactory. Perhaps I was asking Google the wrong question.
To be more precise about what I need: a structure capable of storing mutually non-dominating Pareto-optimal elements. I need to insert elements into the structure and I need to iterate over the elements, but in no particular order. In my case there usually won't be more than 4-5 elements, but sometimes up to 10-20. Insertions into the bag happen VERY often in my algorithm, so I need them to be as fast as possible.
The application is written in C++, but it is probably not really relevant.
Any help would be much appreciated.
Edit: I already had some ideas of my own - arranging elements into some sort of triangular structure, but I am quite unable to formalize that idea.
Edit2: Please note that I require that, after each insertion, only mutually non-dominating elements remain in the structure. For example, if I have the set of non-dominating triples {(1,2,3), (3,1,1)} and add the triple (3,3,3), I will get the set {(3,3,3)}.
Edit3: To be more clear about dominance of elements - we say, in this particular case, that a triple (a,b,c) dominates (e,f,g) if and only if a >= e && b >= f && c >= g and at least one of the inequalities is strict (>).
Upvotes: 5
Views: 1369
Reputation: 2239
A first thought might be to somehow use a std::set with a compare function that returns whether one element is Pareto-dominated by the other. But because (in C++ at least) element equivalence is determined by comparing the elements in both directions - two elements are equivalent when neither compares less than the other - element pairs that have no domination relationship would be considered equivalent, so you wouldn't be able to add them to the set. One way might be to use a multiset, and then remove all the "smaller" elements first before adding a new one. But I suspect the complexity will not be better than in a vector, considering that all the elements will be "equivalent" every time you look for the dominated ones.
Anyway, I just wanted to point out that using Pareto dominance as the compare function straight in a set does not work. The really best solution probably requires a more specialized data structure. Having a way to quickly check dominance on each dimension really seems like the way to go in your case, as you apparently have been doing at first for only two dimensions. You keep an index for each dimension to speed up the check. Then, for each element, you can have a reverse index to allow you to quickly remove it. But that shouldn't be necessary if most of the time you are just making checks instead of modifying the set.
Upvotes: 1
Reputation: 21166
You could e.g. order them by their norm (like a*a + b*b + c*c, or a monotone sum like a + b + c if the components can be negative). Then you only need to check the elements with a bigger norm for whether they dominate the new element, and only the elements with a smaller norm for whether they are dominated by the new element.
However, I'm not sure that ordering your elements is of particular value if you only have so few of them to begin with. Maintaining that order (whatever it is) comes with its own overhead and might very well outweigh any benefit you get in terms of algorithmic complexity. In particular, I'd refrain from anything that involves dynamic per-element allocation and deallocation, like std::list or std::map with standard allocators. Even a heap on an array might not bring a noticeable advantage.
I'd probably just use a plain, unsorted std::vector:
std::vector<Element> frontier;

// Skip newElement if an existing element dominates it; otherwise drop
// everything newElement dominates and append it.
void insert(const Element& newElement) {
    if (std::none_of(frontier.begin(), frontier.end(),
                     [&](const auto& e) { return dominates(e, newElement); })) {
        frontier.erase(std::remove_if(frontier.begin(), frontier.end(),
                           [&](const auto& e) { return dominates(newElement, e); }),
                       frontier.end());
        frontier.push_back(newElement);
    }
}
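For completeness, the snippet assumes an Element type and a free dominates function along the lines of the question's Edit3, for example:

struct Element {
    int a, b, c;
};

// Dominance as defined in the question's Edit3.
bool dominates(const Element& x, const Element& y) {
    return x.a >= y.a && x.b >= y.b && x.c >= y.c &&
           (x.a > y.a || x.b > y.b || x.c > y.c);
}

With that (plus <vector> and <algorithm> included), the insert above compiles as written (C++14 or later for the generic lambdas).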
Upvotes: 1
Reputation: 11937
The trivial approach first, so we can see what we are trying to improve and validate that this answer is actually relevant to the problem.
// The dominance condition, taken from the question and translated.
let x_doms_y x y =
    let (a,b,c) = x
    let (e,f,g) = y
    a >= e && b >= f && c >= g &&
    (a > e || b > f || c > g)
The naive approach requires O(n) dominance tests per insertion: we check whether the new item is dominated by an existing element and filter out the existing elements that are dominated by the new item. The following shows this naive O(n) solution, which we subsequently try to improve.
type Triplet = int * int * int
type TripletSet = Triplet list

module Naive =
    // Reject t if it is dominated by an existing element;
    // otherwise drop everything t dominates and prepend t.
    let insert (t : Triplet) (tset : TripletSet) : TripletSet =
        if tset |> List.exists (fun u -> x_doms_y u t) then tset
        else t :: (tset |> List.filter (fun u -> not (x_doms_y t u)))
Starting with an empty list, then subsequently adding one triplet after the next yields:
let foo =
    [] |> show
    |> Naive.insert (1,2,3) |> show
    |> Naive.insert (3,1,1) |> show
    |> Naive.insert (3,3,3) |> show
> []
[(1, 2, 3)]
[(3, 1, 1); (1, 2, 3)]
[(3, 3, 3)]
This appears to meet the expectations.
To improve the speed, we try to get the number of dominance comparisons below n, at least on average. (The insertion cost of the chosen data structure, which can also be relevant, shall not be considered here.)
The problem can be interpreted in a geometric sense: a triplet, e.g. (1,2,3), can be seen as a vector from one corner of a cube, located at (0,0,0), to its diagonally opposite corner.
Can a cube with a smaller volume ever dominate a larger cube? The answer is no, assuming all components are non-negative.
We can show this by analogy with the 1-dimensional equivalent: if x < y, then x cannot dominate y, because to dominate it would have to hold that x >= y && x > y. A similar equivalence can be conceived for 2 dimensions, and it holds in the same sense for our triplets.
Now we have narrowed our search space: those items in the existing set that have a smaller volume than the new triplet may, but need not, be dominated by it; those with a larger volume cannot be dominated by it.
Hence, an improved approach would be to keep the set ordered by volume and to restrict the dominance checks to the relevant side of the new triplet's volume.
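Staying in the question's C++ setting, a rough sketch of that idea might look like this (Triple, volume, dominates and insert are illustrative names; again, non-negative components are assumed, otherwise the volume ordering is not consistent with dominance):

#include <algorithm>
#include <array>
#include <vector>

using Triple = std::array<int, 3>;

bool dominates(const Triple& x, const Triple& y) {
    return x[0] >= y[0] && x[1] >= y[1] && x[2] >= y[2] &&
           (x[0] > y[0] || x[1] > y[1] || x[2] > y[2]);
}

long long volume(const Triple& t) { return 1LL * t[0] * t[1] * t[2]; }

// The frontier is kept sorted by volume, ascending.
void insert(std::vector<Triple>& frontier, const Triple& t) {
    const long long v = volume(t);
    // Only elements with volume >= v can possibly dominate t.
    auto lo = std::lower_bound(frontier.begin(), frontier.end(), v,
        [](const Triple& e, long long value) { return volume(e) < value; });
    if (std::any_of(lo, frontier.end(),
                    [&](const Triple& e) { return dominates(e, t); }))
        return;
    // Only elements with volume <= v can possibly be dominated by t.
    auto hi = std::upper_bound(frontier.begin(), frontier.end(), v,
        [](long long value, const Triple& e) { return value < volume(e); });
    auto cut = std::remove_if(frontier.begin(), hi,
        [&](const Triple& e) { return dominates(t, e); });
    frontier.erase(cut, hi);
    // Insert t at its volume position to keep the order.
    frontier.insert(std::lower_bound(frontier.begin(), frontier.end(), v,
        [](const Triple& e, long long value) { return volume(e) < value; }), t);
}

Whether this bookkeeping pays off for a frontier of only 4-20 elements is debatable, as another answer here points out.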
Upvotes: 1
Reputation: 3579
Naive solution (C++'ish pseudocode):
We store the elements in a vector vec. Then insertion might look like this:
void insert(const auto& e) {
    for (auto it = vec.begin(); it != vec.end(); ) {
        if (e.dominates(*it)) {
            it = vec.erase(it);   // drop the element dominated by e
        } else if (it->dominates(e)) {
            return;               // e itself is dominated, keep the set as is
        } else {
            ++it;
        }
    }
    vec.push_back(e);
}
This code would have to be polished further to remove elements efficiently, but I suspect this isn't the best we can get, since we have to call dominates for every element (whenever the inserted element is not dominated), while in my two-criteria solution I only had to check up to the first dominated element already in the set and the rest was nicely eliminated.
Upvotes: 0
Reputation: 17605
If I understood the question correctly, you are looking for a suitable lexicographical order for triplets of integers. However, it is unclear why you would like to store the Pareto frontier in a sorted manner, as you state you would like to iterate in no particular order. Perhaps a set (which is implemented in the standard library) is already sufficient.
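For what it's worth, std::tuple already compares lexicographically, so a set of triples needs no custom comparator; a tiny sketch (note that this alone does not discard dominated elements on insertion):

#include <set>
#include <tuple>

int main() {
    std::set<std::tuple<int, int, int>> bag;   // ordered lexicographically
    bag.emplace(3, 1, 1);
    bag.emplace(1, 2, 3);
    // Iteration visits (1,2,3) first, then (3,1,1).
}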
Upvotes: 0