scenia
scenia

Reputation: 1629

Vectorizing a pareto front algorithm

First of all, here's my setup:

The task is to find the pareto optimal set of value pairs (x,y), that is, all pairs that are not dominated. A pair is called dominated if there exists another pair (u,v) such that u <= x && v <= y and one of these comparisons is strict: u < x || v < y. In other words, a pair is dominated if another pair is better in one value and not worse in the other.

My research up to this point has yielded three working algorithms, that are unfortunately all reliant on loops. Here's how they work and what time I got for running them with x, y and a of length 1e8:

  1. Sort x in ascending order. Add the first pair to the pareto set.
  2. Loop through x. Add each pair to the pareto set where y is lower than the previous pareto pair's y.

Elapsed time is 80.204052 seconds.

 

  1. Find min(x). Add that pair to the pareto set.
  2. Select all pairs where y is lower than the previously added pair's y.
  3. Go to step 1 unless step 2 resulted in the empty set.

Elapsed time is 2.993350 seconds.

 

  1. Loop through all pairs (x,y).
  2. Remove all pairs (u,v) with x >= u && y >= v.

Elapsed time is 105.924814 seconds.

Now what I'm attempting to do is create a vectorized algorithm. It doesn't have to be based on one of the above, but I wasn't able to find any other working algorithm. The best I could do is this:

ap = a(y < min(y(x == min(x))) | x < min(x(y == min(y))));

which does usually find all pareto optimal pairs, but includes all pairs not dominated by the pair at min(x) or min(y), even if one dominates another. I say usually because it fails entirely if there is only one globally optimal pair that dominates every other pair. Replacing the < with <= cures the second problem, but finds even more dominated pairs (those that have only one worse value). I also ran this through the same timer as above:

Elapsed time is 0.800385 seconds.


Here's a test script I'm using to check how the algorithm does, feel free to use it

for i=1:25
    x = randi(8,10,1);
    y = randi(8,10,1);
    a = 1:10;
    ap = a(y < min(y(x == min(x))) | x < min(x(y == min(y)))); %// algorithm here
    figure(1);
    subplot(5,5,i);
    plot(a,x,'b',a,y,'r',ap,x(ap),'b.',ap,y(ap),'r.','MarkerSize',20);
    axis([0,11,0,9]);
    set(gca,'XGrid','on','YGrid','on','XTick',1:10,'YTick',0:8);
    figure(2);
    subplot(5,5,i);
    plot(x,y,'b.',x(ap),y(ap),'ro','MarkerSize',10);
    axis([0,9,0,9]);
end

Upvotes: 4

Views: 1300

Answers (1)

A Hernandez
A Hernandez

Reputation: 484

So, if speed is the primary characteristic (after correctness), then I found a recursive version of the faster loop version to be over 30% faster:

>> testPareto(1e8);
Recursive:
Elapsed time is 4.507267 seconds.
Loop:
Elapsed time is 6.136856 seconds.
Vector:
Elapsed time is 7.246806 seconds.

Again, the time depends on the machine and it might even depend on matlab's version. Here is the code:

function testPareto(dim)

x = rand(dim, 1);
y = rand(dim, 1);

tic;
rR = paretoRecursive(x, y);
disp('Recursive:');
toc;

tic;
rL = paretoLoop(x, y);
disp('Loop:');
toc;

tic;
rV = paretoVector(x, y);
disp('Vector:');
toc;

end

function result = paretoLoop(x, y)
    result = zeros(numel(x), 2);
    off = 1;
    loop = true;
    while loop
        xmin = min(x);
        ymin = min(y(x == xmin));
        yfilter = y < ymin;
        result(off, :) = [xmin ymin];
        off = off + 1;
        if any(yfilter)
            x = x(yfilter);
            y = y(yfilter);
        else
            loop = false;
            result(off:end, :) = [];
        end
    end
end

function result = paretoRecursive(x, y)
    xmin = min(x);
    ymin = min(y(x == xmin));
    yfilter = y < ymin;
    if any(yfilter)
        result = [xmin ymin; paretoRecursive(x(yfilter), y(yfilter))];
    else
        result = [xmin ymin];
    end
end

function result = paretoVector(x, y)
    xmin = min(x);
    xfilter = x == xmin;
    ymin = min(y(xfilter));
    yfilter = y < ymin;
    if any(yfilter)
        [x, ind] = sort(x(yfilter));
        y = y(yfilter);
        y = y(ind);
        yfilter = [true; y(2:end) < cummin(y(1:end-1))];
        result = [xmin x(yfilter)'; ymin y(yfilter)']';
    else
        result = [xmin ymin];
    end
end

Upvotes: 1

Related Questions