Star
Star

Reputation: 2299

Construct a Matlab array through nested loops

Suppose I have three matrices A_1, A_2, A_3 each of dimension mxn, where m and n are large. These matrices contain strictly positive numbers.

I want to construct a matrix check of dimension mx1 such that, for each i=1,...,m:

check(i)=1 if there exists j,k such that

A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,3:end), 0.95)

In my case m is large (m=10^5) and n=500. Therefore, I would like your help to find an efficient way to do this.


Below I reproduce an example. I impose m smaller than in reality and report my incomplete and probably inefficient attempt to construct check.

clear
rng default
m=4;
n=500;
A_1=betarnd(1,2,m,n);
A_2=betarnd(1,2,m,n);
A_3=betarnd(1,2,m,n);
check=zeros(m,1);
for i=1:m
    for j=1:m
        for k=1:m
            if A_1(i,1)+A_2(j,1)+A_3(k,1)<=quantile(A_1(i,2:end)+A_2(j,2:end)+A_3(k,2:end), 0.95)
              check(i)=1;
              STOP TO LOOP OVER j AND k, MOVE TO THE NEXT i (INCOMPLETE!)              
           else
            KEEP SEARCHING FOR j,k SUCH THAT THE CONDITION IS SATISFIED (INCOMPLETE!) 
           end
       end
    end
end

Upvotes: 4

Views: 165

Answers (2)

rahnema1
rahnema1

Reputation: 15837

Given a scalar x and a vector v the expression x <=quantile (v, .95) can be written as sum( x > v) < Q where Q = .95 * numel(v) *.

Also A_1 can be splitted before the loop to avoid extra indexing. Moreover the most inner loop can be removed in favor of vectorization.

Af_1 = A_1(:,1);
Af_2 = A_2(:,1);
Af_3 = A_3(:,1);
As_1 = A_1(:,2:end);
As_2 = A_2(:,2:end);
As_3 = A_3(:,2:end);
Q = .95 * (n -1);
for i=1:m
    for j=1:m
        if any (sum (Af_1(i) + Af_2(j) + Af_3 > As_1(i,:) + As_2(j,:) + As_3, 2) < Q)
            check(i) = 1;
            break; 
        end             
    end
end

More optimization can be achieved by rearranging the expressions involved in the inequality and pre-computation:

lhs = A_3(:,1) - A_3(:,2:end);
lhsi = A_1(:,1) - A_1(:,2:end);
rhsj = A_2(:,2:end) - A_2(:,1);
Q = .95 * (n - 1);
for i=1:m
    LHS = lhs + lhsi(i,:);
    for j=1:m
        if any (sum (LHS > rhsj(j,:), 2) < Q)
            check(i) = 1;
            break; 
        end             
    end
end
  • Note that because of the method that is used in the computation of quantile you may get a slightly different result.

Upvotes: 4

Zizy Archer
Zizy Archer

Reputation: 1390

Option 1: Because all numbers are positive, you can do some optimizations. 95 percentile will be only higher if you add A1 to the mix - if you find the j and k of greatest 95 percentile of A2+A3 on the right side compared to the sum of the first 2 elements, you can simply take that for every i.

maxDif = -inf;
for j = 1 : m
   for k = 1 : m
      newDif = quantile(A_2..., 0.95) - A_2(j,1)-A_3(k,1);
      maxDif = max(newDif, maxDif);
   end
end

If even that is too slow, you can first get maxDifA2 and maxDifA3, then estimate that maxDif will be for those particular j and k values and calculate it.

Now, for some numbers you will get that maxDif > A_1, then the check is 1. For some numbers you will get that maxDif + quantile(A1, 0.95) < A_1, here check is 0 (if you estimated maxDif by separate calculation of A2 and A3 this isn't true!). For some (most?) you will unfortunately get values in between and this won't be helpful at all. Then what remains is option 2 (it is also more straightforward):

Option 2: You could save some time if you can save summation A_2+A_3 on the right side, as that calculation repeats for every different i, but that requires A LOT of memory. But quantile is the more expensive operation anyway, so you aren't saving a lot of time. Something along the lines of

for j = 1 : m
for k = 1 : m
A23R(j,k,:) = A2(j,:)+A3(k,:); % Unlikely to fit in memory.
end
end

Then you can perform your loops, using A23R and avoiding to repeat that sum for every i.

Upvotes: 0

Related Questions