saastn
saastn

Reputation: 6015

How to find number of unique elements in each row of a matrix?

I'm trying to implement some sort of interpolation algorithm. I is an N*4 matrix, holding indices of surrounding points of N other points. But elements in each row of I may not be unique, meaning two or more of them may refer to an individual point. I want to know how many unique indices are available in each row, and I want to do it as fast as possible since N is big!

Upvotes: 2

Views: 7719

Answers (3)

saastn
saastn

Reputation: 6015

Well, Mohsen's answer is a general solution for this problem, but arrayfun was too slow for me. So I thought a little more about it and found a much faster solution. I compare all pairs of columns and increase a counter if they were equal:

tic;
S = zeros(N, 1, 'uint32');
Nu = S+4; % in my case most of point are surrounded by four different points
for i=1:3
    for j=(i+1):4
        S = S + uint32(I(:, i)==I(:, j));
    end
end
% Nu(S==0) = 4;
Nu(S==1) = 3;
Nu((S==2)|(S==3)) = 2; % why? :)
Nu(S==6) = 1;
toc;

For N=189225, arrayfun takes 14.73s on my PC but summation takes only 0.04s.

Edit: Take care of different numbers of columns

Here's a modification of the code above. Now we can also have the places of unique values in each row! This one hasn't the :) problem and can be used for higher numbers of columns. Still taking 0.04s on my PC for 189225 rows.

tic;
uniq = true(N, 4);
for i=1:3
    for j=(i+1):4
          uniq(I(:, i)==I(:, j), j) = false;
    end
end
Nu = sum(uniq, 2);
toc;

Edit(2): Comparison with EBH's answer

After a while I needed this for another problem where I wanted number of unique elements in each row of matrices with different numbers of columns. So I compared my code with EBH's to see if their code is faster. I ran both codes on matrices with rows from 10K to 100K, and columns from 6 to 60. The results are average of spent time (in seconds) of 3 different runs:

enter image description here

I'm testing this in 2016a and there has been a significant improvement in performance of for-loops in latest versions of MATLAB. So you may need to to compare it yourself if you want to run it in older versions.

Upvotes: 1

EBH
EBH

Reputation: 10440

Here is a super fast way to do that without loops:

accumarray(repmat(1:size(I,1),1,size(I,2)).',I(:),[],@(x) numel(unique(x)))

This will give you a vector size N, where the element in place k is the number of unique elements in I(k,:).

Upvotes: 1

Mohsen Nosratinia
Mohsen Nosratinia

Reputation: 9864

You need to use unique function on each row and count the elements of the result:

arrayfun(@(x) numel(unique(I(x,:))), (1:size(I,1)).')

The index array is transposed so that the result is a column vector.

Upvotes: 2

Related Questions