Star

Reputation: 2299

Find whether each row in Matlab is repeated or not

I have a matrix A in Matlab of dimension MxN. I want to create a vector B of dimension Mx1 where B(i)=1 if row A(i,:) appears exactly once in A, and 0 otherwise.

For example

A=[1 2 3;
   4 5 6; 
   1 2 3;
   7 8 9];

B=[0;1;0;1];

This code

[vu,~,vx]=unique(A,'rows');
n=accumarray(vx,1);
C=[vu n]

helps to find the number of occurrences per row. Hence, by adding a loop for example, I should be able from C to get B as desired. However, in my actual case M is very large (80000). Is there anything quicker I could use?

Upvotes: 3

Views: 83

Answers (1)

Suever

Reputation: 65430

The issue with your code is that the second output of unique returns only the index of the first occurrence of each unique row, and the third output returns, for each row of the input, an index into the first output. Neither of these directly gives you what you want, but combining them does:

[~, a, b] = unique(A, 'rows');
B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
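To see why combining the two outputs works, here is a minimal trace on the example matrix from the question. The intermediate names rep and counts are only illustrative and are not part of the snippet above.

```matlab
% Worked trace on the example matrix from the question.
A = [1 2 3;
     4 5 6;
     1 2 3;
     7 8 9];

% a: row index in A of one representative per unique row
% b: for each row of A, the index of its unique row
[~, a, b] = unique(A, 'rows');

% a(b) maps every row of A to the representative of its group,
% so duplicate rows all point at the same row index.
rep = a(b);

% Count how many rows share each representative. Row indices that
% are never used as a representative get a count of 0.
counts = accumarray(rep, 1, [size(A, 1), 1]);

% A row is unique exactly when it is its own representative
% and nothing else points at it, i.e. its count is 1.
B = counts == 1;
```

For this A, B comes out as [0;1;0;1], matching the expected result in the question. Whether a holds the first or the last occurrence of each row depends on the MATLAB release and flags, but B is the same either way, since a duplicated row ends up with a count of either 0 or at least 2.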

An alternative is to use the second output of ismember with the 'rows' option, which gives, for each row of A, the index of a matching row in A. You can then use accumarray to count how many rows map to each index and compare the result to 1:

[~, bi] = ismember(A, A, 'rows');
B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
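As a small illustration of the ismember approach, the sketch below uses a made-up matrix (not from the question) in which one row appears three times, to show that rows repeated more than twice are also flagged as non-unique.

```matlab
% Hypothetical test matrix: row [1 2] appears three times.
A = [1 2;
     3 4;
     1 2;
     1 2;
     5 6];

% bi(i) is the index of a row in A that equals A(i,:);
% all copies of the same row share the same index.
[~, bi] = ismember(A, A, 'rows');

% Count how many rows map to each index and keep those hit exactly once.
B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
```

Here B is [0;1;0;0;1]: only rows 2 and 5 occur exactly once.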

In a simple benchmark, the option using unique tends to be the faster of the two:

function comparison()

    nRows = round(linspace(100, 100000, 30));

    times1 = nan(size(nRows));
    times2 = nan(size(nRows));

    for k = 1:numel(nRows)
        A = randi(10, nRows(k), 4);
        times1(k) = timeit(@()option1(A));
        times2(k) = timeit(@()option2(A));
    end

    figure
    p(1) = plot(nRows, times1 * 1000, 'DisplayName', 'unique');
    hold on
    p(2) = plot(nRows, times2 * 1000, 'DisplayName', 'ismember');
    ylabel('Execution time (ms)')
    xlabel('Rows in A')
    legend(p)
end


function B = option1(A)
    [~, a, b] = unique(A, 'rows');
    B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
end

function B = option2(A)
    [~, bi] = ismember(A, A, 'rows');
    B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
end

[Figure: execution time (ms) versus rows in A for the unique and ismember options]
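Before reading much into the timings, it may be worth confirming that both options return the same answer on random input. This is a minimal check, assuming the same kind of test data as the benchmark; the fixed seed and the size of 1000 rows are arbitrary choices made here.

```matlab
% Sanity check: both approaches should agree on random data.
rng(0);                      % fixed seed so the check is repeatable
A = randi(10, 1000, 4);

% Option 1: unique
[~, a, b] = unique(A, 'rows');
B1 = accumarray(a(b), 1, [size(A, 1), 1]) == 1;

% Option 2: ismember
[~, bi] = ismember(A, A, 'rows');
B2 = accumarray(bi, 1, [size(A, 1), 1]) == 1;

% Errors if the two results differ anywhere.
assert(isequal(B1, B2), 'unique and ismember options disagree');
```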

Upvotes: 10
