Reputation: 2299
I have a matrix A in Matlab of dimension MxN. I want to create a vector B of dimension Mx1 where B(i)=1 if row A(i,:) occurs exactly once in A, and 0 otherwise.
For example:
A=[1 2 3;
4 5 6;
1 2 3;
7 8 9];
B=[0;1;0;1];
This code
[vu,~,vx]=unique(A,'rows');
n=accumarray(vx,1);
C=[vu n]
helps to find the number of occurrences of each unique row. Hence, by adding a loop for example, I should be able to get B from C as desired. However, in my actual case M is very large (80000). Is there anything quicker I could use?
Upvotes: 3
Views: 83
Reputation: 65430
The issue with your code is that the second output of unique only returns the first occurrence of each unique value, and the third output returns an array of indices into the first output that correspond to the input. Neither of these alone directly gets you what you want, but combining them does:
[~, a, b] = unique(A, 'rows');
B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
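To make the indexing step concrete, here is a small sketch that traces the two lines above on the matrix from the question. The variable names firstIdx and counts are introduced here for illustration only and are not part of the original answer.

```matlab
% Worked example using the matrix from the question.
A = [1 2 3;
     4 5 6;
     1 2 3;
     7 8 9];

% a: for each unique row, one row index in A where it occurs.
% b: for every row of A, the index of its unique row.
[~, a, b] = unique(A, 'rows');

% firstIdx (illustrative name): every row of A is mapped to the
% representative row index that unique stored for its group.
firstIdx = a(b);

% Count how many rows map to each index. Rows that are not the
% representative of their group never appear in firstIdx and get 0.
counts = accumarray(firstIdx, 1, [size(A, 1), 1]);

% Only rows that occur exactly once end up with a count of 1.
B = counts == 1;   % B is [0; 1; 0; 1] for this A, as in the question
```

Duplicated rows end up with a count of either 0 or at least 2, so comparing against 1 singles out the rows that are not repeated.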
Another alternative is to use the second output of ismember with the 'rows' option to find the indices of the shared rows. You can then use accumarray to determine how many times a row is repeated and compare the result to 1:
[~, bi] = ismember(A, A, 'rows');
B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
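As a quick sanity check, the two approaches can be run side by side and compared. This is only a sketch: the names B1, B2 and R, and the small random test matrix, are assumptions made for illustration.

```matlab
% Matrix from the question.
A = [1 2 3; 4 5 6; 1 2 3; 7 8 9];

% Approach 1: combine the second and third outputs of unique.
[~, a, b] = unique(A, 'rows');
B1 = accumarray(a(b), 1, [size(A, 1), 1]) == 1;

% Approach 2: second output of ismember with the 'rows' option.
[~, bi] = ismember(A, A, 'rows');
B2 = accumarray(bi, 1, [size(A, 1), 1]) == 1;

% Both should flag exactly the rows that occur once.
assert(isequal(B1, B2));

% Same check on a random matrix with many repeated rows.
R = randi(3, 50, 2);
[~, a, b] = unique(R, 'rows');
[~, bi] = ismember(R, R, 'rows');
assert(isequal(accumarray(a(b), 1, [size(R, 1), 1]) == 1, ...
               accumarray(bi,   1, [size(R, 1), 1]) == 1));
```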
In a simple benchmark, the option using unique tends to be the faster of the two:
function comparison()
    % Time both options for matrices with 100 to 100000 rows
    nRows = round(linspace(100, 100000, 30));
    times1 = nan(size(nRows));
    times2 = nan(size(nRows));

    for k = 1:numel(nRows)
        % Random integers in 1..10 across 4 columns
        A = randi(10, nRows(k), 4);
        times1(k) = timeit(@()option1(A));
        times2(k) = timeit(@()option2(A));
    end

    % Plot execution time in milliseconds against the number of rows
    figure
    p(1) = plot(nRows, times1 * 1000, 'DisplayName', 'unique');
    hold on
    p(2) = plot(nRows, times2 * 1000, 'DisplayName', 'ismember');
    ylabel('Execution time (ms)')
    xlabel('Rows in A')
    legend(p)
end

function B = option1(A)
    % unique-based approach
    [~, a, b] = unique(A, 'rows');
    B = accumarray(a(b), 1, [size(A, 1), 1]) == 1;
end

function B = option2(A)
    % ismember-based approach
    [~, bi] = ismember(A, A, 'rows');
    B = accumarray(bi, 1, [size(A, 1), 1]) == 1;
end
Upvotes: 10