Reputation: 1130
Many threads here show that accumarray is the answer in MATLAB for grouping values by index sets and computing on them. Since this works fast and well, I need something similar for bigger (N-D) data fields.
Let's assume an example: we have a names vector with (non-unique) names and a data matrix of earnings, with one row per name entry and one column per project:
names=str2mat('Tom','Sarah','Tom','Max','Max','Jon');
earnings=[100 200 20;20 100 500;1 5 900; 100 200 200;200 200 0; 300 100 -250];
and now we want to calculate the column-sums for each name.
Okay, we can find out the indices by
[namesuq,i1,i2]=unique(names,'rows')
but after that, the obvious call
accumarray(i2,earnings)
is not working.
One could of course use a for-loop over the unique names or the rows, but that might be a little bit inefficient. Are there better ideas?
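For reference, a minimal sketch of that loop baseline might look like the following. The variable name sums is only illustrative, and char is used here in place of str2mat to build the padded name matrix.

```matlab
% Sample data from above (char pads the names to equal length)
names = char('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250];

% Group index per row: i2(k) is the row of namesuq that names(k,:) maps to
[namesuq, ~, i2] = unique(names, 'rows');

% One pass per unique name, summing the matching rows column-wise
sums = zeros(size(namesuq, 1), size(earnings, 2));   % illustrative name
for k = 1:size(namesuq, 1)
    sums(k, :) = sum(earnings(i2 == k, :), 1);
end
```

The loop runs once per unique name, not once per row, so its cost grows with the number of groups.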
Additionally I tried
accumarray(i2,@(i)earnings(i,:))
but this is not implemented and results in
Error using accumarray
Second input VAL must be a full numeric, logical, or char vector or scalar.
Thanks for ideas.
Additions: Thanks to eitan-t for his solution, which is great for the example.
Unfortunately, my minimal working example did not show all my needs: the function I want to apply needs a whole row, or later maybe a complete matrix, which I need to group over a 3rd or even higher dimension.
To make that clearer: think of a matrix M of size a x b x c, where each entry along the first dimension (a) corresponds to a name or similar. What I need is, for example, to sum over all entries that share the same name.
Naive programming would be
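[nam, ~, idx] = unique(names, 'rows');   % idx(k) is the group of row k
N = zeros([size(nam, 1), size(M, 2), size(M, 3)]);
for ind = 1:size(nam, 1)
    % sum all slices of M whose name matches the ind-th unique name
    N(ind, :, :) = sum(M(idx == ind, :, :), 1);
end
Is this clear? Is there a solution for this?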
Upvotes: 1
Views: 1019
Reputation: 32930
Based on this answer, the solution you're looking for is:
[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:));
where the rows of y match names(i1, :).
%// Sample input
names = str2mat('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250]
%// Sum along columns by names
[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:))
The resulting sums are:
y =
300 100 -250
300 400 200
20 100 500
101 205 920
which correspond to names(i1, :):
ans =
Jon
Max
Sarah
Tom
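For the a x b x c case from the question's addendum, one possible sketch (not part of the solution above) is to pass row indices, rather than data, to accumarray and let the function handle slice M itself. The array M below is made-up test data, and g, parts and N are illustrative names. Wrapping each result in a cell is what allows a non-scalar result per group.

```matlab
% Sample names from above; M is a hypothetical 6 x 3 x 2 test array
names = char('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250];
M = cat(3, earnings, 2 * earnings);

% g(k) is the group (row of namesuq) that names(k,:) belongs to
[namesuq, ~, g] = unique(names, 'rows');

% The values are row indices, so each call of the function handle receives
% the row indices of one group and reduces the matching slices of M.
% The cell wrapper lets each group return a 1 x b x c block.
parts = accumarray(g(:), (1:numel(g))', [], @(idx) {sum(M(idx, :, :), 1)});

% Stack the per-group blocks along the first dimension: nGroups x b x c
N = cat(1, parts{:});
```

Any function that reduces a block M(idx, :, :) to a 1 x b x c result could stand in for sum here. accumarray does not guarantee the order of idx within a group, so an order-sensitive function would likely need to sort idx first.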
Upvotes: 2