Bastian Ebeling

Reputation: 1130

Grouping and summing (evaluating functions) on matrix values in MATLAB

Many threads here show that accumarray is the answer in MATLAB for grouping (and calculating) values by index sets. It is fast and works well, but I need something similar for bigger (N-D) data fields.
Let's assume an example: we have a names vector containing (non-unique) names, and a data matrix of earnings with one column per project:

names=str2mat('Tom','Sarah','Tom','Max','Max','Jon');
earnings=[100 200 20;20 100 500;1 5 900; 100 200 200;200 200 0; 300 100 -250];

and now we want to calculate the column-sums for each name.
Okay, we can find out the indices by

[namesuq,i1,i2]=unique(names,'rows')

but after that, the obvious call

accumarray(i2,earnings)

does not work. One could of course use a for-loop over the unique names or over the rows, but that is likely to be inefficient. Are there better ideas?
Additionally I tried

accumarray(i2,@(i)earnings(i,:))

but this is not implemented and results in

Error using accumarray
Second input VAL must be a full numeric, logical, or char vector or scalar.
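
The message reflects that accumarray takes only a vector (or scalar) as VAL, so a matrix has to be passed one column at a time or flattened first. A minimal sketch of the column-at-a-time variant, assuming the names and earnings from above; nGroups, col and sums are illustrative names, and char is used here in place of str2mat to build the padded name matrix:

```matlab
% Sample data from the question
names = char('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250];

% i2 maps every row of names to its row in namesuq
[namesuq, i1, i2] = unique(names, 'rows');
nGroups = size(namesuq, 1);

% One accumarray call per column, since VAL must be a vector
sums = zeros(nGroups, size(earnings, 2));
for col = 1:size(earnings, 2)
    sums(:, col) = accumarray(i2(:), earnings(:, col), [nGroups, 1]);
end
```

This loops over the columns rather than over the names, so it only helps when there are few columns compared to groups.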

Thanks for ideas.

Additions: Thanks to eitan-t for his solution, which works great for the example.
Unfortunately, my minimal working example did not show all of my needs: the function I want to apply needs a whole row, or later maybe a complete matrix, which I need to group over a 3rd or even higher dimension.

Maybe to make that clearer: think of a matrix M of size a x b x c, where each index along the first dimension (a) corresponds to a name. My need is best illustrated by summing over all entries that share the same name.
Naive programming would be

[nam,i1,i2]=unique(names,'rows');   % i2 maps each row of names to its unique name
N=zeros(size(nam,1),size(M,2),size(M,3));
for ind=1:size(nam,1)
    N(ind,:,:)=sum(M(i2==ind,:,:),1);
end

Is this clear? Are there solutions for this?

Upvotes: 1

Views: 1019

Answers (1)

Eitan T

Reputation: 32930

Based on this answer, the solution you're looking for is:

[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:));

where the rows match names(i1, :).

Example

%// Sample input
names = str2mat('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
earnings = [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250]

%// Sum along columns by names 
[namesuq, i1, i2] = unique(names, 'rows');
[c, r] = meshgrid(1:size(earnings, 2), i2);
y = accumarray([r(:), c(:)], earnings(:))

The resulting sums are:

y =
   300   100  -250
   300   400   200
    20   100   500
   101   205   920

which correspond to names(i1, :):

ans =
    Jon  
    Max  
    Sarah
    Tom  
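
For the N-D case described in the question's addendum (M of size a x b x c, grouped along the first dimension), the same subscript trick can probably be reused by flattening the trailing dimensions first. This is only a sketch under assumptions: the sample M below is made up for illustration, M2, N2 and sz are hypothetical helper names, char stands in for str2mat, and it covers summation only, not arbitrary functions of whole rows:

```matlab
% Hypothetical sample data: 6 names, M is 6 x 3 x 2 (a x b x c)
names = char('Tom', 'Sarah', 'Tom', 'Max', 'Max', 'Jon');
M = cat(3, ...
    [100 200 20; 20 100 500; 1 5 900; 100 200 200; 200 200 0; 300 100 -250], ...
    ones(6, 3));

[namesuq, i1, i2] = unique(names, 'rows');
sz = size(M);

% Flatten all trailing dimensions into columns: a x (b*c)
M2 = reshape(M, sz(1), []);

% One (group, column) subscript pair per element of M2
[c, r] = meshgrid(1:size(M2, 2), i2);
N2 = accumarray([r(:), c(:)], M2(:), [size(namesuq, 1), size(M2, 2)]);

% Restore the trailing dimensions: (number of unique names) x b x c
N = reshape(N2, [size(namesuq, 1), sz(2:end)]);
```

Here N(k, :, :) holds the sum over all slices of M whose name is namesuq(k, :), so N(:, :, 1) reproduces the y shown above and N(:, :, 2) simply counts the rows per name.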

Upvotes: 2
