Reputation: 23

Aggregating data based on first column values

I would like to know how to aggregate data based on the first column. Here's an exmple:

My current data in matrix:

c1
1  1 2 3 4 5 6
1  1 2 3 4 5 6
1  1 2 3 4 5 6
2  1 2 3 4 5 6
3  1 2 3 4 5 6
3  1 2 3 4 5 6

result data:

c1
1  3 6 9 12 15 18
2  1 2 3 4  5  6
3  2 4 6 8  10 12

Do you know any good functions?

Upvotes: 0

Answers (4)

user3862410

Reputation: 171

If there is only a few unique numbers in the first column of the input data, then, I would prefer following code

c1 = [1  1 2 3 4 5 6
1  1 2 3 4 5 6
1  1 2 3 4 5 6
2  1 2 3 4 5 6
3  1 2 3 4 5 6
3  1 2 3 4 5 6];

uidx= unique(c1(:,1));

c2= zeros(length(uidx), size(c1,2));

for ii = 1:length(uidx)
   idx = (c1(:,1)==uidx(ii));
   c2(ii,:) = [uidx(ii) sum(c1(idx,2:end),1)];
end

disp(c2)
     1     3     6     9    12    15    18
     2     1     2     3     4     5     6
     3     2     4     6     8    10    12

Upvotes: 0

obchardon

Reputation: 10792

Here is another approach using linear algeabra:

% Example data
c1  = [1 1 2 3 4 5 6
       1 1 2 3 4 5 6
       1 1 2 3 4 5 6
       2 1 2 3 4 5 6
       3 1 2 3 4 5 6
       3 1 2 3 4 5 6]

% We split the c1 matrix in two parts and calculate its size along the first dimension
A = c1(:,1);
B = c1(:,2:end);
n = size(c1,1);

% We can use sparse(A,1:n,1) to create our matrix:
% sparse([1 1 1 0 0 0
%         0 0 0 1 0 0
%         0 0 0 0 1 1])
% Then using a matrix multiplication we obtain the result:
res  = sparse(A,1:n,1)*B

%If you want to preserve the indices
%res= [[1:max(A(:,1))].',sparse(A,1:n,1)*B]

Upvotes: 1

Luis Mendo

Reputation: 112689

You can use unique and splitapply as follows:

c1 = [1  1 2 3 4 5 6
      1  1 2 3 4 5 6
      1  1 2 3 4 5 6
      2  1 2 3 4 5 6
      3  1 2 3 4 5 6
      3  1 2 3 4 5 6]; % data
[u, ~, w] = unique(c1(:,1)); % u: unique elements; w: integer that indicates the group
result = [u splitapply(@(x)sum(x,1), c1(:,2:end), w)]; % sum over 1st dim within a group

Upvotes: 2

HansHirse

Reputation: 18905

Here's my approach:

Find all unique indices from the first column using unique.
Then I use arrayfun to sum all rows for a specific index. Sometimes, arrayfun is perceived as some loop in disguise, so looping would be fine, too. (See also sum and find array elements that meet a condition.)
At last, I set up the desired output format.

Please see the following code:

% Input data.
c1 = [
  1 1 2 3 4 5 6
  1 1 2 3 4 5 6
  1 1 2 3 4 5 6
  2 1 2 3 4 5 6
  3 1 2 3 4 5 6
  3 1 2 3 4 5 6
]

% Get unique indices from first column.
idx = unique(c1(:, 1));

% For all unique indices calculate sum over all rows with specified index.
result = arrayfun(@(x) sum(c1((c1(:, 1) == x), 2:end), 1), idx, 'UniformOutput', false);

% Set up proper output.
result = [idx, vertcat(result{:})]


c1 =
   1   1   2   3   4   5   6
   1   1   2   3   4   5   6
   1   1   2   3   4   5   6
   2   1   2   3   4   5   6
   3   1   2   3   4   5   6
   3   1   2   3   4   5   6

result =
    1    3    6    9   12   15   18
    2    1    2    3    4    5    6
    3    2    4    6    8   10   12

Hope that helps!

Upvotes: 1

Aggregating data based on first column values

Answers (4)

Related Questions