Reputation: 23
I would like to know how to aggregate data based on the first column. Here's an exmple:
My current data in matrix:
c1
1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6
result data:
c1
1 3 6 9 12 15 18
2 1 2 3 4 5 6
3 2 4 6 8 10 12
Do you know any good functions?
Upvotes: 0
Views: 78
Reputation: 171
If there is only a few unique numbers in the first column of the input data, then, I would prefer following code
c1 = [1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6];
uidx= unique(c1(:,1));
c2= zeros(length(uidx), size(c1,2));
for ii = 1:length(uidx)
idx = (c1(:,1)==uidx(ii));
c2(ii,:) = [uidx(ii) sum(c1(idx,2:end),1)];
end
disp(c2)
1 3 6 9 12 15 18
2 1 2 3 4 5 6
3 2 4 6 8 10 12
Upvotes: 0
Reputation: 10792
Here is another approach using linear algeabra:
% Example data
c1 = [1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6]
% We split the c1 matrix in two parts and calculate its size along the first dimension
A = c1(:,1);
B = c1(:,2:end);
n = size(c1,1);
% We can use sparse(A,1:n,1) to create our matrix:
% sparse([1 1 1 0 0 0
% 0 0 0 1 0 0
% 0 0 0 0 1 1])
% Then using a matrix multiplication we obtain the result:
res = sparse(A,1:n,1)*B
%If you want to preserve the indices
%res= [[1:max(A(:,1))].',sparse(A,1:n,1)*B]
Upvotes: 1
Reputation: 112689
You can use unique
and splitapply
as follows:
c1 = [1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6]; % data
[u, ~, w] = unique(c1(:,1)); % u: unique elements; w: integer that indicates the group
result = [u splitapply(@(x)sum(x,1), c1(:,2:end), w)]; % sum over 1st dim within a group
Upvotes: 2
Reputation: 18905
Here's my approach:
Find all unique indices from the first column using unique
.
Then I use arrayfun
to sum all rows for a specific index. Sometimes, arrayfun
is perceived as some loop in disguise, so looping would be fine, too. (See also sum
and find array elements that meet a condition.)
At last, I set up the desired output format.
Please see the following code:
% Input data.
c1 = [
1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6
]
% Get unique indices from first column.
idx = unique(c1(:, 1));
% For all unique indices calculate sum over all rows with specified index.
result = arrayfun(@(x) sum(c1((c1(:, 1) == x), 2:end), 1), idx, 'UniformOutput', false);
% Set up proper output.
result = [idx, vertcat(result{:})]
c1 =
1 1 2 3 4 5 6
1 1 2 3 4 5 6
1 1 2 3 4 5 6
2 1 2 3 4 5 6
3 1 2 3 4 5 6
3 1 2 3 4 5 6
result =
1 3 6 9 12 15 18
2 1 2 3 4 5 6
3 2 4 6 8 10 12
Hope that helps!
Upvotes: 1