Brethlosze
Brethlosze

Reputation: 1622

Pivot Table with Additional Columns

Let's have a 2D double array such as:

% Data: ID, Index, Weight, Category
A0=[1 1121 204 1;...
    2 2212 112 1;...
    3 2212 483 3;...
    4 4334 233 1;...
    5 4334 359 2;...
    6 4334 122 3 ];

I am needing to pivot / group by the rows with the highest Weights, for each given Index, which can be achieved with any Pivot Table | Group By functionality (such as pivottable, SQL GROUP BY or MS Excel PivotTable)

% Current Result
A1=pivottable(A0,[2],[],[3],{@max}); % Pivot Table
A1=cell2mat(A1); % Convert to array

>>A1=[1121 204;...
      2212 483;...
      4334 359 ]

How should I proceed if I need to recover also the ID and the Category columns?

% Required Result
>>A1=[1 1121 204 1;...
      3 2212 483 3;...
      5 4334 359 2 ];

The syntax is Matlab, but a solution involving other languages (Java, SQL) can be acceptable, due they can be transcribed into Matlab.

Upvotes: 1

Views: 120

Answers (1)

Luis Mendo
Luis Mendo

Reputation: 112759

You can use splitapply with an anonymous function as follows.

grouping_col = 2; % Grouping column
maximize_col = 3; % Column to maximize 
[~, ~, group_label] = unique(A0(:,grouping_col));
result = splitapply(@(x) {x(x(:,maximize_col)==max(x(:,maximize_col)),:)}, A0, group_label);
result = cell2mat(result); % convert to matrix

How it works: the anonymous function @(x) {x(x(:,maximize_col)==max(···),:)} is called by splitapply once for each group. The function is provided as input a submatrix containing all rows with the same value of the column with index grouping_col. What this function then does is keep all rows that maximize the column with index maximize_col, and pack that into a cell. The result is then converted to matrix form by cell2mat.


With the above solution, if there are several maximizing rows for each group all of them are produced. To keep only the first one, replace the last line by

result = cell2mat(cellfun(@(c) c(1,:), result, 'uniformoutput', false));

How it works: this uses cellfun to apply the anonymous function @(c) c(1,:) to the content of each cell. The function simply keeps the first row. Alternatively, to keep the last row use @(c) c(end,:). The result is then converted to matrix form using cell2mat again.

Upvotes: 2

Related Questions