Fracedo

Reputation: 626

How to efficiently get minimum and maximum values from each duplicate row in a matrix

I need to generate the matrix described below. How do I efficiently get the minimum and maximum value from each group of duplicate rows in a matrix?

Let's say I have the following data as a matrix:

A = 

    x        y      z 

   -250    -60    -60
   -250    -60    -59
   -250    -60    -58
   -250    -60     58
   -250    -60     59
   -250    -60     60
   -250    -59    -60
   -250    -59    -59
   -250    -59    -58
     .      .      .
     .      .      .
     .      .      .
   -250     59     58
   -250     59     59
   -250     59     60
   -250     60     58
   -250     60     59
   -250     60     60

By duplicate rows, I mean multiple rows that share the same values of x and y. For each such group I want the min and max of z, so that I can generate the following matrix:

      x        y        z 


   -250    -60     abs(min(z) - max(z)) % for all x = -250 and y = -60
   -250    -59     abs(min(z) - max(z)) % for all x = -250 and y = -59

How can I efficiently generate this matrix?

Upvotes: 2

Views: 190

Answers (2)

rayryeng

Reputation: 104484

This is a classic case for accumarray combined with unique. An added benefit of this approach is that it does not require R2015b or later, although the findgroups / splitapply answer is an equally good way to solve the problem.

Use unique to find all unique rows of the x and y columns simultaneously. You'll want the first output, which holds these unique rows, and the third output, which assigns each row an integer ID identifying the group it belongs to. You can then use accumarray with a custom function that finds the difference between the largest and smallest values among the values that belong to the same group.

[Au, ~, ID] = unique(A(:,1:2), 'rows');
out = accumarray(ID, A(:,3), [], @(z) abs(min(z) - max(z)));
out = [Au out];

The first line of code finds all unique rows using the first two columns of A. The first output, Au, is a matrix containing only the unique rows of those two columns. The third output, ID, assigns an integer to each row of A such that rows with the same x and y values get the same group number. We then call accumarray with these IDs as the grouping and the third column of A as the values: all third-column values that map to the same ID are collected and passed to a function. In our case, that function is your abs(min(z) - max(z)), where z is the array of third-column values belonging to one ID.

Finally, we concatenate the unique rows with the output that accumarray gives us to produce your expected output. Given your rather limited example, we get the following output in MATLAB:
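To make the roles of Au and ID concrete, here is a minimal sketch on a tiny made-up matrix. The four rows are illustrative only and are not the question's full data.

```matlab
% Tiny illustrative matrix (made-up values; columns are x, y, z)
A = [-250 -60 -60;
     -250 -60  60;
     -250 -59 -60;
     -250 -59 -58];

[Au, ~, ID] = unique(A(:,1:2), 'rows');
% Au holds the sorted unique (x, y) pairs:
%   -250   -60
%   -250   -59
% ID maps each row of A to its row in Au: [1; 1; 2; 2]

% Collect z per group and reduce each group to a single number
out = accumarray(ID, A(:,3), [], @(z) abs(min(z) - max(z)));
% out = [120; 2]
```

Rows 1 and 2 share ID 1, so their z values (-60 and 60) are reduced together, and likewise for rows 3 and 4.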

>> out

out =

  -250   -60   120
  -250   -59     2
  -250    59     2
  -250    60     2

We can verify that this works: for the group mapped to x = -250, y = -60, the smallest and largest z values are -60 and 60, so the absolute difference between them is 120. The same check can be done for the rest of the unique rows.
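Since the question asks about efficiency, one possible variant (a sketch, not part of the original answer) is to call accumarray twice with the built-in handles @max and @min instead of an anonymous function. Whether this is actually faster depends on your data and MATLAB version, so it is worth timing both. The matrix below contains only the 15 rows visible in the question; the elided rows are left out, and the names zmax, zmin and out2 are introduced here for illustration.

```matlab
% The 15 rows visible in the question (elided rows are not filled in)
A = [-250 -60 -60; -250 -60 -59; -250 -60 -58;
     -250 -60  58; -250 -60  59; -250 -60  60;
     -250 -59 -60; -250 -59 -59; -250 -59 -58;
     -250  59  58; -250  59  59; -250  59  60;
     -250  60  58; -250  60  59; -250  60  60];

[Au, ~, ID] = unique(A(:,1:2), 'rows');

% Per-group maximum and minimum of z, using built-in function handles
zmax = accumarray(ID, A(:,3), [], @max);
zmin = accumarray(ID, A(:,3), [], @min);

% max >= min within every group, so no abs is needed
out2 = [Au, zmax - zmin];
```

On these rows, out2 matches the output shown above.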

Upvotes: 5

craigim

Reputation: 3914

If you are using R2015b or later, use findgroups and splitapply:

[G, IDx, IDy] = findgroups(A(:, 1), A(:, 2));
Range = splitapply(@(z) abs(min(z) - max(z)), A(:, 3), G);

NewMatrix = [IDx, IDy, Range];

findgroups does just what its name says: it finds all groups that have unique combinations of the inputs. It returns a vector with the group number of each row, and it optionally returns the unique values that define each group (IDx and IDy, above).

splitapply then applies any function to the values in each group. So the function will be applied to everything in "group 1", then "group 2", etc., similar to how cellfun applies the same function to every element in a cell array.
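As a small self-contained illustration of what findgroups and splitapply return, here is a sketch on a made-up four-row matrix (illustrative values only, not the question's full data):

```matlab
% Tiny illustrative matrix (made-up values; columns are x, y, z)
A = [-250 -60 -60;
     -250 -60  60;
     -250 -59 -60;
     -250 -59 -58];

% One group per unique (x, y) combination
[G, IDx, IDy] = findgroups(A(:, 1), A(:, 2));
% G   = [1; 1; 2; 2]
% IDx = [-250; -250], IDy = [-60; -59]

% Apply the function once per group to the z values of that group
Range = splitapply(@(z) abs(min(z) - max(z)), A(:, 3), G);
% Range = [120; 2]

NewMatrix = [IDx, IDy, Range];
```

Here the function returns one scalar per group, which is what lets the results be concatenated directly with IDx and IDy.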

Upvotes: 3
