Find the index of n unique rows with the lowest values

I have a 151-by-151 matrix A. It's a correlation matrix, so there are 1s on the main diagonal and repeated values above and below the main diagonal.

I'm looking for a way to obtain the indices of the n rows that are involved in the lowest values.

The number of rows I want to find is fixed at n, and those n rows must be unique.

So, for example let's say that:

n = 10

and the data are as follows:

[Image: part of the matrix]

Row 1 is involved in the lowest correlation (with row/column 6), and also the third lowest correlation (with row/column 9).

That means I've found the indices of three of the rows I need: 1, 6, and 9. However, I don't count row 1 twice, and hence I still need to find the indices of 7 more rows.

I've tried the approach

function [smallestNElements smallestNIdx] = getNElements(A, n)
     [ASorted AIdx] = sort(A);
     smallestNElements = ASorted(1:n);
     smallestNIdx = AIdx(1:n);
end

which I obtained here. However, I think that approach is fundamentally inapplicable because it's meant to apply to a vector. When I apply it to my 2D matrix it just gives the index of the lowest values in the first column.
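The column-wise behaviour can be reproduced on a small matrix. The sketch below uses a hypothetical 3-by-3 symmetric matrix (made up for illustration, not the data from the question) and the variable names firstColumnOnly and linIdx, which are also assumptions. It shows that sort on a matrix sorts each column separately, so indexing the result with 1:n only walks down the first column, whereas sorting the flattened matrix A(:) ranks every entry.

```matlab
% Hypothetical 3-by-3 symmetric matrix, for illustration only
A = [ 1.0  -0.9   0.2;
     -0.9   1.0  -0.1;
      0.2  -0.1   1.0];
n = 2;

% sort(A) sorts each column independently; AIdx holds row indices per column
[ASorted, AIdx] = sort(A);

% Linear indexing with 1:n walks down the first column only,
% so this yields the rows of the n smallest entries of column 1
firstColumnOnly = AIdx(1:n);

% Sorting the flattened matrix instead ranks every entry of A
[~, linIdx] = sort(A(:));
[r, c] = ind2sub(size(A), linIdx(1:n));
```

With this matrix, firstColumnOnly contains rows 2 and 3 (the two smallest entries of column 1), while the flattened sort returns the positions (2,1) and (1,2), which are the two copies of the lowest correlation.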

By lowest I mean lowest in signed value (most negative), not "closest to zero". Thus -0.9 is lower than -0.1, which in turn is lower than 0.05.

Upvotes: 0

Views: 77

Answers (2)

Matt

Reputation: 2802

Here is my solution:

[~, idx] = sort(sum(A));
result = idx(1:n);

This works by summing down each column of the matrix, which leaves a 1-by-(number of columns) vector. Because A is symmetric, the column sums equal the row sums. The vector is sorted from smallest to largest and the sort indices are retained. The first n indices are then kept as the result. Here is the result using your data for n = 4:

result = 
    1
    6
    9
    8
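As a rough illustration of the column-sum approach, the sketch below applies it to a hypothetical 4-by-4 correlation matrix (made up for illustration, not the data from the question). The name colTotals is also an assumption. Note that this ranks rows by their total correlation with all other rows, which is not necessarily the same as ranking them by their single lowest correlation.

```matlab
% Hypothetical 4-by-4 correlation matrix, for illustration only
A = [ 1.0  -0.9   0.8   0.1;
     -0.9   1.0   0.6   0.2;
      0.8   0.6   1.0  -0.5;
      0.1   0.2  -0.5   1.0];
n = 2;

% Sum down each column; for a symmetric matrix these equal the row sums
colTotals = sum(A);            % approximately [1.0 0.9 1.9 0.8]

% Sort the totals ascending and keep the indices of the n smallest
[~, idx] = sort(colTotals);
result = idx(1:n);
```

Here result is [4 2], because rows 4 and 2 have the smallest totals, even though the single lowest correlation (-0.9) is between rows 1 and 2.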

Upvotes: 1

Dan

Reputation: 45752

First, get rid of the repeated values (and the diagonal) by setting them to Inf so that they can't be mistaken for the lowest values:

A1 = A;
A1(triu(true(size(A)))) = Inf;   % mask the diagonal and the upper triangle

Now find the indices of the n smallest:

[~,idx] = sort(A1(:));
[r,c] = ind2sub(size(A), idx(1:n));

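That finds the n lowest correlations. If you instead want the n rows involved in the lowest correlations, without repeating any of them, then:

[~,idx] = sort(A(:));             % rank every entry of A, smallest first
[r,~] = ind2sub(size(A), idx);    % row index of each ranked entry
rows = unique(r,'stable');        % drop repeated rows, keeping rank order
result = rows(1:n)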
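Both steps can be tried on a small case. The sketch below uses a hypothetical 4-by-4 correlation matrix (made up for illustration, not the data from the question). The names pairRow and pairCol, the logical triu mask, and the min guard against n exceeding the number of rows are assumptions added here.

```matlab
% Hypothetical 4-by-4 correlation matrix, for illustration only
A = [ 1.0  -0.9   0.8   0.1;
     -0.9   1.0   0.6   0.2;
      0.8   0.6   1.0  -0.5;
      0.1   0.2  -0.5   1.0];
n = 3;

% Part 1: the n lowest correlations as (row, column) pairs.
% Mask the diagonal and upper triangle so each pair is counted once.
A1 = A;
A1(triu(true(size(A)))) = Inf;
[~, idx] = sort(A1(:));
[pairRow, pairCol] = ind2sub(size(A), idx(1:n));

% Part 2: the first n distinct rows involved in the lowest correlations.
% Sorting the full symmetric matrix lists both rows of every pair.
[~, idx] = sort(A(:));
[r, ~] = ind2sub(size(A), idx);
rows = unique(r, 'stable');               % keep first occurrence, in rank order
result = rows(1:min(n, numel(rows)));     % guard in case n exceeds the row count
```

For this matrix the three lowest pairs are (2,1), (4,3) and (4,1), and the first three distinct rows are 2, 1 and 4. Row 4 appears in two of the lowest pairs but is counted only once.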

Upvotes: 4
