Developer
Developer

Reputation: 178

numpy.ndarray sorting to return indices

errors = [[ 0.,  9., 12.,  9., 14.,  5.,  4., 10.,  8.,  8.,  6.,  5.,  9.],
   [ 9.,  0., 22., 16., 11., 12.,  9., 21., 14., 11., 16., 15.,  9.],
   [12., 22.,  0., 18., 23., 16., 10., 22., 21., 13., 13., 13., 15.],
   [ 9., 16., 18.,  0., 11., 12.,  8., 19., 20., 11.,  7.,  9., 13.],
   [14., 11., 23., 11.,  0., 11.,  7., 18.,  9., 10.,  7.,  7., 14.],
   [ 5., 12., 16., 12., 11.,  0.,  7., 13., 15.,  5.,  8., 10.,  9.],
   [ 4.,  9., 10.,  8.,  7.,  7.,  0.,  8.,  8.,  3.,  4.,  7.,  4.],
   [10., 21., 22., 19., 18., 13.,  8.,  0., 18., 12., 14., 13., 11.],
   [ 8., 14., 21., 20.,  9., 15.,  8., 18.,  0.,  5., 11., 16., 10.],
   [ 8., 11., 13., 11., 10.,  5.,  3., 12.,  5.,  0.,  8.,  9.,  5.],
   [ 6., 16., 13.,  7.,  7.,  8.,  4., 14., 11.,  8.,  0., 11.,  7.],
   [ 5., 15., 13.,  9.,  7., 10.,  7., 13., 16.,  9., 11.,  0.,  4.],
   [ 9.,  9., 15., 13., 14.,  9.,  4., 11., 10.,  5.,  7.,  4.,  0.]])

Above is numpy.ndarray of the shape (13,13) with errors obtaining in a certain classification task using two of the 13 features.

The task here is to find smallest achievable error AND the pair of features which achieves this minimum error

Because the data are few the smallest error can be seen by eyes its 3 and the pair of features is either (6,9) or (9,6).

(The diagnal line with 0 values is feature with itself so it was not included).

I have tried to do it with argsort but it only sort each row separately and I did not arrive at the answer.

Please assist.

Upvotes: 0

Views: 50

Answers (2)

tschermak
tschermak

Reputation: 45

If I understand your task correctly:

  • find the minimum value in your errors-array
  • find the indices of this minimal error

I achieved that using min() and nonzero(). Here's my solution to your problem:

errors = np.array((
   [ 0.,  9., 12.,  9., 14.,  5.,  4., 10.,  8.,  8.,  6.,  5.,  9.],
   [ 9.,  0., 22., 16., 11., 12.,  9., 21., 14., 11., 16., 15.,  9.],
   [12., 22.,  0., 18., 23., 16., 10., 22., 21., 13., 13., 13., 15.],
   [ 9., 16., 18.,  0., 11., 12.,  8., 19., 20., 11.,  7.,  9., 13.],
   [14., 11., 23., 11.,  0., 11.,  7., 18.,  9., 10.,  7.,  7., 14.],
   [ 5., 12., 16., 12., 11.,  0.,  7., 13., 15.,  5.,  8., 10.,  9.],
   [ 4.,  9., 10.,  8.,  7.,  7.,  0.,  8.,  8.,  3.,  4.,  7.,  4.],
   [10., 21., 22., 19., 18., 13.,  8.,  0., 18., 12., 14., 13., 11.],
   [ 8., 14., 21., 20.,  9., 15.,  8., 18.,  0.,  5., 11., 16., 10.],
   [ 8., 11., 13., 11., 10.,  5.,  3., 12.,  5.,  0.,  8.,  9.,  5.],
   [ 6., 16., 13.,  7.,  7.,  8.,  4., 14., 11.,  8.,  0., 11.,  7.],
   [ 5., 15., 13.,  9.,  7., 10.,  7., 13., 16.,  9., 11.,  0.,  4.],
   [ 9.,  9., 15., 13., 14.,  9.,  4., 11., 10.,  5.,  7.,  4.,  0.]))

min_error = errors[errors!=0].min()
pairs = np.nonzero(errors == errors[errors!=0].min())

This gives me the expected outputs min_error = 3 and pairs = (array([6, 9]), array([9, 6])).

Upvotes: 1

aparpara
aparpara

Reputation: 2201

Strictly speaking, you need to mask out not zero, but diagonal values. Who knows, may be some pair will give a perfect match. So I'd do it like this:

# This modifies errors filling the diagonal with inf-s. Make a copy if you need to keep the original result intact.
np.fill_diagonal(errors, np.inf)
np.nonzero(errors == errors.min())

Upvotes: 1

Related Questions