CyberMathIdiot

Reputation: 303

How to delete very close values from a numpy array?

I have a numpy array that looks like this:

array([ 1219,  1220,  2215,  2216,  3459,  3460,  4686,  4687,  5920,
        5921,  7200,  7201,  8498,  8499,  9834,  9835, 10046, 11138,
       11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
       16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])

I want to delete the elements that are very close to each other. For example, I don't want both 2215 and 2216; I want to keep only the first one, 2215. Likewise, of 4686 and 4687, I want to keep only 4686. How can I do this using only numpy functions?

Upvotes: 0

Views: 1189

Answers (1)

Yang Yushi

Reputation: 765

One solution I came up with is to compute the differences between consecutive elements and remove every element whose difference from its predecessor is no larger than a threshold. This relies on your array being sorted. The following code works for me.

import numpy as np

arr = np.array([ 1219,  1220,  2215,  2216,  3459,  3460,  4686,  4687,  5920,
    5921,  7200,  7201,  8498,  8499,  9834,  9835, 10046, 11138,
    11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
    16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])

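threshold = 1  # elements within this distance of their predecessor are dropped
diff = np.empty(arr.shape)
diff[0] = np.inf  # always retain the 1st element
diff[1:] = np.diff(arr)  # diff[i] = arr[i] - arr[i-1]
mask = diff > threshold  # True where the gap to the previous element is large enough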

new_arr = arr[mask]

print(new_arr)

You can adjust the threshold value to play with different levels of tolerance.
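
For reuse, the same idea can be wrapped in a small function. This is only a sketch under the same assumption that the input is sorted in ascending order. The name drop_close is hypothetical and not part of numpy, and building the mask with np.concatenate is a variation on the code above rather than a different method.

```python
import numpy as np

def drop_close(values, threshold=1):
    """Keep the first element of each run of close values (hypothetical helper).

    Assumes `values` is sorted in ascending order.
    """
    values = np.asarray(values)
    if values.size == 0:
        return values
    # The first element is always kept. Every other element is kept only if
    # it lies more than `threshold` above its immediate predecessor.
    keep = np.concatenate(([True], np.diff(values) > threshold))
    return values[keep]

arr = np.array([1219, 1220, 2215, 2216, 12520, 12521, 12522, 14033])
print(drop_close(arr).tolist())  # [1219, 2215, 12520, 14033]
```

Note that each element is compared with its immediate predecessor, not with the last element that was kept, so a slowly increasing run such as 12520, 12521, 12522 collapses to its first value.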

Upvotes: 1
