Reputation: 303
I have a NumPy array that looks like this:
array([ 1219, 1220, 2215, 2216, 3459, 3460, 4686, 4687, 5920,
5921, 7200, 7201, 8498, 8499, 9834, 9835, 10046, 11138,
11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])
I want to delete elements that are very close to each other. For example, I don't want both 2215 and 2216; I want to keep only the first one, 2215. Likewise, for 4686 and 4687 I want to keep only 4686. How can I do this using only NumPy commands?
Upvotes: 0
Views: 1189
Reputation: 765
One solution I came up with is to compute the differences between consecutive elements and remove every element whose difference from the preceding element is small (at most a threshold). This takes advantage of the fact that your array is sorted. The following code works for me.
import numpy as np
arr = np.array([ 1219, 1220, 2215, 2216, 3459, 3460, 4686, 4687, 5920,
5921, 7200, 7201, 8498, 8499, 9834, 9835, 10046, 11138,
11139, 12520, 12521, 12522, 13812, 13813, 14033, 15099, 15100,
16375, 16376, 17576, 17577, 18634, 18635, 19849, 19850])
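threshold = 1  # gaps no larger than this count as "very close"
# diff[i] holds arr[i] - arr[i-1], the gap to the preceding element
diff = np.empty(arr.shape)
diff[0] = np.inf # always retain the 1st element
diff[1:] = np.diff(arr)
mask = diff > threshold  # True where the element is far enough from its predecessor
new_arr = arr[mask]
print(new_arr)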
You can adjust the threshold value to play with different levels of tolerance.
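To see how the threshold changes the result, here is a minimal sketch that wraps the same idea in a function. The name `drop_close` and the shortened `sample` array are only for illustration, and the mask is built with `np.concatenate` instead of the `np.inf` trick above. Like the code above, it assumes the input is sorted in ascending order.

```python
import numpy as np

def drop_close(arr, threshold=1):
    """Keep each element whose gap to the previous element exceeds threshold.

    Hypothetical helper for illustration; assumes arr is sorted ascending.
    """
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr  # nothing to filter
    # The first element is always kept; the rest are kept only if the gap
    # to their immediate predecessor is larger than the threshold.
    keep = np.concatenate(([True], np.diff(arr) > threshold))
    return arr[keep]

# A slice of the array from the question
sample = np.array([9834, 9835, 10046, 11138, 11139, 12520, 12521, 12522])

print(drop_close(sample, threshold=1))    # 10046 is kept: it is 211 away from 9835
print(drop_close(sample, threshold=250))  # 10046 is dropped as well
```

Note that each element is compared with its immediate predecessor, not with the last element that was kept. A run such as 12520, 12521, 12522 therefore collapses to 12520, and a long run of values that each lie within the threshold of their neighbour would collapse to its first element even if the run as a whole spans more than the threshold.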
Upvotes: 1