matman9

Reputation: 490

Removing close values from array

I have an array of values

array = [100, 101, 102, 102.001, 103.2, 104.64, 106.368, 108.442]

Values 102 and 102.001 should be treated as the same. I'd like to find the best way to remove the value 102.001 and not 102.

So far I have a cumbersome method to do this, but it would remove 102 instead if the array were reversed:

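import numpy as np

array = [100, 101, 102, 102.001, 103.2, 104.64, 106.368, 108.442]
# differences between consecutive elements
array_diff = np.ediff1d(array)
# indices where the gap to the next element is below the tolerance
ai = np.array(np.where(array_diff < 0.01))
# delete the second element of the first close pair only
array_out = np.delete(array, [ai[0][0] + 1])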

Is there a way to merge/remove values given a tolerance?

Thanks in advance.

Upvotes: 1

Views: 885

Answers (2)

slavny_coder

Reputation: 180

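A solution for simpler cases: sort the values and keep a value only if it is more than the tolerance above the last value kept.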

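def drop_close_values(values, tolerance):
    ret = []
    last_added = None
    # walk the values in ascending order
    for value in sorted(values):
        # keep a value only if it is more than `tolerance` above the last kept one
        if last_added is None or last_added + tolerance < value:
            ret.append(value)
            last_added = value
    return ret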

x = [0, 1, 2, 3, 5, 10, 20, 45]
drop_close_values(x, 4)

Returns: [0, 5, 10, 20, 45]
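
Since the question starts from NumPy, the same idea can be sketched in vectorized form. This is only a rough variant, not an exact equivalent: the name drop_close_values_np is made up for illustration, and it compares each value with its immediate sorted neighbour rather than with the last value kept, so a chain of closely spaced values collapses to its first element.

```python
import numpy as np

def drop_close_values_np(values, tolerance):
    # hypothetical helper name, not part of NumPy
    arr = np.sort(np.asarray(values, dtype=float))
    if arr.size == 0:
        return arr
    # always keep the smallest value, then keep every value whose gap
    # to its sorted predecessor exceeds the tolerance
    keep = np.concatenate(([True], np.diff(arr) > tolerance))
    return arr[keep]

array = [100, 101, 102, 102.001, 103.2, 104.64, 106.368, 108.442]
print(drop_close_values_np(array, 0.01).tolist())
# [100.0, 101.0, 102.0, 103.2, 104.64, 106.368, 108.442]
```

Because the input is sorted first, reversing the array gives the same result, and 102 is kept rather than 102.001. The chaining difference shows up on the example above: for x with a tolerance of 4 this variant returns [0, 10, 20, 45], because 5 is within 4 of its neighbour 3 even though 3 itself was dropped.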

Upvotes: 0

Sam Mason

Reputation: 16194

A vanilla Python solution:

from itertools import groupby

def nearby_groups(arr, tol_digits=2):
  # split up sorted input array into groups if they're similar enough
  for (_, grp) in groupby(arr, lambda x: round(x, tol_digits)):
    # yield value from group that is closest to an integer
    yield sorted(grp, key=lambda x: abs(round(x) - x))[0]


array = [100, 101, 101.999, 102.001, 102, 103.2, 104.64, 106.368, 108.442]

print(list(nearby_groups(array)))

gives:

[100, 101, 102, 103.2, 104.64, 106.368, 108.442]

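This solution assumes the input is pre-sorted, or at least that close values sit next to each other, because groupby only groups adjacent elements.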
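
One caveat with grouping by rounded value is that two close numbers can land on opposite sides of a rounding boundary: with tol_digits=2, 102.004 rounds to 102.0 and 102.006 rounds to 102.01, so they end up in different groups. A possible alternative, sketched below under the assumption that an absolute tolerance is what is wanted, groups sorted values by the gap to the previous value and then picks the member closest to an integer, as above. The name cluster_representatives and the default tol=0.01 are made up for illustration.

```python
def cluster_representatives(values, tol=0.01):
    # hypothetical helper: group sorted values whose consecutive gaps are <= tol
    groups = []
    for x in sorted(values):
        if groups and x - groups[-1][-1] <= tol:
            # close enough to the previous value: same group
            groups[-1].append(x)
        else:
            # gap too large (or first value): start a new group
            groups.append([x])
    # from each group, pick the value closest to an integer
    return [min(g, key=lambda v: abs(round(v) - v)) for g in groups]

array = [100, 101, 101.999, 102.001, 102, 103.2, 104.64, 106.368, 108.442]
print(cluster_representatives(array))
# [100, 101, 102, 103.2, 104.64, 106.368, 108.442]
```

This version sorts its input, so it does not rely on the caller passing sorted data. Like any gap-based grouping, a long run of values each within tol of the next is merged into a single group.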

Upvotes: 1
