Landon G

Reputation: 839

How to group elements of a list that are within n of each other

I have two lists:

list_1 = []
list_2 = [1.0, 3.0, 3.15, 1.03, 6.0, 7.0]

I want to go through list_2 and group elements that are within (in this case) 0.15 of each other.

By the end of this, list_1 will contain the following values:

[[1.0, 1.03],[3.0, 3.15]]

This is because 1.0 and 1.03 are within 0.15 of each other, and 3.0 and 3.15 are as well.

Groups can also be larger than pairs. For instance, if I also had 3.16, which is within range of 3.15, it would be added to that group, i.e.:

list_2 = [1.0, 3.0, 3.15, 1.03, 6.0, 7.0, 3.16]

outputs:

[[1.0,1.03],[3.0,3.15,3.16]]

How can I do this?

Thanks for the help!

Upvotes: 1

Views: 110

Answers (2)

Alain T.

Reputation: 42133

To split the list on gaps of a given size, you can use zip to compare elements with their neighbours, giving the position of breaks in the sequence. Then zip again to turn these break positions into ranges of the original data.

data = [1, 2, 5, 6, 7, 9, 22, 24, 26, 29] 
gap  = 2

breaks  = [ i for i,(a,b) in enumerate(zip(data,data[1:]),1) if abs(a-b) > gap ]
result  = [ data[s:e] for s,e in zip([0]+breaks,breaks+[len(data)]) ]

print(result)
[[1, 2], [5, 6, 7, 9], [22, 24, 26], [29]]

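Note that this compares neighbouring elements only, so it also runs on unsorted lists, but it then splits on gaps between adjacent positions. To group by value, as in the question, sort the list first.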

The technique can be generalized in a function that will split any list on any given condition:

def splitList(A,condition):
    breaks = [i for i,(a,b) in enumerate(zip(A,A[1:]),1) if condition(a,b)]
    return [A[s:e] for s,e in zip([0]+breaks,breaks+[len(A)])]


data = [1, 2, 5, 6, 7, 9, 22, 24, 26, 29] 
gap=2

result = splitList(data,lambda a,b: abs(a-b)>gap)
print(result)
[[1, 2], [5, 6, 7, 9], [22, 24, 26], [29]]

data   = [1, 2, 5, 6, 4, 2, 10, 15, 14, 7, 9, 12]
ascend = splitList(data,lambda a,b: a>b) # split ascending streaks
print(ascend)
[[1, 2, 5, 6], [4], [2, 10, 15], [14], [7, 9, 12]]
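
For the lists in the question, the same idea might be applied roughly as follows. This is only a sketch: it assumes values should be grouped by value rather than by position (so it sorts first), and that groups with a single element should be dropped, as in the question's expected output. The names `split_list`, `group_close` and `min_size` are hypothetical and not part of the answer above.

```python
def split_list(items, condition):
    """Split items wherever condition(previous, current) is true."""
    breaks = [i for i, (a, b) in enumerate(zip(items, items[1:]), 1)
              if condition(a, b)]
    return [items[s:e] for s, e in zip([0] + breaks, breaks + [len(items)])]


def group_close(values, delta, min_size=2):
    """Hypothetical helper: group values whose sorted neighbours differ by at most delta."""
    ordered = sorted(values)  # assumption: grouping is by value, not by position
    groups = split_list(ordered, lambda a, b: b - a > delta)
    # Keep only groups with at least min_size members, as in the question's output.
    return [g for g in groups if len(g) >= min_size]


list_2 = [1.0, 3.0, 3.15, 1.03, 6.0, 7.0, 3.16]
list_1 = group_close(list_2, 0.15)
print(list_1)  # [[1.0, 1.03], [3.0, 3.15, 3.16]]
```

Sorting a copy with `sorted` leaves the original list_2 untouched, which may or may not matter for your use case.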

Upvotes: 0

wim

Reputation: 362857

networkx is overkill here. Just sort the list first, then iterate and yield off a chunk when the difference between previous and current is larger than your delta.

>>> list_2 = [1.0, 3.0, 3.15, 1.03, 6.0, 7.0, 3.16]
>>> list_2.sort()
>>> delta = 0.15
>>> list_1 = []
>>> prev = -float('inf')
>>> for x in list_2:
...     if x - prev > delta:
...         list_1.append([x])
...     else:
...         list_1[-1].append(x)
...     prev = x
...
>>> list_1
[[1.0, 1.03], [3.0, 3.15, 3.16], [6.0], [7.0]]
>>> [x for x in list_1 if len(x) > 1]
[[1.0, 1.03], [3.0, 3.15, 3.16]]
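
If you need this in more than one place, the same loop could be wrapped in a generator. The sketch below is one possible way to do that; the function name `chunk_by_delta` is made up for illustration, and it sorts a copy of the input instead of sorting in place.

```python
def chunk_by_delta(values, delta):
    """Yield lists of sorted values where each value is within delta of the previous one."""
    chunk = []
    for x in sorted(values):
        # Start a new chunk when the gap to the previous value exceeds delta.
        if chunk and x - chunk[-1] > delta:
            yield chunk
            chunk = []
        chunk.append(x)
    if chunk:
        yield chunk  # emit the final chunk, if any


list_2 = [1.0, 3.0, 3.15, 1.03, 6.0, 7.0, 3.16]
groups = [c for c in chunk_by_delta(list_2, 0.15) if len(c) > 1]
print(groups)  # [[1.0, 1.03], [3.0, 3.15, 3.16]]
```

Note that groups are chained: 3.0 and 3.16 differ by more than 0.15, but they land in the same group because each value is within 0.15 of its sorted neighbour. That matches the output asked for in the question.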

Upvotes: 2
