Reputation: 641
Given a list
A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
is there an easy way to group all those clusters where the difference between successive elements is smaller than 3?
That is, obtain something like:
[13, 15, 17, 19, 21], [58, 59, 61, 63], [78, 79, 81]
I was wondering whether there exists any built-in function, but I couldn't find anything similar. I was trying to figure it out using the groupby
of the itertools
, but I'm stuck. Thank you in advance.
Upvotes: 6
Views: 1346
Reputation: 71461
You can use itertools.groupby
:
import itertools
A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
new_a = [(A[i+1]-A[i], A[i]) for i in range(len(A)-1)]
a = [[a, [c for _, c in b]] for a, b in itertools.groupby(new_a, key=lambda x:x[0] < 3)]
final_groups = [a[i][-1]+[a[i+1][-1][0]] if a[i+1][-1][0] - a[i][-1][-1] < 3 else a[i][-1] for i in range(len(a)-1) if a[i][0]]
Output:
[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]
Upvotes: 2
Reputation: 40941
Based on your comment
The important thing is to "store elements until" the condition is satisfied
You could use itertools.takewhile
for this.
takewhile(predicate, iterable) --> takewhile object
Return successive entries from an iterable as long as the predicate evaluates to true for each entry.
This solution certainly has room for improvement, but the takeaway is the use of takewhile
class Grouper:
"""simple class to perform comparison when called, storing last element given"""
def __init__(self, diff):
self.last = None
self.diff = diff
def predicate(self, item):
if self.last is None:
return True
return abs(self.last - item) < self.diff
def __call__(self, item):
"""called with each item by takewhile"""
result = self.predicate(item)
self.last = item
return result
def group_by_difference(items, diff=3):
results = []
start = 0
remaining_items = items
while remaining_items:
g = Grouper(diff)
group = [*itertools.takewhile(g, remaining_items)]
results.append(group)
start += len(group)
remaining_items = items[start:]
return results
This gives you grouped items with singleton clusters.
[[1],
[6],
[13, 15, 17, 18, 19, 21],
[29],
[36],
[53],
[58, 59, 61, 63],
[78, 79, 81],
[102],
[114]]
Upvotes: 1
Reputation: 7186
Similarly to this answer, which asked for runs of the same number, one can use numpy.split
here:
import numpy as np
def plateaus(A, atol=3):
runs = np.split(A, np.where(np.abs(np.diff(A)) >= atol)[0] + 1)
return [list(x) for x in runs if len(x) > 1]
A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
print(plateaus(A))
[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]
Without filtering on the length, this gives you singleton clusters just like the itertools.takehwile
approach by @sytech.
Upvotes: 1
Reputation: 82765
This is one approach using an iteration.
Ex:
A = [1, 6, 13, 15, 17, 18, 19, 21, 29, 36, 53, 58, 59, 61, 63, 78, 79, 81, 102, 114]
res = []
temp = []
l = len(A)-1
for i,v in enumerate(A):
if i+1 > l:
break
if abs(v - A[i+1]) < 3:
temp.append(v)
else:
if temp:
temp.append(v)
res.append(temp)
temp = []
print(res)
Output:
[[13, 15, 17, 18, 19, 21], [58, 59, 61, 63], [78, 79, 81]]
Upvotes: 1