Jette

Reputation: 51

Split a list into sublists based on the difference between consecutive values

I have a list of values in which each value has at least one (and often more) consecutive neighbor that differs from it by .033:

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

I would like to split this list into sublists so that consecutive items that differ by .033 are grouped together, and a new sublist starts whenever the difference is larger:

l = [ [26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543] ] 

Upvotes: 4

Views: 1994

Answers (4)

Eugene Yarmash

Reputation: 149893

If you're a fan of itertools, you could use itertools.groupby() for this:

from itertools import groupby

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

def keyfunc(x):
    return (x[0] > 0 and round(l[x[0]] - l[x[0]-1], 3) == 0.033 or
            x[0] < len(l) - 1 and round(l[x[0]+1] - l[x[0]], 3) == 0.033)

print([[x[1] for x in g] for k, g in groupby(enumerate(l), key=keyfunc)])

Output:

[[26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543]]

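As for the logic: the key function returns True for numbers that have a neighbor 0.033 away (after rounding to three decimals) and False for those that don't. groupby() then groups consecutive items that share the same key. Note that two separate runs of 0.033-spaced values sitting next to each other in the list would both get the key True and end up in one group.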
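If runs that sit directly next to each other need to stay separate, one option is to key groupby() on a running run number instead of a True/False flag. The following is only a sketch under assumptions: the name split_by_gap and the max_gap parameter are made up for illustration, and the default of 0.04 is chosen simply because it sits above the 0.037 step in the sample list.

```python
from itertools import accumulate, groupby

def split_by_gap(values, max_gap=0.04):
    """Split values into runs; a new run starts when a step exceeds max_gap.

    split_by_gap and max_gap are illustrative names, not part of itertools.
    """
    if not values:
        return []
    # 1 marks the start of a new run, 0 a continuation of the current one.
    breaks = [0] + [int(abs(b - a) > max_gap) for a, b in zip(values, values[1:])]
    # The running sum of the break flags gives every item its run number.
    run_ids = accumulate(breaks)
    pairs = zip(run_ids, values)
    return [[v for _, v in grp] for _, grp in groupby(pairs, key=lambda p: p[0])]

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
print(split_by_gap(l))
print(split_by_gap([1.0, 1.033, 5.0, 5.033]))  # two adjacent runs stay separate
```

Because the key changes at every large step, each run gets its own group regardless of what its neighbors look like.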

Upvotes: 1

Hai Vu

Reputation: 40733

My approach involves running through pairs of consecutive numbers and examining the gaps between them, just like everybody else's. The difference here is the use of iter() to create two iterators over one list.

# Given:
l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
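# Slightly larger than the nominal 0.033 step, so that floating-point error
# and the 0.037 step between 31.146 and 31.183 do not start a new sublist
gap = 0.04

# Make two iterators (think: virtual lists) from one list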
previous_sequence, current_sequence = iter(l), iter(l)

# Initialize the groups while advancing current_sequence by 1
# element at the same time
groups = [[next(current_sequence)]]

# Iterate through pairs of numbers
for previous, current in zip(previous_sequence, current_sequence):
    if abs(previous - current) > gap:
        # Large gap, we create a new empty sublist
        groups.append([])

    # Keep appending to the last sublist
    groups[-1].append(current)

print(groups)

A few notes

  • My solution looks long, but if you subtract all the comments, blank lines, and the last print statement, it is only 6 lines
  • It is efficient because I did not actually duplicate the list
  • An empty list (empty l) will raise a StopIteration exception, so please ensure the list is not empty
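The empty-list case from the last note can be handled by wrapping the loop in a function and catching StopIteration. This is a minimal sketch, not a definitive version: the function name group_by_gap is made up here, and the default gap of 0.04 is an assumption chosen to exceed the 0.037 step in the sample list.

```python
def group_by_gap(values, gap=0.04):
    """Group consecutive values whose step is at most gap (hypothetical helper)."""
    # Two independent iterators over the same list; no copy is made.
    previous_sequence, current_sequence = iter(values), iter(values)
    try:
        # Advance current_sequence by one so the pairs are (values[i-1], values[i]).
        groups = [[next(current_sequence)]]
    except StopIteration:
        # Empty input: there is nothing to group.
        return []

    for previous, current in zip(previous_sequence, current_sequence):
        if abs(previous - current) > gap:
            # Large gap: start a new sublist.
            groups.append([])
        groups[-1].append(current)
    return groups

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
print(group_by_gap(l))
print(group_by_gap([]))
```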

Upvotes: 1

rnso

Reputation: 24565

One can use a temporary list and a while loop to get the desired result (note that l.pop(0) consumes the original list, so l is empty afterwards):

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]
outlist = []
templist = [l.pop(0)]
while len(l)>0:
    x = l.pop(0)
    if x - templist[-1] > 0.04:
        outlist.append(templist)
        templist = [x]
    else: 
        templist.append(x)
outlist.append(templist)
print(outlist)

Output:

[[26.051, 26.084, 26.117, 26.15, 26.183], [31.146, 31.183], [34.477, 34.51, 34.543]]
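If the original list has to survive, the same loop can run on a copy instead. The sketch below swaps in collections.deque, whose popleft() removes the first item in constant time, whereas list.pop(0) has to shift every remaining element. The name queue is just an illustrative choice.

```python
from collections import deque

l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

# Work on a deque copy so that l itself is left untouched.
queue = deque(l)
outlist = []
templist = [queue.popleft()]
while queue:
    x = queue.popleft()
    if x - templist[-1] > 0.04:
        # Step is too large: close the current sublist and start a new one.
        outlist.append(templist)
        templist = [x]
    else:
        templist.append(x)
outlist.append(templist)

print(outlist)
print(len(l))  # the original list still holds all of its items
```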

Upvotes: 2

tobias_k

Reputation: 82929

Keep track of the last element you saw and either append the current item to the last sublist, or create a new sublist if the difference is greater than your allowed delta.

res, last = [[]], None
for x in l:
    if last is None or abs(last - x) <= 0.033:
        res[-1].append(x)
    else:
        res.append([x])
    last = x

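Note, however, that a threshold of 0.033 will not in fact return the result that you want: one of the differences is considerably larger (0.037, between 31.146 and 31.183), and others can come out slightly larger than 0.033 due to floating-point rounding. Instead, you might want to use a slightly more generous value. Using 0.035, for example, gives you [[26.051, 26.084, 26.117, 26.15, 26.183], [31.146], [31.183], [34.477, 34.51, 34.543]]. To keep 31.146 and 31.183 together, as in your expected output, the threshold has to cover the 0.037 step as well, e.g. 0.04.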
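To see the steps involved and make the comparison less sensitive to rounding, the loop can take a small tolerance on top of the nominal step. This is a sketch only: split_with_tolerance and its tol parameter are hypothetical names, and 1e-9 is an assumed tolerance that is far larger than the floating-point error on values of this size but far smaller than any real step in the data.

```python
l = [26.051, 26.084, 26.117, 26.15, 26.183, 31.146, 31.183, 34.477, 34.51, 34.543]

# Steps between consecutive items, rounded only for display.
steps = [round(b - a, 3) for a, b in zip(l, l[1:])]
print(steps)

def split_with_tolerance(values, step=0.033, tol=1e-9):
    """Same loop as above, but a step counts as allowed up to step + tol."""
    res, last = [[]], None
    for x in values:
        if last is None or abs(last - x) <= step + tol:
            res[-1].append(x)
        else:
            res.append([x])
        last = x
    return res

print(split_with_tolerance(l, step=0.033))  # 31.146 and 31.183 end up apart
print(split_with_tolerance(l, step=0.037))  # 31.146 and 31.183 stay together
```

With the tolerance in place, the nominal step can be passed as written, and only genuinely larger steps such as 0.037 start a new sublist.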

Upvotes: 5
