Hammad
Hammad

Reputation: 617

Find elements in list with difference of 1

Suppose I have a list like this:

lst = [1, 3, 4, 5, 8, 10, 14, 20, 21, 22, 23, 40, 47, 48] 

I need to extract the elements which have a difference of one. I need final output to look like this:

[[3, 4, 5], [20, 21, 22, 23], [47, 48]] 

Here is what I have tried so far for this particular problem which has yielded some progress but doesn't achieve 100% of what I need:

final_list = []
for i in range(len(lst)):
    sub_list = []
    for ii in range(i, len(lst)):
        prev_num = lst[ii-1]
        if lst[ii] - prev_num == 1:
#             print(lst[ii], end=",")
            sub_array.append(lst[ii])
        else:
            break
    if sub_list:
        final_list.append(sub_list)
print(final_list)

Output:

[[4, 5], [5], [21, 22, 23], [22, 23], [23], [48]]

Upvotes: 3

Views: 836

Answers (5)

blhsing
blhsing

Reputation: 106598

You can use itertools.groupby to group the items with a key function that subtracts each item value with an incremental counter, which would result in a fixed value for each group of consecutive integers:

from itertools import groupby, count

lst = [1, 3, 4, 5, 8, 10, 14, 20, 21, 22, 23, 40, 47, 48]
c = count()
final_list = [g for _, [*g] in groupby(lst, lambda t: t - next(c)) if len(g) > 1]

final_list would become:

[[3, 4, 5], [20, 21, 22, 23], [47, 48]]

EDIT: If you'd rather have better performance than more concise code, you can look at @MadPhysicist's answer, which avoids overhead incurred from calling generator functions, and @don'ttalkjustcode's answer, which avoids creating lists until a pair of consecutive numbers are found. Combine the two answers and you get a solution that avoids both types of overhead:

out = []
prev = float('inf')
for i in lst:
    if i - prev != 1:
        current = None
    elif current:
        current.append(i)
    else:
        current = [prev, i]
        out.append(current)
    prev = i

Sample timing using @don'ttalkjustcode's benchmark code:

2716 μs  Mad_Physicist
1554 μs  dont_talk_just_code
1284 μs  Mad_Physicist_x_dont_talk_just_code

Try it online!

Upvotes: 6

Mad Physicist
Mad Physicist

Reputation: 114320

You can step along the input list and start adding elements to an output list. Any time the difference is greater than 1, start a new output list. If the prior output is length 1, remove it:

out = []
current = []
prev = None
for i in lst:
    if prev is None or i - prev != 1:
        if len(current) > 1:
            out.append(current)
        current = []
    current.append(i)
    prev = i
if len(current) > 1:
    out.append(current)

Upvotes: 2

Alain T.
Alain T.

Reputation: 42143

You can go through the elements and either add them as a new group or add them to the last group depending on whether it is one more than the last one added. When done, remove the single element groups.

lst = [1, 3, 4, 5, 8, 10, 14, 20, 21, 22, 23, 40, 47, 48]

groups = []
for n in lst:
    if groups and n-1 == groups[-1][-1]:
        groups[-1].append(n)              # +1 element, add to last group
    else:
        groups.append([n])                # add as a new group
groups = [g for g in groups if len(g)>1]  # remove single element groups

print(groups)
[[3, 4, 5], [20, 21, 22, 23], [47, 48]]

Upvotes: 1

no comment
no comment

Reputation: 10230

A zip solution that adds each first pair of a streak as an inner list and then appends further elements of the streak:

out = []
inner = None
for x, y in zip(lst, lst[1:]):
    if y - x == 1:
        if inner:
            inner.append(y) 
        else:
            inner = [x, y]
            out.append(inner)
    else:
        inner = None

Benchmark with a random list 1000 as long as your example and with same density:

5172 μs  blhsing
2697 μs  Mad_Physicist
7667 μs  Osman_Mamun
1571 μs  dont_talk_just_code

Benchmark code (Try it online!):

from timeit import timeit
from itertools import groupby, count
from random import sample

def blhsing(lst):
    c = count()
    return [g for _, [*g] in groupby(lst, lambda t: t - next(c)) if len(g) > 1]

def Mad_Physicist(lst):
    out = []
    current = []
    prev = None
    for i in lst:
        if prev is None or i - prev > 1:
            if len(current) > 1:
                out.append(current)
            current = []
        current.append(i)
        prev = i
    if len(current) > 1:
        out.append(current)
    return out

def Osman_Mamun(lst):
    out = []
    for k, g in groupby(enumerate(lst), key=lambda i: i[1]-i[0]):
        temp = list(g)
        if len(temp) > 1:
            out.append([i[1] for i in temp])
    return out

def dont_talk_just_code(lst):
    out = []
    inner = None
    for x, y in zip(lst, lst[1:]):
        if y - x == 1:
            if inner:
                inner.append(y) 
            else:
                inner = [x, y]
                out.append(inner)
        else:
            inner = None
    return out


lst_example = [1, 3, 4, 5, 8, 10, 14, 20, 21, 22, 23, 40, 47, 48] 
funcs = blhsing, Mad_Physicist, Osman_Mamun, dont_talk_just_code

for _ in range(3):
    lst = sorted(sample(range(max(lst_example) * 1000), len(lst_example) * 1000))
    # lst = lst_example
    results = []
    for func in funcs:
        t = timeit(lambda: results.append(func(lst)), number=100) / 100
        print('%d μs ' % (t * 1e6), func.__name__)
    assert all(result == results[0] for result in results), (list(map(len, results)), results)
    print()

Upvotes: 2

Osman Mamun
Osman Mamun

Reputation: 2882

Using itertools.groupby:

In [19]: out = []

In [20]: for k, g in groupby(enumerate(lst), key=lambda i: i[1]-i[0]):
    ...:     temp = list(g)
    ...:     if len(temp) > 1:
    ...:         out.append([i[1] for i in temp])
    ...:

In [21]: out
Out[21]: [[3, 4, 5], [20, 21, 22, 23], [47, 48]]

Upvotes: 1

Related Questions