Arman Mojaver

Reputation: 451

list comprehensions with break

I have the following code that I would like to write in one line with a list comprehension.

list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]

list3 = []
for i in list1:
    for j in list2:
        if j>i:
            # print(i,j)
            list3.append(j)
            break
print(list1)
print(list3)

The output is:

[4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
[5, 7, 7, 11, 11, 17, 24, 24, 26, 56]

It's the break statement that throws me off; I don't know where to put it.

Thank you

Upvotes: 1

Views: 212

Answers (4)

Alain T.

Reputation: 42133

You can't really break out of a list comprehension's internal for loop. What you can do is avoid needing a break at all by using the next function to find the first matching value:

list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]
list3 = [ next(j for j in list2 if j>i) for i in list1 ]

output:

print(list1)
print(list3)
[4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
[5, 7, 7, 11, 11, 17, 24, 24, 26, 56]

If you are concerned about performance (since the list comprehension will be slower than the loops), you could use a binary search (the bisect module) in list2 to find the next higher value:

from bisect import bisect_left
list3 = [ list2[bisect_left(list2,i+1)] for i in list1 ]

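This assumes that list2 is sorted in ascending order and that max(list2) > max(list1); otherwise the index returned by bisect_left can fall past the end of list2 and raise an IndexError. Searching for i+1 also assumes the values are integers.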
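
If those assumptions might not hold, here is a minimal sketch of how both variants could be made tolerant of a missing match. The helper name first_greater and the extra value 60 in list1 are made up for illustration; returning None for "no match" is just one possible choice.

```python
from bisect import bisect_right

# 60 is added here only to show the "no larger value exists" case
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27, 60]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]

# next() accepts a default, so an exhausted generator yields None
# instead of raising StopIteration
list3 = [next((j for j in list2 if j > i), None) for i in list1]


def first_greater(sorted_values, x):
    """Hypothetical helper: first element strictly greater than x, or None.

    Assumes sorted_values is sorted in ascending order.
    """
    # bisect_right returns the index just past any elements equal to x,
    # so it works for non-integer values too (no i+1 trick needed)
    pos = bisect_right(sorted_values, x)
    return sorted_values[pos] if pos < len(sorted_values) else None


list3_bisect = [first_greater(list2, i) for i in list1]

print(list3)
print(list3_bisect)
```

Both lists should come out the same, with None in the last position because nothing in list2 is larger than 60.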

Upvotes: 1

water_ghosts

Reputation: 736

You could move the break logic into a separate function, then put that function inside a list comprehension:

def smallest_value_larger_than_i(candidate_values, i):
    for value in candidate_values:
        if value > i:
            return value
    return None  # Not sure how you want to handle this case

list3 = [smallest_value_larger_than_i(list2, i) for i in list1]

This runs slightly slower than your original solution, but if the goal of using a list comprehension is speed, you'll get much better results by improving the algorithm instead. For example, if both lists are sorted, then you can discard elements from list2 as soon as you skip over them once, instead of checking them against the rest of list1. You could also do a binary search of list2 instead of scanning through it linearly.
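
A rough sketch of the "discard elements as you go" idea, assuming both lists are sorted in ascending order. The function name next_larger_values is made up, and returning None when nothing larger exists is an arbitrary choice.

```python
def next_larger_values(sorted_queries, sorted_candidates):
    """Hypothetical helper: for each query, the first candidate strictly
    greater than it, or None if there is none.

    Assumes both inputs are sorted in ascending order.
    """
    result = []
    pos = 0  # index of the first candidate that has not been discarded
    for q in sorted_queries:
        # Candidates <= q can never match this query or any later (larger)
        # query, so skip them once and never look at them again.
        while pos < len(sorted_candidates) and sorted_candidates[pos] <= q:
            pos += 1
        result.append(sorted_candidates[pos] if pos < len(sorted_candidates) else None)
    return result


list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]
list3 = next_larger_values(list1, list2)
print(list3)
```

Each element of list2 is stepped over at most once, so the work grows with len(list1) + len(list2) rather than with their product.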

Upvotes: 0

Arman Mojaver

Reputation: 451

I have tried timing the answer posted by AbbeGijly.

It turns out that it is slower than the original solution:

import timeit

print(timeit.timeit('''
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 40, 56]

list3 = []
for i in list1:
    for j in list2:
        if j>i:
            # print(i,j)
            list3.append(j)
            break
'''))

print(timeit.timeit('''
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 40, 56]
list4 = [[j for j in list2 if j > i] for i in list1]
'''))

The output is:

3.6144596
8.731578200000001
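
Note that the second snippet above times the intermediate nested comprehension, not the final min(...) expression. A possible way to compare the complete variants side by side is sketched below. The labels and the number=10_000 repeat count are arbitrary choices, and the timings will vary by machine, so no numbers are shown.

```python
import timeit

# List creation goes in setup so that only the search itself is timed
setup = '''
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 40, 56]
'''

candidates = {
    "loop with break": '''
list3 = []
for i in list1:
    for j in list2:
        if j > i:
            list3.append(j)
            break
''',
    "min over filtered list": "list3 = [min([j for j in list2 if j > i]) for i in list1]",
    "next over generator": "list3 = [next(j for j in list2 if j > i) for i in list1]",
}

# Run each statement the same number of times and collect total seconds
timings = {
    label: timeit.timeit(stmt, setup=setup, number=10_000)
    for label, stmt in candidates.items()
}

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```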

Upvotes: 0

AbbeGijly

Reputation: 1211

To build the expression it helps to ignore the break condition at first:

In [32]: [[j for j in list2 if j > i] for i in list1]
Out[32]: 
[[5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
 [7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
 [7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
 [11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
 [11, 12, 13, 14, 15, 17, 20, 24, 26, 56],
 [17, 20, 24, 26, 56],
 [24, 26, 56],
 [24, 26, 56],
 [26, 56],
 [56]]

From there you can add the min constraint:

In [33]: [min([j for j in list2 if j > i]) for i in list1]
Out[33]: [5, 7, 7, 11, 11, 17, 24, 24, 26, 56]
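
One caveat worth sketching: min picks the smallest larger value, while the original loop with break picks the first larger value it meets. The two agree when list2 is sorted, as it is here, but not in general. The example below also passes a generator to min so no intermediate list is built, and uses default=None (an arbitrary choice) for the case where nothing is larger. The extra value 60 and the unsorted_values list are made up for illustration.

```python
# 60 is added only to show the "no larger value exists" case
list1 = [4, 5, 6, 9, 10, 16, 21, 23, 25, 27, 60]
list2 = [1, 3, 5, 7, 8, 11, 12, 13, 14, 15, 17, 20, 24, 26, 56]

# A generator avoids building a temporary list for every i;
# default=None avoids a ValueError when the generator is empty
list3 = [min((j for j in list2 if j > i), default=None) for i in list1]
print(list3)

# With unsorted input, min and "first match" give different answers
unsorted_values = [9, 2, 7]
smallest_greater = min((j for j in unsorted_values if j > 5), default=None)
first_match = next((j for j in unsorted_values if j > 5), None)
print(smallest_greater, first_match)
```

Here smallest_greater is 7, whereas first_match is 9, which is what the loop with break would have appended.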

Upvotes: 2
