user3259201

Reputation: 109

How to filter lists in a list?

I have, e.g., listy = [[1,-1,2], [3,-2,-4]]

I want to filter such that I get just the positive elements [[1,2], [3]]

This is easy to do with a loop that calls filter(lambda x: x > 0, sublist) on each sublist of listy, but I'd like to avoid a loop.

Upvotes: 1

Views: 3762

Answers (2)

John1024

Reputation: 113844

List comprehensions can be nested:

In [2]: [ [i for i in x if i>0] for x in listy ]
Out[2]: [[1, 2], [3]]

Regarding the choice between filter and a list comprehension, Guido van Rossum, the BDFL, wrote that the list comprehension is both clearer and faster:

filter(P, S) is almost always written clearer as [x for x in S if P(x)], and this has the huge advantage that the most common usages involve predicates that are comparisons, e.g. x==42, and defining a lambda for that just requires much more effort for the reader (plus the lambda is slower than the list comprehension).
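To check the speed claim on your own machine, something like the following rough sketch could be used. The test data, the helper names with_filter and with_comprehension, and the repeat count are all assumptions made for illustration; actual timings depend on the Python version and the hardware, so no numbers are quoted here.

```python
import timeit

# Hypothetical test data: 1000 integers, about half of them positive.
data = list(range(-500, 500))

def with_filter():
    # filter plus a lambda predicate (list() is needed on Python 3).
    return list(filter(lambda x: x > 0, data))

def with_comprehension():
    # The equivalent list comprehension.
    return [x for x in data if x > 0]

# Time 200 calls of each variant; the count is arbitrary.
filter_time = timeit.timeit(with_filter, number=200)
comprehension_time = timeit.timeit(with_comprehension, number=200)

print(f"filter + lambda: {filter_time:.4f}s")
print(f"comprehension:   {comprehension_time:.4f}s")
```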

Advanced Topic: Handling large datasets

If listy is large and your application permits, you may want to use generators in place of lists:

g = ( (i for i in x if i>0) for x in listy )
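As a minimal sketch of how such a nested generator might be consumed (the names result and leftover are only illustrative): nothing is filtered until the inner generators are iterated, and the outer generator can be walked only once.

```python
listy = [[1, -1, 2], [3, -2, -4]]

# Building g does no filtering yet; it only describes the work.
g = ((i for i in x if i > 0) for x in listy)

# Materialize each inner generator as the outer one yields it.
result = [list(inner) for inner in g]
print(result)  # [[1, 2], [3]]

# A generator is exhausted after one pass, so a second pass is empty.
leftover = list(g)
print(leftover)  # []
```

If the filtered values are needed more than once, it is probably simpler to stick with the list version.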

Upvotes: 4

abarnert

Reputation: 365717

Your problem pretty much inherently requires a loop. You want to do something for each member of listy; the "for" is there even in English.

However, you can wrap the loop up inside a comprehension, or a call to map, etc. For example:

listy = map(lambda sublist: filter(lambda x: x>0, sublist), listy)
listy = [[x for x in sublist if x>0] for sublist in listy]

(Or, of course, the other two combinations of the above, a comprehension over filter, or a map over a comprehension.)
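For reference, the other two combinations might look like the sketch below. It assumes Python 3, where map and filter return lazy iterators, so each is wrapped in list() to get actual lists; the variable names are illustrative only.

```python
listy = [[1, -1, 2], [3, -2, -4]]

# A comprehension over filter: the outer loop is the comprehension,
# the inner filtering is done by filter().
comp_over_filter = [list(filter(lambda x: x > 0, sublist)) for sublist in listy]

# A map over a comprehension: the outer loop is map(),
# the inner filtering is a comprehension.
map_over_comp = list(map(lambda sublist: [x for x in sublist if x > 0], listy))

print(comp_over_filter)  # [[1, 2], [3]]
print(map_over_comp)     # [[1, 2], [3]]
```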

You can also do the looping lazily, so instead of wasting time up front creating the filtered lists, you just create iterators that will generate the positive values on demand:

itery = itertools.imap(lambda sublist: itertools.ifilter(lambda x: x>0, sublist), listy)
itery = ((x for x in sublist if x>0) for sublist in listy)

But that just means the looping happens later, when you iterate over each element of itery, rather than right away.

(And again, you can combine the above two ideas, e.g., a list of iterators or an iterator of lists, instead of an iterator of iterators or a list of lists.)
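Note that itertools.imap and itertools.ifilter exist only in Python 2. On Python 3 the built-in map and filter are already lazy, so a rough equivalent of the lazy version, under that assumption, could be:

```python
listy = [[1, -1, 2], [3, -2, -4]]

# On Python 3, map and filter return iterators; no work is done here.
itery = map(lambda sublist: filter(lambda x: x > 0, sublist), listy)

# The filtering happens only as values are pulled out.
first = next(itery)         # a filter iterator over the first sublist
first_values = list(first)  # the looping happens now
print(first_values)  # [1, 2]

# Consume whatever is left of the outer iterator.
rest = [list(inner) for inner in itery]
print(rest)  # [[3]]
```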

You can even hide the loop inside a NumPy element-wise operation by creating a 1D array of Python lists and calling vectorize(lambda sublist: filter(lambda x: x>0, sublist)) on it. But the loop's still going to be there.


Of course you could also get around the loop by looping indirectly. For example, with recursion:

def filter_nonpositives(x):
    if not x:  # base case: an empty list has nothing left to filter
        return []
    return x[:x[0]>0] + filter_nonpositives(x[1:])
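The slice in the recursive version is terse, so here is a commented sketch of the same idea applied to the nested list from the question. The name keep_positives is hypothetical, and the explicit empty-list check is there so that the recursion terminates.

```python
def keep_positives(x):
    # Base case: an empty list has nothing to keep.
    if not x:
        return []
    # x[0] > 0 is a bool, and True/False slice like 1/0, so
    # x[:x[0] > 0] is [x[0]] when x[0] is positive and [] otherwise.
    return x[:x[0] > 0] + keep_positives(x[1:])

listy = [[1, -1, 2], [3, -2, -4]]

# The outer comprehension is still a loop over the sublists.
result = [keep_positives(sublist) for sublist in listy]
print(result)  # [[1, 2], [3]]
```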

Or even by unrolling the loop up to some maximum size:

def filter_nonpositives(x):
    result = []
    if len(x) == 0:
        return result
    if x[0] > 0:
        result.append(x[0])
    if len(x) == 1:
        return result
    if x[1] > 0:
        result.append(x[1])
    if len(x) == 2:
        return result
    if x[2] > 0:
        result.append(x[2])
    # ... repeat as far as you want
    return result

But either way, you're still effectively looping—and you'd be hard-pressed to find anyone who thinks either one is more Pythonic, or better.

Upvotes: 2
