Marcel Wilson

Reputation: 4572

Pythonic list comprehension possible with this loop?

I have a love/hate relationship with list comprehensions. On the one hand, I think they are neat and elegant. On the other hand, I hate reading them (especially ones I didn't write). I generally follow the rule: make it readable until speed is required. So my question is really academic at this point.

I want a list of stations from a table whose strings often have extra spaces. I need those spaces stripped out. Sometimes those stations are blank and should not be included.

stations = []
for row in data:
    if row.strip():
        stations.append(row.strip())

Which translates to this list comprehension:

stations = [row.strip() for row in data if row.strip()]

This works well enough, but it occurs to me that I'm doing strip twice. I guessed that calling .strip() twice is unnecessary and generally slower than storing the result in a variable.

stations = []
for row in data:
    blah = row.strip()
    if blah:
        stations.append(blah)

Turns out I was correct.

> Striptwice list comp 14.5714301669     
> Striptwice loop 17.9919670399
> Striponce loop 13.0950567955

Timeit shows that, of the two loop versions, the second (strip once) is faster. No real surprise here. I am surprised that the list comprehension is only marginally slower even though it calls strip twice.
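The timing harness itself was not posted, so the following is only a minimal sketch of how such a comparison could be run with timeit. The sample data, the function names, and the repeat count are all assumptions made for illustration, and absolute numbers will differ from the ones quoted above.

```python
import timeit

# Hypothetical sample data: padded station names mixed with blank rows.
data = ["  Alpha ", "", "   ", "Bravo  ", " Charlie"] * 1000


def strip_twice_comp():
    # List comprehension that calls strip() in both the filter and the result.
    return [row.strip() for row in data if row.strip()]


def strip_twice_loop():
    # Explicit loop that calls strip() twice per kept row.
    stations = []
    for row in data:
        if row.strip():
            stations.append(row.strip())
    return stations


def strip_once_loop():
    # Explicit loop that stores the stripped value and reuses it.
    stations = []
    for row in data:
        stripped = row.strip()
        if stripped:
            stations.append(stripped)
    return stations


if __name__ == "__main__":
    for func in (strip_twice_comp, strip_twice_loop, strip_once_loop):
        # number=100 is an arbitrary repeat count chosen for this sketch.
        elapsed = timeit.timeit(func, number=100)
        print(func.__name__, elapsed)
```

All three functions should return the same list, so any timing difference comes from how often strip() is called and from loop overhead, not from the result.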

My question: Is there a way to write a list comprehension that only does the strip once?



Results:

Here are the timing results of the suggestions

# @JonClements & @ErikAllik
> Striponce list comp 10.7998494348
# @adhie
> Mapmethod loop 14.4501044569

Upvotes: 15

Views: 1008

Answers (3)

Erik Kaplun

Reputation: 38217

Nested comprehensions can be tricky to read, so my first preference would be:

stripped = (x.strip() for x in data)
stations = [x for x in stripped if x]

Or, if you inline stripped, you get a single (nested) list comprehension:

stations = [x for x in (x.strip() for x in data) if x]

Note that the first/inner comprehension is actually a generator expression, in other words a lazy list comprehension. It produces stripped values on demand, so the data is not iterated twice and no intermediate list is built.
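To make the laziness visible, here is a small sketch. The helper noisy_strip and the sample data are hypothetical; the helper only records each call so you can see when the stripping actually happens.

```python
# Hypothetical sample data for illustration.
data = ["  Alpha ", "", "Bravo  "]

calls = []


def noisy_strip(s):
    # Record every call so we can observe when the work is done.
    calls.append(s)
    return s.strip()


# Creating the generator expression does not call noisy_strip yet.
stripped = (noisy_strip(x) for x in data)
calls_before = len(calls)

# Consuming the generator strips each item exactly once, in a single pass.
stations = [x for x in stripped if x]
calls_after = len(calls)
```

Nothing is stripped until the list comprehension starts pulling values from the generator, and each element is stripped exactly once.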

Upvotes: 13

adhie

Reputation: 284

Apply strip to all the elements using map(), then filter out the empty results.

[item for item in map(lambda x: x.strip(), data) if item]

Upvotes: 1

Jon Clements

Reputation: 142136

There is - create a generator of the stripped strings first, then use that:

stations = [row for row in (row.strip() for row in data) if row]

You could also write it without a comprehension, e.g. (for Python 2.x, swap map for itertools.imap and drop the list call):

stations = list(filter(None, map(str.strip, data)))
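As a quick check that the two forms agree, here is a minimal sketch for Python 3. The sample data and the names via_comp and via_filter are made up for illustration.

```python
# Hypothetical sample data for illustration.
data = ["  Alpha ", "", "   ", "Bravo  "]

# Nested generator expression: strip once, then keep non-empty strings.
via_comp = [row for row in (row.strip() for row in data) if row]

# filter(None, ...) keeps only truthy items, so empty strings are dropped.
# str.strip is called as a plain function, so every element must be a str.
via_filter = list(filter(None, map(str.strip, data)))
```

One design difference to be aware of: map(str.strip, data) raises a TypeError if an element is not a str (for example None), whereas row.strip() works on any object that has a strip method.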

Upvotes: 29
