Reputation: 4572
I have a love/hate relationship with list comprehensions. On the one hand I think they are neat and elegant. On the other hand I hate reading them (especially ones I didn't write). I generally follow the rule of: make it readable until speed is required. So my question is really academic at this point.
I want a list of stations from a table whose strings often have extra spaces. I need those spaces stripped out. Sometimes those stations are blank and should not be included.
stations = []
for row in data:
    if row.strip():
        stations.append(row.strip())
Which translates to this list comprehension:
stations = [row.strip() for row in data if row.strip()]
This works well enough, but it occurs to me that I'm doing strip twice. I guessed that calling .strip() twice was unnecessary and generally slower than assigning the result to a variable.
stations = []
for row in data:
    blah = row.strip()
    if blah:
        stations.append(blah)
Turns out I was correct.
> Striptwice list comp 14.5714301669
> Striptwice loop 17.9919670399
> Striponce loop 13.0950567955
timeit shows that, of the two loop versions, the second (strip once) is faster. No real surprise here. I am surprised, though, that the list comprehension is only marginally slower even though it strips twice.
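For anyone who wants to reproduce a comparison like this, here is a minimal sketch of a timeit harness. The sample data, the function names, and the repeat count are all assumptions made for illustration (the real table is not shown here), so the absolute numbers will differ from the ones above.

```python
import timeit

# Hypothetical sample data; the real table contents are not shown in the question.
data = ["  Alpha ", "", "Bravo", "   ", " Charlie  "] * 1000


def strip_twice_comp():
    # List comprehension that calls strip() twice per row.
    return [row.strip() for row in data if row.strip()]


def strip_twice_loop():
    # Explicit loop that calls strip() twice per non-blank row.
    stations = []
    for row in data:
        if row.strip():
            stations.append(row.strip())
    return stations


def strip_once_loop():
    # Explicit loop that calls strip() once and reuses the result.
    stations = []
    for row in data:
        stripped = row.strip()
        if stripped:
            stations.append(stripped)
    return stations


if __name__ == "__main__":
    for func in (strip_twice_comp, strip_twice_loop, strip_once_loop):
        # number=200 is an arbitrary choice; raise it for steadier timings.
        elapsed = timeit.timeit(func, number=200)
        print(f"{func.__name__}: {elapsed:.4f} s")
```

All three functions should return the same list, so only the timings differ.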
My question: Is there a way to write a list comprehension that only does the strip once?
Results:
Here are the timing results of the suggestions:
# @JonClements & @ErikAllik
> Striptonce list comp 10.7998494348
# @adhie
> Mapmethod loop 14.4501044569
Upvotes: 15
Views: 1008
Reputation: 38217
Nested comprehensions can be tricky to read, so my first preference would be:
stripped = (x.strip() for x in data)
stations = [x for x in stripped if x]
Or, if you inline stripped, you get a single (nested) list comprehension:
stations = [x for x in (x.strip() for x in data) if x]
Note that the first/inner comprehension is actually a generator expression, which is, in other words, a lazy list comprehension; using it avoids iterating over the data twice.
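To make the "strip once" claim concrete, here is a small sketch that counts how often the stripping function is called. The `counting_strip` helper and the sample data are hypothetical and exist only for this demonstration.

```python
# Hypothetical sample data for illustration.
data = ["  Alpha ", "", "Bravo", "   "]

calls = []  # records every string passed to counting_strip


def counting_strip(s):
    # Wrapper around str.strip that logs each call.
    calls.append(s)
    return s.strip()


# The inner generator expression strips each row lazily, exactly once;
# the outer comprehension only filters the already-stripped values.
stations = [x for x in (counting_strip(x) for x in data) if x]

print(stations)    # ['Alpha', 'Bravo']
print(len(calls))  # 4, one call per input row
```

The call count equals the number of input rows, including the blank ones that get filtered out.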
Upvotes: 13
Reputation: 284
Apply strip to all the elements using map() and filter after that.
[item for item in map(lambda x: x.strip(), data) if item]
Upvotes: 1
Reputation: 142136
There is - create a generator of the stripped strings first, then use that:
stations = [row for row in (row.strip() for row in data) if row]
You could also write it without a comp, e.g. (swap to imap and remove list for Python 2.x):
stations = list(filter(None, map(str.strip, data)))
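A quick sanity check that the two forms agree, using made-up sample data (Python 3). Note that `filter(None, ...)` drops every falsy item, which for strings means the empty ones, and that `str.strip` as a bare function only works when every row really is a `str`.

```python
# Hypothetical sample data for illustration.
data = ["  Alpha ", "", "Bravo", "   ", " Charlie  "]

# Nested form: generator of stripped strings, then filter out the blanks.
via_comp = [row for row in (row.strip() for row in data) if row]

# Functional form: map strips lazily, filter(None, ...) drops empty strings.
via_filter = list(filter(None, map(str.strip, data)))

print(via_comp == via_filter)  # True
print(via_filter)              # ['Alpha', 'Bravo', 'Charlie']
```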
Upvotes: 29