Reputation: 1012
I have a list that looks like
test = ['A','B','C','D D','E E','F F']
I would like test to become the following (that is, the spaces removed)
test = ['A', 'B', 'C', 'DD', 'EE', 'FF']
I used a list comprehension in Python to achieve this:
>>> [re.sub(' ','',i) for i in test]
['A', 'B', 'C', 'DD', 'EE', 'FF']
My question is - what if I explicitly DO NOT want re.sub(' ','',i) to run on the first three elements of my list? I only want the re.sub function to run on 'D D', 'E E', and 'F F'.
Is the following approach efficient? I understand that a list comprehension takes up memory because Python builds a new list.
test[3:] = [re.sub(' ','',i) for i in test[3:]]
Or should I just loop through the values of test that I want to modify like this:
for i in range(3,len(test)):
    print i
    test[i] = re.sub(' ','',test[i])
Upvotes: 1
Views: 252
Reputation: 239683
Of re.sub, str.replace, and str.translate, the fastest here is str.replace. So, use str.replace.
Here is a little timing comparison.
import re
from timeit import timeit

def test1():
    test = ['A','B','C','D D','E E','F F']
    test[3:] = [re.sub(' ','',i) for i in test[3:]]

def test2():
    test = ['A','B','C','D D','E E','F F']
    test[3:] = [i.replace(" ", "") for i in test[3:]]

def test3():
    test = ['A','B','C','D D','E E','F F']
    test[3:] = [item.translate(None, " ") for item in test[3:]]

print timeit("test1()", "from __main__ import test1")
print timeit("test2()", "from __main__ import test2")
print timeit("test3()", "from __main__ import test3")
Output on my machine (total seconds for timeit's default of 1,000,000 calls each):
3.96201109886
0.985305070877
1.11600804329
Note: As @roippi mentioned in the comments, str.translate will not work in this form in Python 3.x. So, leave it out of the race if you are using Python 3.x.
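For Python 3 readers, here is a minimal sketch of how the translate variant could be written instead. It assumes Python 3, where str.translate takes a single table argument; the name remove_spaces is just an illustrative choice, and this version was not part of the timings above.

```python
test = ['A', 'B', 'C', 'D D', 'E E', 'F F']

# In Python 3, the third argument of str.maketrans lists characters to delete.
remove_spaces = str.maketrans('', '', ' ')

# Only the elements from index 3 onward are rewritten.
test[3:] = [item.translate(remove_spaces) for item in test[3:]]
print(test)
```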
Upvotes: 2
Reputation: 25974
My question is - what if I explicitly DO NOT want re.sub(' ','',i) to run on the first three elements of my list?
Okay, answering that question first:
You can use enumerate and a conditional expression to specify the behavior you want for i < 3 and i >= 3:
[x if i<3 else re.sub(' ','',x) for i,x in enumerate(test)]
['A', 'B', 'C', 'DD', 'EE', 'FF']
Note that this simple sub operation can be handled more straightforwardly by str.replace.
(I will leave out discussion of whether this sort of optimization is worthwhile, other than saying the time saved by not doing re.sub on the first three elements is minuscule.)
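As a rough sketch of what the str.replace version of the same comprehension might look like (written with Python 3 print syntax; the variable name result is only illustrative):

```python
test = ['A', 'B', 'C', 'D D', 'E E', 'F F']

# Leave the first three elements alone; strip spaces from the rest.
result = [x if i < 3 else x.replace(' ', '') for i, x in enumerate(test)]
print(result)
```

Unlike the slice assignment in the question, this builds a new list and leaves test itself unchanged.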
Upvotes: 1
Reputation: 500933
First of all, it sounds like you're optimizing prematurely.
Secondly, you can express your requirements with a single list comprehension:
In [5]: test = ['A','B','C','D D','E E','F F']
In [6]: [t if i < 3 else re.sub(' ', '', t) for (i, t) in enumerate(test)]
Out[6]: ['A', 'B', 'C', 'DD', 'EE', 'FF']
Finally, my advice would be to focus on correctness first, then on readability. Once you've achieved those, profile the code to see where the bottlenecks are, and only then optimize for performance.
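If you do get to the measuring stage, a small timeit comparison of the two approaches from the question could look something like the sketch below. It assumes Python 3, uses str.replace rather than re.sub, and the function names slice_assign and index_loop are made up for illustration. Timings vary by machine, so none are shown.

```python
from timeit import timeit

def slice_assign():
    test = ['A', 'B', 'C', 'D D', 'E E', 'F F']
    # Build a new list for the tail and assign it back into the slice.
    test[3:] = [s.replace(' ', '') for s in test[3:]]
    return test

def index_loop():
    test = ['A', 'B', 'C', 'D D', 'E E', 'F F']
    # Update the tail elements in place, one index at a time.
    for i in range(3, len(test)):
        test[i] = test[i].replace(' ', '')
    return test

# Correctness first: both approaches must agree before timing them.
assert slice_assign() == index_loop()

for fn in (slice_assign, index_loop):
    print(fn.__name__, timeit(fn, number=100000))
```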
Upvotes: 3