Reputation: 4301
For example, we want to remove all characters before the first a from 123a45b6a789. How to obtain the correct result of 45b6a789?
I tried re.sub('.*a', '', '123a45b6a789') but it gives 789.
Thanks.
Upvotes: 2
Views: 262
Reputation: 536
As Chan said, "we want to remove all characters before the first a". In other words, we need to remove everything from the beginning of the string up to and including the first a: a leading run of non-a characters followed by one a, which is the pattern ^[^a]*a.
import re
# print(...) with a single argument works in both Python 2.7 and Python 3
print(re.sub("^[^a]*a", u"", u"123a45b6a789"))  # output: 45b6a789
print(re.sub("^[^a]*", u"", u"123a45b6a789"))  # output: a45b6a789 (the first a is kept)
I ran a quick timing comparison of the two patterns in Python 2.7 on Linux 16.04; my pattern is faster:
%timeit _ = re.sub("^[^a]*a", u"", '24579999999999999999999999999999999999999999999999999999999999999912734162854614678567ijkljklhhjkja45b6a789')
#1000000 loops, best of 3: 1.29 µs per loop
%timeit _ = re.sub('^.*?a', '', '24579999999999999999999999999999999999999999999999999999999999999912734162854614678567ijkljklhhjkja45b6a789')
# 1000000 loops, best of 3: 1.93 µs per loop
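Outside IPython, the same comparison can be sketched with the standard timeit module. This is only an illustration: SAMPLE is a shortened stand-in for the string above, the function names are made up, and timings depend on the machine, so no numbers are shown.

```python
import re
import timeit

# Shortened stand-in for the long test string used above (an assumption)
SAMPLE = '2457' + '9' * 60 + '12734162854614678567ijkljklhhjkja45b6a789'


def negated_class(s):
    # Leading run of non-a characters, then the first a
    return re.sub(r'^[^a]*a', '', s)


def lazy_dot(s):
    # Anchored non-greedy wildcard, then the first a
    return re.sub(r'^.*?a', '', s)


for fn in (negated_class, lazy_dot):
    seconds = timeit.timeit(lambda: fn(SAMPLE), number=10000)
    print(fn.__name__, seconds)
```

Both functions return the same result for this input, so only the running time differs.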
Upvotes: 0
Reputation: 57155
First of all, using a non-greedy wildcard *? will prevent the whole string up to the last a from being gobbled.
But that's not quite sufficient. This code will illustrate the problem:
print(re.findall(r'.*?a', '123a45b6a789'))  # => ['123a', '45b6a']  <-- whoops, matched twice
You can therefore use re.sub's count parameter to limit yourself to the first match:
re.sub(r'.*?a', '', '123a45b6a789', count=1)  # count=1: replace only the first match
Or use a beginning-of-line anchor:
re.sub(r'^.*?a', '', '123a45b6a789')
Or, skip regex entirely and use constt's solution.
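constt's answer is not reproduced in this thread, so what follows is only a sketch of one regex-free approach, using str.partition. The helper name drop_through_first is made up for illustration, and returning the input unchanged when the key is missing is an assumption, since the question does not cover that case.

```python
def drop_through_first(text, key):
    """Return the part of text after the first occurrence of key."""
    head, sep, tail = text.partition(key)
    # partition returns an empty separator when key was not found
    return tail if sep else text


print(drop_through_first('123a45b6a789', 'a'))  # 45b6a789
```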
Upvotes: 2
Reputation: 3756
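Use the non-greedy ?, and pass count=1 so that only the first match is replaced (without it, every match is removed and you still get 789):
re.sub('.*?a', '', '123a45b6a789', count=1)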
I’d suggest trying out regex on regex webapps to help demystify this. Just google regex and you’ll find one.
Upvotes: 0
Reputation: 1031
Well there's a ton of different ways to skin a cat. But you could do something like the following:
def removeCharBeforeKey(string, key):
    # Split on key, drop the first piece, and rejoin the rest with key
    return key.join(string.split(key)[1:])
where key is the keyword (a, for example) and string is your input (123a45b6a789 in this example).
This splits the string on the keyword and then rejoins everything after the first piece. You could also find the index of the first occurrence and slice from one past that index.
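The index-based variant could look roughly like this. The name remove_through_first is hypothetical, and leaving the string untouched when the key is absent is an assumption (the split/join version returns an empty string in that case).

```python
def remove_through_first(string, key):
    # str.find returns -1 when key does not occur in string
    index = string.find(key)
    if index == -1:
        return string
    # Skip past the whole key, which also handles keys longer than one character
    return string[index + len(key):]


print(remove_through_first('123a45b6a789', 'a'))  # 45b6a789
```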
Upvotes: 0