Reputation: 23
for element in f:
galcode_scan = re.search(ur'blah\.blah\.blah\(\'\w{5,10}', element)
If I try to perform re.sub and remove the blahs with something else and keep the last bit, the \w{5,10} becomes literal. How do I retain the characters that are taken up by that chunk of the regular expression?
EDIT:
Here is the complete code
for element in f:
galcode_scan = re.search(ur'Imgur\.Util\.triggerView\(\'\w{5,10}', element)
galcode_scan = re.sub(r'Imgur\.Util\.triggerView\(\'\w{5,10}', 'blah\.\w{5,10}', ur"galcode_scan\.\w{5,10}")
print galcode_scan
Upvotes: 1
Views: 1023
Reputation: 91017
You can as well work this way:
import re
element = "Imgur.Util.triggerView('glglgl')"
galcode_scan = re.search(ur'Imgur\.Util\.triggerView\(\'(\w{5,10})\'\)', element)
Now you have a match object which you can furtherly use: with either of
galcode_scan.expand('replacement.\\1')
galcode_scan.expand('replacement.\g<1>')
you get replacement.glglgl
as a result.
This works by applying the replacement string with the captured groups.
Upvotes: 0
Reputation: 4628
You can use positive lookahead ((?=...)
) to not to match when replacing but matching as a whole pattern:
re.sub("blah\.blah\.blah\(\'(?=\w{5,10})", "", "blah.blah.blah('qwertyu")
'qwertyu'
If you want to replace you match, just add it to the replacement parameter:
re.sub("blah\.blah\.blah\(\'(?=\w{5,10})", "pref:", "blah.blah.blah('qwertyu")
'pref:qwertyu'
You can also do it by capturing the pattern ((..)
) and back-referencing it (\1
.. \9
):
re.sub("blah\.blah\.blah\(\'(\w{5,10})", "pref:\\1", "blah.blah.blah('qwertyu")
'pref:qwertyu'
Update
A more precise pattern for the provided exmples:
re.sub("Imgur\.Util\.triggerView'(?=\w{5,10})", "imgurl.com/", "Imgur.Util.triggerView'B1ahblA4")
'imgurl.com/B1ahblA4'
The pattern here is a simple string, so whatever you need to make dynamic you can use a variable for it. For example to use different mappings:
map = {
'Imgur\.Util\.triggerView\'': 'imgurl.com/',
'Example\.Util\.triggerView\'': 'example.com/'
}
items = [
"Imgur.Util.triggerView'B1ahblA4",
"Example.Util.triggerView'FooBar"
]
for item in items:
for old, new in map.iteritems():
pattern = old + '(?=\w{5,10})'
if re.match(pattern, item):
print re.sub(pattern, new, item)
imgurl.com/B1ahblA4
example.com/FooBar
Upvotes: 1