Reputation: 619
So, I am successfully matching and extracting some special tagged text using the following regular expression:
import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")  # '%' needs no escaping in a regex
theIds = p.findall(theString)
That returns
[u'123453', u'984561', u'123456']
which is exactly what I need. Next, I need to replace each of those tags with a looked-up value, so what I'd like to get is this:
[u'Var 1 value: ', u', Var 2 value: ', u', Var 3 value: ']
So that I can glue those strings together with the looked up values from the first list, resulting in a string that looks something like this:
u"Var 1 value: Some Value, Var 2 value: 837, Var 3 value: more stuff"
Or, if there's a better way to do the replacement I'm all ears.
Thanks in advance!
Upvotes: 2
Views: 1414
Reputation: 1121524
Use a replacement function to insert arbitrary substitutions. See the re.sub documentation for how the function works. Here is an example:
import re

# Map each captured ID to its replacement text
values = {
    u'123453': u'Some Value',
    u'984561': u'837',
    u'123456': u'more stuff',
}

def insertLookup(matchobj):
    # group(1) is the captured ID, e.g. u'123453'
    return values[matchobj.group(1)]

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")
newString = p.sub(insertLookup, theString)
print(newString)
# prints: Var 1 value: Some Value, Var 2 value: 837, Var 3 value: more stuff
The insertLookup function is called once for each match and is passed a MatchObject. We then use the matched value (u'123453', etc.) to look up the replacement value, which is inserted into newString in place of the matched text.
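One thing to watch for with a plain dictionary lookup is an ID that has no entry, which raises a KeyError. A minimal sketch of one way to handle that, assuming you would rather leave unknown tags untouched (the names insert_lookup, pattern, and the shortened values table here are illustrative, not from the question):

```python
import re

# Hypothetical lookup table; note that ID 123456 is deliberately missing.
values = {u"123453": u"Some Value", u"984561": u"837"}

pattern = re.compile(r"%%v:([0-9]+)%%")

def insert_lookup(match):
    # group(1) is the captured ID; group(0) is the whole tag, e.g. %%v:123456%%.
    # Fall back to the original tag when the ID has no entry.
    return values.get(match.group(1), match.group(0))

the_string = u"Var 1 value: %%v:123453%%, Var 3 value: %%v:123456%%"
new_string = pattern.sub(insert_lookup, the_string)
print(new_string)
```

Whether to keep the tag, substitute a placeholder, or raise an error for unknown IDs depends on your data; the fallback above is just one option.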
Upvotes: 2
Reputation: 208435
How about the following?
import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")
replacements = [u"Some Value", u"837", u"more stuff"]
# pop(0) hands out the replacements in order, one per match (and empties the list)
newString = p.sub(lambda m: replacements.pop(0), theString)
You can provide a function to re.sub(); in this case the function takes the first item from the replacements list and substitutes it for the match.
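Because pop(0) removes items as it goes, the replacements list is empty afterwards. If you need to keep the list, a possible variant (a sketch only; the name remaining is made up for illustration) is to draw from an iterator instead:

```python
import re

the_string = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
pattern = re.compile(r"%%v:([0-9]*)%%")
replacements = [u"Some Value", u"837", u"more stuff"]

# Iterate over the list instead of popping from it, so the list stays intact.
remaining = iter(replacements)
new_string = pattern.sub(lambda m: next(remaining), the_string)
print(new_string)
```

Like the pop(0) version, this assumes there are at least as many replacements as matches; with too few, next() raises StopIteration partway through the substitution.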
edit: I misread the question and missed that you want to look up the replacement values based on the initial values; you probably want something like Martijn's answer for the replacement. As for returning all the text that does not match, you can remove the group from your regex and then use re.split():
>>> theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
>>> p = re.compile("%%v:[0-9]*%%")
>>> p.split(theString)
[u'Var 1 value: ', u', Var 2 value: ', u', Var 3 value: ', u'']
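If you keep the capturing group instead, re.split() also returns the captured IDs, interleaved with the surrounding text, which gives you both lists from the question in a single pass. A rough sketch of gluing them back together, assuming a lookup dictionary like the one in Martijn's answer (the names pieces and rebuilt are illustrative):

```python
import re

values = {u"123453": u"Some Value", u"984561": u"837", u"123456": u"more stuff"}
the_string = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"

# With a capturing group, split() keeps the captured text in the result:
# even indices hold the plain text, odd indices hold the IDs.
pieces = re.split(r"%%v:([0-9]*)%%", the_string)

rebuilt = u"".join(
    values.get(piece, piece) if index % 2 else piece
    for index, piece in enumerate(pieces)
)
print(rebuilt)
```

For a straight substitution, re.sub() with a function is shorter; the split approach is mainly useful if you need the text fragments and the IDs separately.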
Upvotes: 2
Reputation: 298126
Can't you just split(', ') the string and work with the individual pieces?
A naive solution of mine would be something like this:
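import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")  # compile once, outside the loop
for chunk in theString.split(', '):
    match = p.search(chunk)  # search this chunk, not the whole string
    if match:
        # Remove the whole tag, leaving only the surrounding text
        theOpposite = chunk.replace(match.group(0), u'')
        print(theOpposite)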
Upvotes: 0