machomeautoguy

Reputation: 619

How do I get a Python regular expression to return all text not matching?

So, I am successfully matching and extracting some special tagged text using the following regular expression:

import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")
theIds = p.findall(theString)

That returns

[u'123453', u'984561', u'123456']

which is exactly what I need. Next, I need to replace those with some looked up value, so what I'd like to get next is this:

[u'Var 1 value: ', u', Var 2 value: ', u', Var 3 value: ']

So that I can glue those strings together with the looked up values from the first list, resulting in a string that looks something like this:

u"Var 1 value: Some Value, Var 2 value: 837, Var 3 value: more stuff"

Or, if there's a better way to do the replacement I'm all ears.

Thanks in advance!

Upvotes: 2

Views: 1414

Answers (4)

Martijn Pieters

Reputation: 1121524

Use a replacement function to insert arbitrary substitutions. See the re.sub documentation for how the function works. Here is an example:

import re

values = {
    u'123453': u'Some Value',
    u'984561': u'837',
    u'123456': u'more stuff',
}

def insertLookup(matchobj):
    # group(1) is the captured ID between "%%v:" and "%%"
    return values[matchobj.group(1)]

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")
newString = p.sub(insertLookup, theString)

print(newString)
# Var 1 value: Some Value, Var 2 value: 837, Var 3 value: more stuff

The insertLookup function is called once for each match and is passed a match object. It uses the captured group (u'123453', etc.) to look up the replacement value, which is then inserted into newString in place of the matched text.
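
One thing to watch for: if an ID is missing from the lookup table, values[...] raises a KeyError. Here is a minimal sketch of one way to handle that, assuming you would rather leave unknown tags untouched. The insert_lookup_safe name and the fallback behaviour are my own assumptions for illustration, not something the question asked for:

```python
import re

# Deliberately incomplete lookup table: u'123456' is missing.
values = {
    u'123453': u'Some Value',
    u'984561': u'837',
}

p = re.compile(r"%%v:([0-9]*)%%")

def insert_lookup_safe(matchobj):
    # group(1) is the captured ID, group(0) is the whole "%%v:...%%" tag.
    # Fall back to the original tag when the ID is unknown.
    return values.get(matchobj.group(1), matchobj.group(0))

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
newString = p.sub(insert_lookup_safe, theString)

print(newString)
# Var 1 value: Some Value, Var 2 value: 837, Var 3 value: %%v:123456%%
```

Returning matchobj.group(0) keeps the unresolved tag visible in the output, which makes missing lookups easy to spot.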

Upvotes: 2

Andrew Clark

Reputation: 208435

How about the following?

import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")
replacements = ["Some Value", "837", "more stuff"]
newString = p.sub(lambda m: replacements.pop(0), theString)

You can provide a function to re.sub(). In this case, the function pops the first item off the replacements list and substitutes it for the match, so the list must hold the replacements in the order the matches occur.

Edit: I misread the question and missed that you want to look up the replacement values based on the initial values. You probably want something like Martijn's answer for your replacement. As for returning all text not matching, you can remove the group from your regex and then use re.split():

>>> theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
>>> p = re.compile("%%v:[0-9]*%%")
>>> p.split(theString)
[u'Var 1 value: ', u', Var 2 value: ', u', Var 3 value: ', u'']
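
If you keep the capturing group instead, re.split() also returns the captured IDs, interleaved with the surrounding text. That gives you both of your lists in one pass. A rough sketch of gluing them back together, assuming a values dict like the one in Martijn's answer (the parts and pieces names are just for illustration):

```python
import re

# Assumed lookup table, same shape as in the other answer.
values = {
    u'123453': u'Some Value',
    u'984561': u'837',
    u'123456': u'more stuff',
}

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")

# With one capturing group, split() alternates:
# literal text at even indexes, captured IDs at odd indexes.
parts = p.split(theString)

# Swap each captured ID for its looked-up value, keep the literal text.
pieces = [values[part] if i % 2 else part for i, part in enumerate(parts)]
newString = u"".join(pieces)

print(newString)
# Var 1 value: Some Value, Var 2 value: 837, Var 3 value: more stuff
```

For a plain substitution, sub() with a function is still simpler. The split approach is mostly useful if you need the pieces separately.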

Upvotes: 2

Jim Clay

Reputation: 1003

Instead of "p.findall" use "p.sub".

Upvotes: 0

Blender

Reputation: 298126

Can't you just split(', ') the string and work with the individual pieces?

A naive solution of mine would be something like this:

import re

theString = u"Var 1 value: %%v:123453%%, Var 2 value: %%v:984561%%, Var 3 value: %%v:123456%%"
p = re.compile(r"%%v:([0-9]*)%%")

for chunk in theString.split(', '):
    # Search this chunk only, not the whole string
    theIds = p.findall(chunk)

    # Strip the full tag to leave the non-matching text
    theOpposite = chunk.replace(u"%%v:" + theIds[0] + u"%%", u"")

Upvotes: 0
