daniel402

Reputation: 67

Implicit or undefined match object

What I've learned from working with Python's re module is that you always create a match object when working with re functions.

Could someone please explain why and how this piece of code works? I can't get through it.

import re

text = "1 < than 2 > 0 & not 'NULL'"

html_escapes = {'&': '&amp;',
                '<': '&lt;',
                '>': '&gt;',
                '"': '&quot;',
                '\'': '&apos;'}


def multiwordreplace(txt, worddict):
    rc = re.compile('|'.join(map(re.escape, worddict)))
    def translate(match):
        return worddict[match.group(0)]
    return rc.sub(translate, txt)

print(multiwordreplace(text, html_escapes))

Where does this match object come from?

Upvotes: 0

Views: 200

Answers (2)

jonrsharpe

Reputation: 122091

x = re.compile(a)
x.sub(b, c)

is equivalent to

re.sub(a, b, c)

i.e. a is the pattern (compiled into the regex object x), b is the replacement repl and c is the string.

In this case, the repl is a function, translate. From the docs:

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.

The match parameter is supplied by re.sub for each match in the string, and the function returns the appropriate replacement from worddict to substitute into txt.

You could also write it as:

return rc.sub(lambda match: worddict[match.group(0)], txt)
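To see what the replacement function actually receives, here is a small sketch. The names seen and record_and_upper are made up for illustration and are not part of the question's code; the callback simply records a few attributes of each match object it is handed:

```python
import re

seen = []  # hypothetical list, used only to record what the callback receives

def record_and_upper(match):
    # `match` is the match object that sub() passes in for each occurrence
    seen.append((match.group(0), match.start(), match.end()))
    return match.group(0).upper()

result = re.sub(r"[a-c]", record_and_upper, "a-b-x-c")
print(result)  # A-B-x-C
print(seen)    # [('a', 0, 1), ('b', 2, 3), ('c', 6, 7)]
```

You never construct the match object yourself; sub creates one per match and passes it in.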

Upvotes: 3

willeM_ Van Onsem

Reputation: 477190

I assume that you mean where match in:

def translate(match):
    return worddict[match.group(0)]

originates from. Python supports the concept of functional programming where one can pass a function as an argument.

So when you call rc.sub as:

rc.sub(translate, txt)

translate is a function. rc.sub looks for matches in txt and, for each match, calls the function with a generated match object as its argument. The value the function returns is used as the substitute for that match.

Another example is the map function:

def map(f, lst):
    result = []
    for x in lst:
        result.append(f(x))
    return result

So you call map with a function f. It then iterates over lst and, for each element x, calls f with x. The result of each call is appended to the result list.

So you don't have to call translate with an argument yourself to get a value. You can pass the function itself, so that another function can call it with several (different) values.
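The same idea can be applied to rc.sub. The following is only a rough sketch of the behaviour, not how the re module is actually implemented: simple_sub is a hypothetical helper, and it ignores details such as count, string replacements, and the special handling of empty matches.

```python
import re

def simple_sub(pattern, repl_func, text):
    """Simplified stand-in for pattern.sub(repl_func, text) (hypothetical helper)."""
    pieces = []
    last_end = 0
    for match in pattern.finditer(text):
        # Keep the unmatched text between the previous match and this one
        pieces.append(text[last_end:match.start()])
        # This is where the match object "comes from": it is passed in here
        pieces.append(repl_func(match))
        last_end = match.end()
    # Keep whatever follows the last match
    pieces.append(text[last_end:])
    return "".join(pieces)

escapes = {"<": "&lt;", ">": "&gt;", "&": "&amp;"}
rc = re.compile("|".join(map(re.escape, escapes)))

out = simple_sub(rc, lambda m: escapes[m.group(0)], "1 < 2 & 3 > 0")
print(out)  # 1 &lt; 2 &amp; 3 &gt; 0
```

For this input the sketch gives the same result as rc.sub with the same function, which shows that the caller only supplies the function and the match objects are created inside the substitution loop.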

Upvotes: 0
