fj123x

Reputation: 7512

python match regular expression

I need to match a subject against a regex and link each captured group to the corresponding key in a key mask:

key_mask = 'foo/{one}/bar/{two}/hello/{world}'

regex_mask = 'foo/(.*)/bar/(.*)/hello/(.*)'

subject = 'foo/test/bar/something/hello/xxx'

The result should be:

{
"one": "test",
"two": "something",
"world": "xxx"
}

What is the best way to get this result from the 3 inputs?

(This is for simple URL routing, similar to Symfony: http://symfony.com/doc/current/book/routing.html )

Thanks!

Upvotes: 1

Views: 209

Answers (2)

Bakuriu

Reputation: 101989

The simplest way would be to use named groups: instead of a plain (.*), use (?P<name>.*) and then call the groupdict() method of the match object.

However, if you cannot change the inputs to your problem (because you are getting them from another library, or for whatever other reason), you can automatically build a named-group regex from the key_mask using re.sub with a simple function as repl:

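import re

def to_named_group(match):
    # match.group(0) is the whole placeholder, e.g. '{one}'; strip the braces
    # to get the group name. The name must be a valid Python identifier.
    return '(?P<{}>.*)'.format(match.group(0)[1:-1])

def make_regex(key_mask):
    # Replace every {name} placeholder with the named group (?P<name>.*).
    return re.compile(re.sub(r'\{[^}]+\}', to_named_group, key_mask))

def find_matches(key_mask, text):
    # Raises AttributeError if text does not match (match() returns None).
    return make_regex(key_mask).match(text).groupdict()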

Used as:

In [10]: find_matches('foo/{one}/bar/{two}/hello/{world}', 'foo/test/bar/something/hello/xxx')
Out[10]: {'one': 'test', 'two': 'something', 'world': 'xxx'}

Update based on your comment:

It's easy to pass further information about the regexes to produce into the replacement function. For example, you could change the code to:

import re
from functools import partial

def to_named_groups(match, regexes):
    # Strip the braces from the placeholder, e.g. '{one}' -> 'one'.
    group_name = match.group(0)[1:-1]
    # Use the pattern supplied for this name, or '.*' if there is none.
    group_regex = regexes.get(group_name, '.*')
    return '(?P<{}>{})'.format(group_name, group_regex)

def make_regex(key_mask, regexes):
    regex = re.sub(r'\{[^}]+\}', partial(to_named_groups, regexes=regexes),
                   key_mask)
    return re.compile(regex)

def find_matches(key_mask, text, regexes=None):
    if regexes is None:
        regexes = {}
    found = make_regex(key_mask, regexes).search(text)
    # search() returns None when nothing matches.
    if found is None:
        return None
    return found.groupdict()

In this way you can control what should be matched by each named-group.
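As a rough illustration of the regexes argument, here is a self-contained sketch with a condensed copy of the functions above. The route, the rule patterns and the sample paths are made up for the example and are not part of the question.

```python
import re
from functools import partial

def to_named_groups(match, regexes):
    # Strip the braces from the placeholder, e.g. '{id}' -> 'id'.
    group_name = match.group(0)[1:-1]
    # Fall back to '.*' when no pattern is given for this name.
    group_regex = regexes.get(group_name, '.*')
    return '(?P<{}>{})'.format(group_name, group_regex)

def make_regex(key_mask, regexes):
    return re.compile(
        re.sub(r'\{[^}]+\}', partial(to_named_groups, regexes=regexes), key_mask))

def find_matches(key_mask, text, regexes=None):
    found = make_regex(key_mask, regexes or {}).search(text)
    return found.groupdict() if found else None

# Hypothetical route: 'id' must be digits, 'slug' a single path segment.
route = 'user/{id}/post/{slug}'
rules = {'id': r'\d+', 'slug': r'[^/]+'}

ok = find_matches(route, 'user/42/post/hello-world', rules)
bad = find_matches(route, 'user/abc/post/hello-world', rules)
print(ok)   # the captured values for 'id' and 'slug'
print(bad)  # None, because 'abc' does not satisfy \d+
```

Restricting a group to [^/]+ also keeps it from swallowing several path segments, which the default .* will happily do.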

Upvotes: 2

alex vasi

Reputation: 5344

The simplest thing that comes to mind is to use named groups in the regular expression:

>>> import re
>>> regex_mask = 'foo/(?P<one>.*)/bar/(?P<two>.*)/hello/(?P<world>.*)'
>>> subject = 'foo/test/bar/something/hello/xxx'
>>> re.match(regex_mask, subject).groupdict()
{'world': 'xxx', 'two': 'something', 'one': 'test'}
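If the three inputs really have to stay as given, one possible sketch is to pull the names out of key_mask and pair them with the positional groups of regex_mask. The helper name link_groups is hypothetical, and the approach assumes the n-th placeholder in key_mask corresponds to the n-th capturing group in regex_mask.

```python
import re

key_mask = 'foo/{one}/bar/{two}/hello/{world}'
regex_mask = 'foo/(.*)/bar/(.*)/hello/(.*)'
subject = 'foo/test/bar/something/hello/xxx'

def link_groups(key_mask, regex_mask, subject):
    # Placeholder names, in order of appearance in the key mask.
    names = re.findall(r'\{([^}]+)\}', key_mask)
    found = re.match(regex_mask, subject)
    if found is None:
        return None
    # Assumption: the n-th name belongs to the n-th capturing group.
    return dict(zip(names, found.groups()))

result = link_groups(key_mask, regex_mask, subject)
print(result)
```

This keeps regex_mask untouched, at the cost of relying on the two masks staying in the same order.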

Upvotes: 3
