lucemia

Reputation: 6617

Regular expressions: how to return the replacement for each matching pattern

This is hard to explain clearly, so please see the following code. Basically, I want simple logic: 1. If the pattern matches, return repl. 2. If repl contains a group reference, replace the reference with the matched text.

import re

patterns = [
    # (pattern, repl)
 (r'utm_source', r'ad'),
 (r'utm_source=([\w]+)', r'ad:\1'),
 (r'mayuki', r'visit'),
 (r'showProduct', r'product'),
 (r'CrShopCar',   r'cart'),
 (r'CrShopCar03', r'payment'),
]

def parse(url):
    # For each pattern that matches, collect its repl.
    # The following fails: re.sub returns the whole URL with the
    # match substituted, not just the expanded repl.
    actions = []
    for pattern, repl in patterns:
        if re.findall(pattern, url):
            actions.append(re.sub(pattern, repl, url))
    return actions



assert parse('http://www.mayuki.com.tw') == ["visit"]
assert parse('www.mayuki.com.tw/showProduct=123') == ["visit", "product"]
assert parse('www.mayuki.com.tw/?utm_source=yahoo') == ["ad", "ad:yahoo"]

Upvotes: 0

Views: 54

Answers (1)

georg

Reputation: 214949

I guess the last one should return ['ad', 'ad:yahoo', 'visit'], since 'mayuki' also matches that URL. Given that, replace the loop in parse with:

for pattern, repl in patterns:
    m = re.search(pattern, url)
    if m:
        actions.append(m.expand(repl))

Docs: Match.expand (in the re module)
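For completeness, here is a sketch of how the whole function might look with that loop dropped in. It reuses the pattern list from the question (renamed PATTERNS here purely for illustration) and assumes the third test should expect 'visit' as well, as guessed above.

```python
import re

# Pattern list copied from the question; the name PATTERNS is illustrative.
PATTERNS = [
    # (pattern, repl)
    (r'utm_source', r'ad'),
    (r'utm_source=([\w]+)', r'ad:\1'),
    (r'mayuki', r'visit'),
    (r'showProduct', r'product'),
    (r'CrShopCar', r'cart'),
    (r'CrShopCar03', r'payment'),
]

def parse(url):
    """Return the expanded repl of every pattern that matches url."""
    actions = []
    for pattern, repl in PATTERNS:
        m = re.search(pattern, url)
        if m:
            # expand() substitutes group references such as \1 in repl
            # and returns only the template, not the rest of the URL.
            actions.append(m.expand(repl))
    return actions

print(parse('http://www.mayuki.com.tw'))
print(parse('www.mayuki.com.tw/showProduct=123'))
print(parse('www.mayuki.com.tw/?utm_source=yahoo'))
```

The difference from the original attempt is that re.sub returns the entire input string with the match replaced, whereas m.expand(repl) returns just the filled-in template. Results come out in the order of the pattern list.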

Upvotes: 3
