Austin

Reputation: 135

Find all occurrences of multiple regex conditions using python regex

Given two different regex patterns, I want to find all occurrences of both. If only pattern 1 matches, return that match; if only pattern 2 matches, return that match; and if both match, return both. How do I run multiple regexes (two, in this case) in one statement?

Given input string :

"https://test.com/change-password?secret=12345;[email protected];previous_password=hello;new=1"

I want to get the values of email and secret only, so the expected output is ['12345', '[email protected]'].

import re
print(re.search(r"(?<=secret=)[^;]+", s).group())
print(re.search(r"(?<=email=)[^;]+", s).group())

I can get the expected output by running the regex multiple times. How do I achieve this in a single statement? I don't want to call re.search twice.

Upvotes: 0

Views: 1778

Answers (4)

Austin

Reputation: 135

I ended up using urllib, as suggested by @ctwheels, to mask the sensitive parameters:

import urllib.parse as urlparse
from urllib.parse import urlencode, urlunparse

# input_string is the URL from the question
url_exclude = ["email", "secret"]
url_parsed_string = urlparse.urlparse(input_string)
# The sample URL separates parameters with ";" rather than "&";
# the separator argument is supported in Python 3.10+
parsed_columns = urlparse.parse_qs(url_parsed_string.query, separator=";")
for exclude_column in url_exclude:
    if exclude_column in parsed_columns:
        parsed_columns[exclude_column] = ["xxxxxxxxxx"]
# doseq=True encodes each list element rather than the list's repr
qstr = urlencode(parsed_columns, doseq=True)
base_url = urlunparse((url_parsed_string.scheme, url_parsed_string.netloc,
                       url_parsed_string.path, url_parsed_string.params, qstr,
                       url_parsed_string.fragment))
print(base_url)
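
For reference, the same urllib approach can also extract the two values asked for in the question instead of masking them. This is only a minimal sketch, assuming Python 3.10+ (for the separator argument of parse_qs). The helper name extract_params and the email address user@example.com are placeholders for illustration, not taken from the original post:

```python
from urllib.parse import urlparse, parse_qs


def extract_params(url, wanted=("secret", "email")):
    """Return the first value of each wanted query parameter that is present."""
    query = urlparse(url).query
    # The sample URL separates parameters with ";" instead of "&".
    params = parse_qs(query, separator=";")
    return [params[key][0] for key in wanted if key in params]


# Placeholder URL; the email address is an assumption for illustration.
sample_url = ("https://test.com/change-password?secret=12345;"
              "email=user@example.com;previous_password=hello;new=1")
print(extract_params(sample_url))
```

Keys that are missing from the URL are simply skipped, which covers the "only one of them is present" case without any regex.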

Upvotes: 0

Jan

Reputation: 43169

You could use a dict comprehension:

import re
url = "https://test.com/change-password?secret=12345;[email protected];previous_password=hello;new=1"

rx = re.compile(r'(?P<key>\w+)=(?P<value>[^;]+)')

dict_ = {m['key']: m['value'] for m in rx.finditer(url)}

# ... then afterwards ...
lst_ = [dict_[key] for key in ("secret", "email") if key in dict_]
print(lst_)
# ['12345', '[email protected]']

Upvotes: 1

NybbleStar

Reputation: 71

import re

print(re.findall("(?<=secret=)[^;]+|(?<=email=)[^;]+", s))

# output
# ['12345', '[email protected]']
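
To show how the alternation behaves when only one (or neither) of the keys is present, here is a small sketch. The input strings are made up for illustration, and user@example.com is a placeholder address rather than the value from the question:

```python
import re

# One compiled pattern covering both keys via alternation.
PATTERN = re.compile(r"(?<=secret=)[^;]+|(?<=email=)[^;]+")

# Placeholder inputs; the email address is an assumption for illustration.
both = "https://test.com/change-password?secret=12345;email=user@example.com;new=1"
only_secret = "https://test.com/change-password?secret=12345;new=1"
neither = "https://test.com/change-password?new=1"

print(PATTERN.findall(both))         # ['12345', 'user@example.com']
print(PATTERN.findall(only_secret))  # ['12345']
print(PATTERN.findall(neither))      # []
```

Note that the lookbehinds only check the text immediately before the value, so a key such as my_secret= would match as well.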

Upvotes: 1

evnp

Reputation: 301

>>> re.findall(r"((?:(?<=email=)|(?<=secret=))[^;]+)", s)
['12345', '[email protected]']

But now you'll need a way of identifying which of the resulting values is the secret and which is the email. I'd recommend also extracting this information with the regex (which also eliminates the lookbehind):

>>> dict(kv.split('=', 1) for kv in re.findall(r"((?:secret|email)=[^;]+)", s))
{'secret': '12345', 'email': '[email protected]'}
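
A possible variation on the same idea captures the key and the value as separate named groups, which avoids the split step and requires a delimiter before the key so that names like previous_secret are not picked up. This is a sketch under assumptions: PAIR and extract_pairs are hypothetical names, and user@example.com is a placeholder address:

```python
import re

# Require "?", ";" or "&" (or start of string) before the key so that longer
# parameter names ending in "secret" or "email" are not matched by accident.
PAIR = re.compile(r"(?:^|[?;&])(?P<key>secret|email)=(?P<value>[^;&]*)")


def extract_pairs(url):
    """Map each matched key to its value; absent keys are simply left out."""
    return {m.group("key"): m.group("value") for m in PAIR.finditer(url)}


# Placeholder URL; the email address is an assumption for illustration.
sample = ("https://test.com/change-password?secret=12345;"
          "email=user@example.com;previous_password=hello;new=1")
print(extract_pairs(sample))
# {'secret': '12345', 'email': 'user@example.com'}
```

Because the value group stops only at ";" or "&", a value that itself contains "=" is kept whole.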

Upvotes: 3
