Michael Hecht

Reputation: 2251

Python regular expression with different levels of greediness

I want to extract the IP address from the following string with Python's re module:

"aa:xv172.31.2.27bb,"

Using the following pattern

ip_address = re.sub(r'.*([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}).*',
                    r'\1',ip_address)

results in

2.31.2.27

because the leading .* is as greedy as possible and consumes the first two digits of the address. I want the IP address matcher to be "more greedy" so that it captures the full IP address. How can I do this?

Upvotes: 1

Views: 53

Answers (1)

Sundeep

Reputation: 23667

Use re.search when you want to extract something:

>>> s = "aa:xv172.31.2.27bb,"
>>> re.search(r'\d{1,3}(\.\d{1,3}){3}', s)[0]
'172.31.2.27'

If you want to do it with re.sub in this case, make the first .* non-greedy:

>>> re.sub(r'.*?(\d{1,3}(\.\d{1,3}){3}).*', r'\1', s)
'172.31.2.27'
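
To see why the non-greedy prefix changes the result, it can help to capture the prefix as well and compare what each variant consumes. This is only an illustrative sketch: the extra `(.*)` group and the names `ip`, `greedy`, and `lazy` are added here for demonstration and are not part of the original pattern.

```python
import re

s = "aa:xv172.31.2.27bb,"
# Same IP sub-pattern as above, with a non-capturing inner group.
ip = r'(\d{1,3}(?:\.\d{1,3}){3})'

# Greedy prefix: takes as much as it can while still leaving
# something that the IP sub-pattern can match.
greedy = re.match(r'(.*)' + ip, s)
# Non-greedy prefix: takes as little as it can.
lazy = re.match(r'(.*?)' + ip, s)

print(greedy.groups())  # ('aa:xv17', '2.31.2.27')
print(lazy.groups())    # ('aa:xv', '172.31.2.27')
```

The greedy prefix swallows "17" because "2.31.2.27" is still a valid match for the rest of the pattern, so the regex engine has no reason to give those digits back.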

Note that \d{1,3} isn't a precise way to match an IP address octet, since it also accepts values above 255 (for example 999). You can either construct a stricter pattern yourself, or use a module such as https://github.com/madisonmay/CommonRegex (this one uses 25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]? for each octet, which can be further simplified, but you get the idea).

See also: https://docs.python.org/3/howto/ipaddress.html
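
As a rough sketch of how the stricter octet pattern and the standard-library ipaddress module could be combined, something like the following might work. The names `OCTET`, `IPV4`, and `find_ipv4` are hypothetical, and the digit lookarounds are an added assumption (so that a match does not start or end in the middle of a longer digit run); adjust them to your input.

```python
import ipaddress
import re

# One octet in the range 0-255, using the alternation quoted above.
OCTET = r'(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
# Four octets separated by dots, not directly adjacent to other digits.
IPV4 = re.compile(rf'(?<!\d){OCTET}(?:\.{OCTET}){{3}}(?!\d)')

def find_ipv4(text):
    """Return the first IPv4 address found in text, or None."""
    m = IPV4.search(text)
    if m is None:
        return None
    try:
        # Let ipaddress make the final decision on validity.
        return str(ipaddress.ip_address(m.group(0)))
    except ValueError:
        return None

print(find_ipv4("aa:xv172.31.2.27bb,"))  # 172.31.2.27
print(find_ipv4("999.1.2.3"))            # None
```

The regex only locates a candidate; passing it through ipaddress.ip_address is a second check that rejects anything the module does not consider a valid address.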

Upvotes: 1
