w00tw00t

Reputation: 53

Python regex to match multiple times, store results separately

I'm a network engineer trying to dip my toes into programming, and Python was recommended to me.

What I'm trying to do is extract specific data from a multi-line string using a regular expression. The data to work with is stored in SourceData.

SourceData = '''
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1'''

The number of lines stored in SourceData is not known in advance. It could be anything from 0 lines (empty) upward.

I want to match all lines containing IPv4 addresses starting with 11.

This is what I've come up with as a start:

import re

ip1 = re.search(r'11\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}', SourceData)
if ip1:
    ip1 = ip1.group()  # re.search stops at the first match

Verify:

>>> print(ip1)
11.22.33.44

OK, seems to work. The idea is that when the whole SourceData is matched, with the example provided, the final result for this case would be 4 matches:

ip1 = 11.22.33.44
ip2 = 11.11.12.11
ip3 = 11.11.13.11
ip4 = 11.11.14.0

The next thing to learn: how do I keep checking SourceData for more matches as described above, and how do I store the multiple matches for use later in the code? For example, later in the code I would like to use the value from a specific match, let's say match number 4 (11.11.14.0).

I have read some guides on Python and regex, but it seems I don't quite understand them :)

Upvotes: 5

Views: 1389

Answers (3)

Jan

Reputation: 43169

There are several methods; one of them is:

import re

string = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""

rx = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})', re.M)

lines = [match.group(1) for match in rx.finditer(string)]
print(lines)    

This yields:

['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']


The core here is:

^            # match start of the line
[^\d\n]*     # NOT a digit or a newline, 0+ times
(            # open capture group 1
11           # literal 11
(?:\.\d+){3} # a dot followed by 1+ digits, three times
)            # close capture group 1

The rest is done via re.finditer() and a list comprehension.
See a demo on regex101.com.
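
To tie this back to the question (keeping every match and reading match number 4 later), here is a minimal sketch built on the same pattern. The helper name `find_routes` and the variable `fourth` are hypothetical, chosen only for illustration; the sample data is copied from the question.

```python
import re

# Sample data copied from the question.
source_data = """\
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""

# Same pattern as above; re.M makes ^ match at the start of every line.
rx = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})', re.M)


def find_routes(text):
    """Return every captured address as a list, in order of appearance."""
    return [match.group(1) for match in rx.finditer(text)]


matches = find_routes(source_data)

# Lists are 0-indexed, so "match number 4" lives at index 3.
# Guard against shorter input, since the number of lines is unknown.
fourth = matches[3] if len(matches) > 3 else None
print(fourth)           # 11.11.14.0
print(find_routes(""))  # [] (empty input simply yields no matches)
```

A single list is probably easier to work with than separate variables such as ip1, ip2 and so on, because it grows with the input and can be looped over or indexed as needed.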

Upvotes: 3

Ajax1234

Reputation: 71461

You can use re.findall with a positive lookbehind to ensure that the correct address, just after "ip route", is being matched:

import re
s = """
  ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
  ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""
final_ips = re.findall(r'(?<=ip route\s)11[\d\.]+', s)
print(final_ips)

Output:

['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
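
One thing to be aware of: the character class `[\d\.]+` accepts any run of digits and dots, so a first octet such as 110 would also be picked up. The sketch below compares that pattern with a stricter lookbehind variant. The names `loose` and `strict` and the 110.5.5.5 test line are made up for this illustration and are not part of the original data.

```python
import re

# Two made-up lines: the first octet of the first address is 110, not 11.
sample = (
    "ip route 110.5.5.5 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1\n"
    "ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1\n"
)

# Pattern from the answer: "11" followed by any mix of digits and dots.
loose = re.compile(r'(?<=ip route\s)11[\d\.]+')

# Stricter sketch: "11", then exactly three ".<1-3 digits>" groups.
strict = re.compile(r'(?<=ip route\s)11(?:\.\d{1,3}){3}\b')

print(loose.findall(sample))   # ['110.5.5.5', '11.22.33.44']
print(strict.findall(sample))  # ['11.22.33.44']
```

For the data in the question both patterns give the same result; the difference only matters if other first octets beginning with 11 can appear.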

Upvotes: 1

Cory Kramer

Reputation: 117906

You can use re.findall to return all of the matches as a list:

>>> re.findall(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}', SourceData)
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
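
As a possible refinement, the sketch below adds `\b` word boundaries so that an address like 211.1.2.3 does not produce a partial match, and numbers the results from 1 to mirror the ip1 to ip4 naming in the question. The names `pattern`, `ips` and `numbered` are assumptions made for this example, and the 211.1.2.3 line is invented test input.

```python
import re

# Sample data copied from the question.
source_data = """\
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""

# \b keeps "11" from matching in the middle of a longer number.
pattern = re.compile(r'\b11\.\d{1,3}\.\d{1,3}\.\d{1,3}\b')

ips = pattern.findall(source_data)

# Number the matches starting at 1, so "ip4" is the fourth match.
numbered = {f"ip{n}": ip for n, ip in enumerate(ips, start=1)}
print(numbered["ip4"])  # 11.11.14.0

# Invented line: without \b this would yield the partial match '11.1.2.3'.
print(pattern.findall("ip route 211.1.2.3 255.255.255.255"))  # []
```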

Upvotes: 4
