Reputation: 53
I'm a network engineer, trying to dip my toes into programming. I got recommended to try Python.
What I'm trying to do is save some specific data by matching a multi-line string against a regexp. The data to work with is stored in SourceData.
SourceData = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1"""
The number of lines stored in SourceData is always unknown: it could be anything from 0 lines (empty) to an unlimited number of lines.
I want to match all lines containing IPv4 addresses starting with 11.
This is what I've come up with as a start:
import re
ip1 = re.search(r'11\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}', SourceData)
if ip1:
    ip1 = ip1.group()
Verify:
>>> print(ip1)
11.22.33.44
OK, seems to work. The idea is that once the whole of SourceData has been searched, the final result for the example provided would be 4 matches:
ip1 = 11.22.33.44
ip2 = 11.11.12.11
ip3 = 11.11.13.11
ip4 = 11.11.14.0
Next thing to learn: how do I continue checking SourceData for more matches as described above, and how do I store the multiple matches for use later in the code? For example, later in the code I would like to use the value from a specific match, let's say match number 4 (11.11.14.0).
I have read some guides on Python and regex, but it seems I don't quite understand them :)
Upvotes: 5
Views: 1389
Reputation: 43169
Several methods, one of them being:
import re
string = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""
rx = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})', re.M)
lines = [match.group(1) for match in rx.finditer(string)]
print(lines)
This yields:
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
^ # match start of the line (re.M makes ^ match at every line)
[^\d\n]* # NOT a digit or a newline, 0+ times
( # start of capture group 1
11 # 11
(?:\.\d+){3} # a dot followed by digits, three times
) # end of capture group 1
The rest is done via re.finditer() and a list comprehension.
See a demo on regex101.com.
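To illustrate why the pattern is anchored at the start of the line, here is a minimal sketch comparing it with an unanchored pattern. The sample lines are hypothetical (they are not in the question): one has a next hop starting with 11, another has a destination of 211.5.5.5. The names sample, unanchored, anchored, loose and strict are made up for this example.

```python
import re

# Hypothetical sample, not from the question: the second line has a next hop
# starting with 11, and the third line has a destination of 211.5.5.5.
sample = """ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 11.1.1.1
ip route 211.5.5.5 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
"""

# Unanchored: matches "11." followed by three octets anywhere in the text.
unanchored = re.compile(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}')
# Anchored: only the first number on each line may start the match.
anchored = re.compile(r'^[^\d\n]*(11(?:\.\d+){3})', re.M)

loose = unanchored.findall(sample)
strict = [m.group(1) for m in anchored.finditer(sample)]

print(loose)   # ['11.22.33.44', '11.1.1.1', '11.5.5.5']
print(strict)  # ['11.22.33.44']
```

With the data from the question both patterns give the same four addresses; the difference only shows up if a next hop or a longer first octet happens to contain "11.".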
Upvotes: 3
Reputation: 71461
You can use re.findall with a positive lookbehind to ensure that the correct address, the one just after "ip route", is being matched:
import re
s = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""
final_ips = re.findall(r'(?<=ip route\s)11[\d\.]+', s)
print(final_ips)
Output:
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
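Two things are worth knowing about the lookbehind approach: Python's re module only accepts fixed-width lookbehinds, so it cannot absorb a variable amount of whitespace, and 11[\d\.]+ also accepts a first octet such as 111. A possible variation, sketched below, uses a capture group instead of a lookbehind. The sample lines and the names sample, lookbehind and grouped are hypothetical and only serve to show the difference.

```python
import re

# Hypothetical sample (not from the question): two spaces after "ip route"
# on the first line and a destination of 111.2.3.4 on the second.
sample = """ip route  11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 111.2.3.4 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
"""

# Lookbehind version: exactly one whitespace character, loose octet check.
lookbehind = re.findall(r'(?<=ip route\s)11[\d\.]+', sample)

# Capture-group version: any amount of whitespace, first octet must be
# exactly 11, followed by three dot-separated groups of 1 to 3 digits.
# With one capture group, findall returns only the captured text.
grouped = re.findall(r'^ip route\s+(11(?:\.\d{1,3}){3})\b', sample, re.M)

print(lookbehind)  # ['111.2.3.4', '11.11.14.0']
print(grouped)     # ['11.22.33.44', '11.11.14.0']
```

On the data from the question both versions return the same four addresses.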
Upvotes: 1
Reputation: 117906
You can use re.findall to return all of the matches:
>>> re.findall(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}', SourceData)
['11.22.33.44', '11.11.12.11', '11.11.13.11', '11.11.14.0']
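Since the question also asks how to store the matches and use one of them later, here is a minimal sketch of that part. Instead of separate variables ip1, ip2 and so on, the list returned by findall can be kept as it is and indexed. The names pattern, matches and fourth are just illustrative.

```python
import re

# Sample data copied from the question.
SourceData = """
ip route 22.22.22.22 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 33.33.33.33 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.22.33.44 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.12.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.13.11 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 11.11.14.0 255.255.255.255 TenGigabitEthernet0/1/0 1.1.1.1
ip route 44.44.44.0 255.255.255.0 TenGigabitEthernet0/1/0 1.1.1.1
"""

pattern = re.compile(r'11\.\d{1,3}\.\d{1,3}\.\d{1,3}')

# findall returns a plain list of strings, one per match.
matches = pattern.findall(SourceData)
print(len(matches))  # 4

# Lists are zero-indexed, so "match number 4" lives at index 3.
# Guard against shorter lists, since the number of lines is unknown.
fourth = matches[3] if len(matches) > 3 else None
print(fourth)  # 11.11.14.0

# Empty input simply gives an empty list, so no special case is needed.
print(pattern.findall(""))  # []
```

Looping with "for ip in matches:" works the same way whether there are zero matches or thousands.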
Upvotes: 4