Reputation: 5468
I have a string:
unit 3 {
arp-options {
aging-timer 5;
}
family inet4 {
address 2.33.1.2/255.255.255.0;
address 2.33.2.2/255.255.255.0;
address 2.33.3.2/255.255.255.0;
address 2.33.4.2/255.255.255.0;
}
}
I want to extract the IPv4 addresses ONLY under the family inet4
section. I can use the regex \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
to match IP addresses, but how can I get all the IP addresses in one shot?
Upvotes: 0
Views: 306
Reputation: 133730
With your shown samples, please try the following.
##importing re library of python here.
import re
##Creating variable val which has OP's shown value here.
val="""unit 3 {
arp-options {
aging-timer 5;
}
family inet4 {
address 2.33.1.2/255.255.255.0;
address 2.33.2.2/255.255.255.0;
address 2.33.3.2/255.255.255.0;
address 2.33.4.2/255.255.255.0;
}
}"""
##Creating val1, which captures everything from "family inet4 {" up to its closing }.
val1=re.findall(r"(family inet4 \{\n(?:.*;\n)+\s*\})",val)
##Creating regex2 with re.compile; it captures the IP address before the / and the netmask after it.
regex2=re.compile(r"(\d{1,3}(?:\.\d{1,3}){3})/(\d{1,3}(?:\.\d{1,3}){3})")
##Using findall on the joined val1 matches to get all IP addresses and netmasks.
print(regex2.findall("".join(val1)))
[('2.33.1.2', '255.255.255.0'), ('2.33.2.2', '255.255.255.0'), ('2.33.3.2', '255.255.255.0'), ('2.33.4.2', '255.255.255.0')]
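Since the result above is a list of (address, netmask) tuples, the addresses alone can be pulled out with a list comprehension. The following is only a small sketch, not part of the original answer: the pairs list is sample data copied from the output shape above, and the conversion to prefix-length form through the standard library ipaddress module is an optional extra.

```python
import ipaddress

# Sample (address, netmask) tuples in the shape shown in the output above.
pairs = [('2.33.1.2', '255.255.255.0'), ('2.33.2.2', '255.255.255.0')]

# Keep only the interface addresses and drop the netmasks.
addresses = [address for address, _netmask in pairs]

# Optionally express each pair in prefix-length (CIDR) form.
cidr = [
    ipaddress.ip_interface(f"{address}/{netmask}").with_prefixlen
    for address, netmask in pairs
]

print(addresses)
print(cidr)
```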
Upvotes: 1
Reputation: 627341
You can, if you use the PyPI regex module:
import regex
text = """unit 3 {
arp-options {
aging-timer 5;
}
family inet4 {
address 2.33.1.2/255.255.255.0;
address 2.33.2.2/255.255.255.0;
address 2.33.3.2/255.255.255.0;
address 2.33.4.2/255.255.255.0;
}
}"""
matches = [x.captures(1) for x in regex.finditer(r"family inet4\s*{(?:\s*address\s+([\d./]+);)*\s*}", text)]
print([x for l in matches for x in l])
## => ['2.33.1.2/255.255.255.0', '2.33.2.2/255.255.255.0', '2.33.3.2/255.255.255.0', '2.33.4.2/255.255.255.0']
See an online Python demo.
The family inet4\s*{(?:\s*address\s+([\d./]+);)*\s*} regex matches:
- family inet4 - a literal string
- \s* - zero or more whitespaces
- { - a { char
- (?:\s*address\s+([\d./]+);)* - zero or more occurrences of
  - \s*address\s+ - zero or more whitespaces, the word address, one or more whitespaces
  - ([\d./]+) - Group 1: one or more digits, ., or / chars
  - ; - a ; char
- \s* - zero or more whitespaces
- } - a } char.
With the standard re module in Python, you can still use a single family inet4\s*{([^{}]*)} regex and some more post-processing to get the same:
import re
text = "STRING_HERE"  # placeholder for the configuration text shown above
m = re.search(r"family inet4\s*{([^{}]*)}", text)
if m:
    res = [x.strip().split()[-1].strip(';') for x in m.group(1).strip().splitlines()]
    print(res)
See this Python demo. Here, family inet4\s*{([^{}]*)} matches family inet4, zero or more whitespaces, and a {, then captures zero or more characters other than { and } into Group 1, and finally matches a }.
Then, leading and trailing whitespace is stripped from the captured text, the text is split into lines, and each line is processed with x.strip().split()[-1].strip(';'): the line is stripped of whitespace, split on whitespace, the last item is taken (the IP here), and the trailing ; is stripped from that value.
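The re-based steps above can be put together as a small self-contained sketch. The helper name extract_inet4_addresses and the shortened SAMPLE text are illustrative assumptions, not part of the original answer; the braces are escaped here only for readability.

```python
import re

# Shortened sample configuration (assumed indentation, two addresses only).
SAMPLE = """unit 3 {
    arp-options {
        aging-timer 5;
    }
    family inet4 {
        address 2.33.1.2/255.255.255.0;
        address 2.33.2.2/255.255.255.0;
    }
}"""

def extract_inet4_addresses(text):
    """Return the address/mask values inside the first family inet4 block."""
    m = re.search(r"family inet4\s*\{([^{}]*)\}", text)
    if not m:
        return []
    # Each non-empty line looks like "address <value>;":
    # take the last whitespace-separated token and drop the trailing ";".
    return [
        line.split()[-1].rstrip(";")
        for line in m.group(1).splitlines()
        if line.strip()
    ]

print(extract_inet4_addresses(SAMPLE))
```

Returning an empty list when the block is missing keeps the caller from having to check for None.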
Upvotes: 1
Reputation: 522571
I would extract the family inet4 section and then use re.findall:
import re
inp = """unit 3 {
arp-options {
aging-timer 5;
}
family inet4 {
address 2.33.1.2/255.255.255.0;
address 2.33.2.2/255.255.255.0;
address 2.33.3.2/255.255.255.0;
address 2.33.4.2/255.255.255.0;
}
}"""
inet = re.findall(r'\bfamily inet4 \{\s+(.*?)\s+\}', inp, flags=re.DOTALL)
ip_addresses = re.findall(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b', inet[0])
print(ip_addresses)
This prints:
['2.33.1.2', '255.255.255.0', '2.33.2.2', '255.255.255.0',
'2.33.3.2', '255.255.255.0', '2.33.4.2', '255.255.255.0']
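Note that the list above contains the netmasks as well as the addresses. If only the interface addresses are wanted, one possible variation is to anchor the second pattern on the address keyword and stop at the slash. This is a sketch under assumptions rather than part of the original answer: the function name inet4_addresses and the shortened sample are made up for illustration, and ipaddress.ip_address from the standard library is used to reject candidates with out-of-range octets such as 999.1.1.1.

```python
import ipaddress
import re

# Shortened sample configuration (assumed indentation, two addresses only).
inp = """unit 3 {
    family inet4 {
        address 2.33.1.2/255.255.255.0;
        address 2.33.2.2/255.255.255.0;
    }
}"""

def inet4_addresses(config):
    """Return the addresses (without netmasks) in the first family inet4 block."""
    block = re.search(r'\bfamily inet4\s*\{(.*?)\}', config, flags=re.DOTALL)
    if block is None:
        return []
    # Capture only the dotted quad between the "address" keyword and the "/".
    candidates = re.findall(
        r'\baddress\s+(\d{1,3}(?:\.\d{1,3}){3})/', block.group(1)
    )
    valid = []
    for candidate in candidates:
        try:
            ipaddress.ip_address(candidate)  # raises ValueError if not a valid IP
        except ValueError:
            continue
        valid.append(candidate)
    return valid

print(inet4_addresses(inp))
```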
Upvotes: 2