Reputation: 219
I am trying to use re.compile inside a for line loop and then run re.match on each line. Using Regex101 I get the correct matches on the groups, but inside Python the match returns None when a group would be an empty string. What I am after is matching on the groups even if they are empty.
The string to match on:
Interface Status Protocol Description
BE1 up up
Mg0/RSP0/CPU0/0 up up NNI to Cat2960x G1/0/1
Te0/0/0/3 admin-down admin-down
Gi0/0/1/0 down down Test L2VPN
RP/0/RSP0/CPU0:LAB-9001-1#
The last group (description) should be optional and can either contain a description or be empty. This works in Regex101 and I have 4 groups on this filter:
^\s*(?:(?P<interface>[a-zA-Z0-9]\S+?))\s+(?:(?P<status>[up|admin\-down]\S+?))\s+(?:(?P<protocol>[up|admin\-down]\S+))\s+(?:(?P<description>(?<!^).*))
In the code I am using compile and match, but if the description is blank the match returns None, when I want it to return the first 3 groups and an empty string for the 4th group (description).
for line in result.splitlines():
    line = line.rstrip()
    p1 = re.compile(r'^\s*(?:(?P<interface>[a-zA-Z0-9]\S+?))\s+(?:(?P<status>[up|admin\-down]\S+?))\s+(?:(?P<protocol>[up|admin\-down]\S+))\s+(?P<description>(?<!^).*)')
    m = p1.match(line).groups()
    print(m)
This will not match on any line where the description is blank. Is there a syntax to tell re.match to include empty groups?
Upvotes: 1
Views: 1457
Reputation: 627119
The regex you use contains character classes instead of grouping constructs ([up|down] does not match up or down, it matches u, p, |, d, o, w or n) and the last pattern part must match an obligatory space+any chars, but you rstrip the line and there is no space left to match.
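Both points can be checked in isolation. The snippet below is only a minimal sketch: the simplified `required` and `optional` patterns are hypothetical stand-ins for the full regex, used here just to show the effect of rstrip on a mandatory trailing \s+.

```python
import re

# A character class matches exactly ONE character from its set.
# [up|down] is the set {u, p, |, d, o, w, n}, not the words "up" or "down".
char_class = re.compile(r'[up|down]')
alternation = re.compile(r'(?:up|down)')

print(char_class.fullmatch('up'))    # None: the class consumes a single character
print(char_class.fullmatch('|'))     # a match: '|' is a literal inside [...]
print(alternation.fullmatch('up'))   # a match
print(alternation.fullmatch('|'))    # None

# After rstrip() there is no trailing whitespace, so a mandatory \s+ before
# the description cannot match. Making "\s+ plus description" optional fixes it.
line = 'BE1 up up   '.rstrip()
required = re.compile(r'\S+\s+\S+\s+\S+\s+(?P<description>.*)')
optional = re.compile(r'\S+\s+\S+\s+\S+(?:\s+(?P<description>.*))?')

print(required.match(line))                       # None
print(optional.match(line).group('description'))  # None: the group did not participate
```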
The fixed regex looks like
^(?P<interface>[a-zA-Z0-9]\S*)\s+(?P<status>up|admin-down)\s+(?P<protocol>up|admin-down)(?:\s+(?P<description>.*))?
Details
^ - start of string
(?P<interface>[a-zA-Z0-9]\S*) - Group "interface": an alphanumeric followed by any 0+ non-whitespace chars
\s+ - 1+ whitespaces
(?P<status>up|admin-down) - Group "status": up or admin-down
\s+ - 1+ whitespaces
(?P<protocol>up|admin-down) - Group "protocol": up or admin-down
(?:\s+(?P<description>.*))? - an optional group:
    \s+ - 1+ whitespaces
    (?P<description>.*) - Group "description": any 0+ chars other than line break chars, as many as possible

In Python, you may use:
import re
result = r"""Interface Status Protocol Description
BE1 up up
Mg0/RSP0/CPU0/0 up up NNI to Cat2960x G1/0/1
Te0/0/0/3 admin-down admin-down
Gi0/0/1/0 down down Test L2VPN
RP/0/RSP0/CPU0:LAB-9001-1#"""
p1 = re.compile(r'(?P<interface>[a-zA-Z0-9]\S*)\s+(?P<status>up|admin-down)\s+(?P<protocol>up|admin-down)(?:\s+(?P<description>.*))?')
for line in result.splitlines():
    line = line.rstrip()
    m = p1.match(line)
    if m:
        print(m.groups())
See the Python demo
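With the optional group, a line without a description yields None for that group rather than the empty string the question asks for. One way to get an empty string is the default argument of groups() or groupdict(). The sketch below makes two assumptions that are not in the answer: the helper name parse_row is hypothetical, and down is added to the alternation because the sample output contains a "down down" row, which up|admin-down alone would skip.

```python
import re

# Assumption: "down" is added to the alternation so the "Gi0/0/1/0 down down"
# row from the sample is also captured; the answer's pattern lists only
# up and admin-down.
ROW = re.compile(
    r'(?P<interface>[a-zA-Z0-9]\S*)\s+'
    r'(?P<status>up|down|admin-down)\s+'
    r'(?P<protocol>up|down|admin-down)'
    r'(?:\s+(?P<description>.*))?'
)

def parse_row(line):
    """Return a dict of the four fields, or None if the line is not a table row."""
    m = ROW.match(line.rstrip())
    # default='' replaces None for groups that did not participate in the match
    return m.groupdict(default='') if m else None

print(parse_row('BE1 up up'))
print(parse_row('Gi0/0/1/0 down down Test L2VPN'))
print(parse_row('Interface Status Protocol Description'))
```

The header line and the prompt line return None here, so the caller can filter them out with a simple truth test.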
Note that the ^ start of string anchor is not necessary if you use re.match.
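The anchoring behaviour can be seen with a short sketch; the one-word pattern is a hypothetical stand-in chosen only to keep the comparison between match and search easy to read.

```python
import re

pattern = re.compile(r'up')

# match() only tries the pattern at position 0 of the string
at_start = pattern.match('up up')
not_at_start = pattern.match('BE1 up up')

# search() scans forward until it finds a match anywhere in the string
found_later = pattern.search('BE1 up up')

print(at_start)       # a match object at position 0
print(not_at_start)   # None
print(found_later)    # a match object further into the string
```

Note that match does not anchor at the end of the string; use fullmatch or a trailing $ when the whole line must be consumed.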
Upvotes: 1