iargue
iargue

Reputation: 69

Python Repeated Capture Groups

I'm attempting to parse a series of SHOW CDP NEIGHBORS DETAIL outputs so I can capture each device and its ip address.

The issue that I am coming across is that some devices may have multiple ip addresses assigned to it, here is an example output.

Device ID: RTPER1.MFN21Mb.domain.local
Entry address(es): 
  IP address: 200.152.51.3
  IP address: 82.159.177.233
  IP address: 201.152.51.140
  IP address: 84.252.100.3
Platform: Cisco 2821,  Capabilities: Router Switch IGMP 

I wrote this regex to capture the input, and according to gskinner it matches all 4 ip addresses, but the capture is just the last one (as expected from regex)

Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)+

So I went online to figure out how to do this. I tried teh regex suggested here Capturing repeating subpatterns in Python regex but using the regex module did not change the output. I still only get the last ip address on the list, and none of the others.

Following the example I get

temp = regex.match(r'Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)+', file)
print temp

Temp returns None.

If I do findall. I get a return of just the last ip address 84.252.100.3

If I add multiple capture groups, such as

temp = regex.findall(r'Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?\s+(?:IP address: ([0-9.]+)\s+)?', file)
print temp

It only matches the ones that have mutliple ip addresses, and not the others

Hopefully someone can help.

Upvotes: 2

Views: 1715

Answers (1)

Andrew Cheong
Andrew Cheong

Reputation: 30273

As far as I'm aware, only .NET allows you to iterate through quantified (repeated) capturing groups. Consider this (finite) alternative:

Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+(?:IP address: ([0-9.]+)\s+)(?:IP address: ([0-9.]+)\s+)?(?:IP address: ([0-9.]+)\s+)?(?:IP address: ([0-9.]+)\s+)?
                                                                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This will capture up 1 IP address in $2 and up to three more in $3, $4, and $5. (I'm using the $ notation idiomatically, of course.) You can add as many as you want. If you need all of the IP addresses to be present in a single group, i.e. $2, then your only choice is to include the text with them:

Device ID: ([0-9A-Za-z-.&]+)\s+Entry address\(es\):\s+((?:IP address: (?:[0-9.]+)\s+)+)
                                                      ^                ^^             ^

Upvotes: 1

Related Questions