Max Powers

Reputation: 1179

Parsing ifconfig output with python

I'm trying to parse the ifconfig output below. I found another example on Stack Overflow that uses this same code to build a nested list with one entry per interface, but when I run it I only get the first match. I would also like to add the RX and TX packet counts to each entry, and I haven't been able to get that working either.

Ifconfig output

Mg0_RSP0_CPU0_0 Link encap:Ethernet  HWaddr 70:e4:22:32:53:42
          inet addr:20.200.130.1  Mask:255.255.0.0
          inet6 addr: fe80::72e4:22ff:fe32:5342/64 Scope:Link
          UP RUNNING NOARP MULTICAST  MTU:1514  Metric:1
          RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:1000
          RX bytes:103741434 (98.9 MiB)  TX bytes:5320623 (5.0 MiB)

Tg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46
          inet addr:13.13.13.1  Mask:255.255.255.0
          inet6 addr: fe80::7aba:f9ff:fe35:6646/64 Scope:Link
          UP RUNNING NOARP MULTICAST  MTU:1514  Metric:1
          RX packets:26 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:1000
          RX bytes:1832 (1.7 KiB)  TX bytes:454625 (443.9 KiB)

Script

import re

# if_config_output holds the ifconfig text shown above.
# Compile the pattern once, outside the loop.
ma = re.compile(r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE|re.DOTALL)

c = []
for paragraph in if_config_output.split('\n\n'):
    result = ma.match(paragraph)
    if result is not None:
        interface = result.group(1)
        ip = result.group(2)
        mask = result.group(3)  # the third group is the netmask, not the MAC

        #print "interface:", interface
        #print "ip:", ip
        #print "mask:", mask

        c.append([interface, ip, mask])

print(c)

Output

In [145]: c
Out[145]: [['Mg0_RSP0_CPU0_0', '1.83.53.27', '255.255.0.0']]

Upvotes: 0

Views: 5192

Answers (2)

user10994671

Reputation: 1

Your benchmark is skewed: regex() calls ma.match(paragraph) twice, so the regex timing includes a redundant second match.

    result = ma.match(paragraph)
    if result:
        result = ma.match(paragraph)

Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 03:13:28)
split 0.569942316
regex 0.643881852
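
For reference, a minimal sketch of regex() with the duplicate call removed. It assumes the same pattern as the benchmark script, and the sample paragraph is shortened from the question's output:

```python
import re

# Same pattern as in the benchmark script, written as a raw string.
ma = re.compile(r"^\s?(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE | re.DOTALL)

def regex(paragraph):
    """Return [interface, ip, mask], or None when the paragraph does not match."""
    result = ma.match(paragraph)  # match once and reuse the result
    if result is None:
        return None
    return list(result.groups())

# Shortened sample taken from the question's ifconfig output.
sample = (
    "Tg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46\n"
    "          inet addr:13.13.13.1  Mask:255.255.255.0\n"
)
print(regex(sample))
```

Whether this changes the ranking against split() depends on the interpreter, so it is worth re-running the timing rather than assuming either way.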

Upvotes: 0

brainovergrow

Reputation: 458

I tested your code and at first got only one result, the second interface:

>>> ['Tg0_0_0_7_0', '13.13.13.1', '255.255.255.0']

Then I looked more closely at your regex. It looks like you might have an extra newline before the second paragraph, as I had before my first, which makes (\S+) fail at the very start of that paragraph. If that is why you are getting a single result, you can fix it by adding \s? to the beginning of your regex:

\s?^(\S+).*?inet addr:(\S+).*?Mask:(\S+)
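
To illustrate why a leading newline matters: re.match only tries the pattern at position 0, so a paragraph that begins with "\n" fails on (\S+) immediately. Below is a small sketch on a shortened sample. Stripping each paragraph before matching is an alternative to adding \s?; it is my own variation rather than part of the original code, and parse is a hypothetical helper name.

```python
import re

# The question's original pattern, without the \s? fix.
pattern = re.compile(r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE | re.DOTALL)

# Shortened sample; note the leading newline from the triple-quoted string.
if_config_output = """
Mg0_RSP0_CPU0_0 Link encap:Ethernet  HWaddr 70:e4:22:32:53:42
          inet addr:20.200.130.1  Mask:255.255.0.0

Tg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46
          inet addr:13.13.13.1  Mask:255.255.255.0
"""

def parse(text):
    """Collect [interface, ip, mask] for every paragraph that matches."""
    rows = []
    for paragraph in text.split("\n\n"):
        # strip() removes the stray leading/trailing whitespace that makes match() fail
        m = pattern.match(paragraph.strip())
        if m:
            rows.append(list(m.groups()))
    return rows

first = if_config_output.split("\n\n")[0]
print(pattern.match(first))     # None: the paragraph starts with "\n"
print(parse(if_config_output))  # both interfaces are found
```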

Or, if all you need is the interface, IP and mask, you could use a simpler and usually faster split.
I'll even timeit, in case someone is curious:

import timeit
import re

if_config_output = """
Mg0_RSP0_CPU0_0 Link encap:Ethernet  HWaddr 70:e4:22:32:53:42
          inet addr:20.200.130.1  Mask:255.255.0.0
          inet6 addr: fe80::72e4:22ff:fe32:5342/64 Scope:Link
          UP RUNNING NOARP MULTICAST  MTU:1514  Metric:1
          RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:1000
          RX bytes:103741434 (98.9 MiB)  TX bytes:5320623 (5.0 MiB)

Tg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46
          inet addr:13.13.13.1  Mask:255.255.255.0
          inet6 addr: fe80::7aba:f9ff:fe35:6646/64 Scope:Link
          UP RUNNING NOARP MULTICAST  MTU:1514  Metric:1
          RX packets:26 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
          collisions:0 txqueuelen:1000
          RX bytes:1832 (1.7 KiB)  TX bytes:454625 (443.9 KiB)
"""

ma = re.compile(r"^\s?(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE|re.DOTALL)

def split(paragraph):
    """ ugly, but faster """
    interface = paragraph.split(" Link ")[0]
    inet_mask = paragraph.split("\n")[1].split(':')
    ip, mask = inet_mask[1], inet_mask[2]
    return [interface, ip, mask]

def regex(paragraph):

    result = ma.match(paragraph)
    if result:
        result = ma.match(paragraph)
        interface = result.group(1)
        ip = result.group(2)
        mask = result.group(3)
        return [interface, ip, mask]

def test_split():
    c = []
    for paragraph in if_config_output.split('\n\n'):
        c.append(split(paragraph))
    return len(c)

def test_regex():
    c = []
    for paragraph in if_config_output.split('\n\n'):
        c.append(regex(paragraph))
    return len(c)

print ("split", timeit.timeit(stmt=test_split, number=100000))
print ("regex", timeit.timeit(stmt=test_regex, number=100000))
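
One caveat before relying on split(): it splits the second line of the paragraph on ':', so for the Tg0 block the ip field comes back as '13.13.13.1  Mask', and a paragraph with a leading newline shifts the line index onto the header line. A hedged sketch of a stricter variant follows. split_fields is a hypothetical name, and the code assumes the inet line is the second line of each paragraph and carries only the addr and Mask fields, as in the sample.

```python
def split_fields(paragraph):
    """Split-based parse that returns clean [interface, ip, mask] values."""
    lines = paragraph.strip().split("\n")
    interface = lines[0].split()[0]   # first token of the header line
    fields = lines[1].split()         # ['inet', 'addr:<ip>', 'Mask:<mask>']
    ip = fields[1].split(":", 1)[1]
    mask = fields[2].split(":", 1)[1]
    return [interface, ip, mask]

# Shortened sample with a leading newline, as in the benchmark string.
sample = (
    "\nTg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46\n"
    "          inet addr:13.13.13.1  Mask:255.255.255.0\n"
)
print(split_fields(sample))
```

The timings reported below were measured with the original split() and regex(), not with this variant.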

results

$ python --version
Python 2.7.3
$ python test.py
('split', 3.096487045288086)
('regex', 5.066282033920288)
$ python3 --version
Python 3.2.3
$ python3 test.py
split 4.155041933059692
regex 4.875624895095825
$ python3 test.py
split 4.787220001220703
regex 5.695119857788086

Anyone with Python 3.5 care to join?

Huh, strangely inconclusive.

results from repl.it/languages/python3 (Python 3.4.0)
split 1.2351078800020332
regex 1.3363793969983817

results from ideone.com (Python 2.7.9) 
('split', 0.9004449844360352)
('regex', 0.7017428874969482)

and from ideone.com (Python 3.4.3+)
split 1.2050538789480925
regex 1.7611852046102285 
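
Neither function above picks up the RX and TX packet counts that the question also asks about. As a rough sketch, and assuming every interface block contains the "RX packets:" and "TX packets:" lines in the order shown in the sample, the pattern can be extended with two more groups. parse_interfaces is a hypothetical helper name, and the sample is shortened from the question's output.

```python
import re

# Interface name, IPv4 address, netmask, then the RX and TX packet counters.
iface_re = re.compile(
    r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)"
    r".*?RX packets:(\d+).*?TX packets:(\d+)",
    re.DOTALL,
)

def parse_interfaces(text):
    """Return one [name, ip, mask, rx_packets, tx_packets] list per interface."""
    rows = []
    for paragraph in text.strip().split("\n\n"):
        m = iface_re.match(paragraph.strip())
        if m:
            name, ip, mask, rx, tx = m.groups()
            rows.append([name, ip, mask, int(rx), int(tx)])
    return rows

sample = """
Mg0_RSP0_CPU0_0 Link encap:Ethernet  HWaddr 70:e4:22:32:53:42
          inet addr:20.200.130.1  Mask:255.255.0.0
          RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
          TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3

Tg0_0_0_7_0 Link encap:Ethernet  HWaddr 78:ba:f9:35:66:46
          inet addr:13.13.13.1  Mask:255.255.255.0
          RX packets:26 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
"""
print(parse_interfaces(sample))
```

Interfaces without an IPv4 address, or with the counters in a different layout, would simply be skipped by this pattern, so it is worth checking the result length against the number of interfaces you expect.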

Upvotes: 1
