Reputation: 1179
I'm trying to parse this ifconfig output. I saw another example on Stack Overflow that used this same code and produced a nested list. However, when I do the same thing I only get the first match. I would also like to add the RX and TX packets to the list, and that does not seem to work either.
Ifconfig output
Mg0_RSP0_CPU0_0 Link encap:Ethernet HWaddr 70:e4:22:32:53:42
inet addr:20.200.130.1 Mask:255.255.0.0
inet6 addr: fe80::72e4:22ff:fe32:5342/64 Scope:Link
UP RUNNING NOARP MULTICAST MTU:1514 Metric:1
RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3
collisions:0 txqueuelen:1000
RX bytes:103741434 (98.9 MiB) TX bytes:5320623 (5.0 MiB)

Tg0_0_0_7_0 Link encap:Ethernet HWaddr 78:ba:f9:35:66:46
inet addr:13.13.13.1 Mask:255.255.255.0
inet6 addr: fe80::7aba:f9ff:fe35:6646/64 Scope:Link
UP RUNNING NOARP MULTICAST MTU:1514 Metric:1
RX packets:26 errors:0 dropped:0 overruns:0 frame:0
TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
collisions:0 txqueuelen:1000
RX bytes:1832 (1.7 KiB) TX bytes:454625 (443.9 KiB)
Script
import re

# Compile the pattern once, outside the loop.
ma = re.compile(r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE | re.DOTALL)

c = []
for paragraph in if_config_output.split('\n\n'):
    result = ma.match(paragraph)
    if result is not None:
        interface = result.group(1)
        ip = result.group(2)
        mask = result.group(3)  # third group is the netmask, not the MAC
        # print("interface:", interface)
        # print("ip:", ip)
        # print("mask:", mask)
        c.append([interface, ip, mask])
print(c)
In [145]: c
Out[145]: [['Mg0_RSP0_CPU0_0', '1.83.53.27', '255.255.0.0']]
Upvotes: 0
Views: 5192
Reputation: 1
Your measurement is skewed: the regex function calls ma.match(paragraph) twice, so the regex timing includes a redundant match.
result = ma.match(paragraph)
if result:
    result = ma.match(paragraph)
Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 03:13:28)
split 0.569942316
regex 0.643881852
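As a minimal sketch of what the function could look like with the duplicate call removed: `regex_once` and the sample `paragraph` are illustrative names that are not part of the original benchmark, and the pattern is assumed to be the same one being timed. Taking the best of several `timeit.repeat` runs is one common way to reduce noise, not necessarily how the numbers above were produced.

```python
import re
import timeit

# Assumed to be the same pattern as in the benchmark under discussion.
ma = re.compile(r"^\s?(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE | re.DOTALL)

# Sample paragraph copied from the question's ifconfig output.
paragraph = (
    "Tg0_0_0_7_0 Link encap:Ethernet HWaddr 78:ba:f9:35:66:46\n"
    "inet addr:13.13.13.1 Mask:255.255.255.0\n"
)


def regex_once(paragraph):
    """Match a single time and reuse the result (hypothetical helper)."""
    result = ma.match(paragraph)
    if result:
        return [result.group(1), result.group(2), result.group(3)]
    return None


parsed = regex_once(paragraph)
print(parsed)

# Best of a few runs is less sensitive to background load than a single run.
best = min(timeit.repeat(lambda: regex_once(paragraph), number=10000, repeat=3))
print("regex_once", best)
```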
Upvotes: 0
Reputation: 458
I tested your code and at first got only one result, the second one:
>>> ['Tg0_0_0_7_0', '13.13.13.1', '255.255.255.0']
Then I looked closely at your regex. It appears that you might have an extra newline before the second paragraph, like I had before my first. Because match() only tries to match at the very start of the string, a leading newline makes ^(\S+) fail there. If that is why you are getting a single result, you could fix it by adding \s? to the beginning of your regex:
\s?^(\S+).*?inet addr:(\S+).*?Mask:(\S+)
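To make the effect concrete, here is a small sketch under the assumption that the paragraph really does start with one stray newline. The names `strict`, `lenient`, and `paragraph` are made up for illustration, and the sample text is trimmed from the question's output.

```python
import re

flags = re.MULTILINE | re.DOTALL
strict = re.compile(r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)", flags)
lenient = re.compile(r"\s?^(\S+).*?inet addr:(\S+).*?Mask:(\S+)", flags)

# A paragraph that starts with a stray newline (assumed input shape).
paragraph = (
    "\nTg0_0_0_7_0 Link encap:Ethernet HWaddr 78:ba:f9:35:66:46\n"
    "inet addr:13.13.13.1 Mask:255.255.255.0\n"
)

# match() only tries position 0, where "\n" cannot satisfy \S+.
no_match = strict.match(paragraph)

# Allowing one optional whitespace character first lets match() succeed.
with_prefix = lenient.match(paragraph)

# Alternatives: strip the paragraph first, or use search(), which scans
# forward and (with re.MULTILINE) lets ^ match right after the newline.
stripped = strict.match(paragraph.strip())
searched = strict.search(paragraph)

print(no_match)  # None
print(with_prefix.groups())
print(stripped.groups())
print(searched.groups())
```

Note that \s? tolerates only a single leading whitespace character, so stripping the paragraph or using search() may be more robust if the amount of leading whitespace varies.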
Or, if you only need simple interface and IP retrieval, you could use a simpler and faster split.
I'll even time both with timeit, in case anyone is curious:
import timeit
import re

if_config_output = """
Mg0_RSP0_CPU0_0 Link encap:Ethernet HWaddr 70:e4:22:32:53:42
inet addr:20.200.130.1 Mask:255.255.0.0
inet6 addr: fe80::72e4:22ff:fe32:5342/64 Scope:Link
UP RUNNING NOARP MULTICAST MTU:1514 Metric:1
RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3
collisions:0 txqueuelen:1000
RX bytes:103741434 (98.9 MiB) TX bytes:5320623 (5.0 MiB)

Tg0_0_0_7_0 Link encap:Ethernet HWaddr 78:ba:f9:35:66:46
inet addr:13.13.13.1 Mask:255.255.255.0
inet6 addr: fe80::7aba:f9ff:fe35:6646/64 Scope:Link
UP RUNNING NOARP MULTICAST MTU:1514 Metric:1
RX packets:26 errors:0 dropped:0 overruns:0 frame:0
TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
collisions:0 txqueuelen:1000
RX bytes:1832 (1.7 KiB) TX bytes:454625 (443.9 KiB)
"""

ma = re.compile(r"^\s?(\S+).*?inet addr:(\S+).*?Mask:(\S+)", re.MULTILINE | re.DOTALL)


def split(paragraph):
    """Ugly, but faster."""
    # Strip first so a leading newline does not shift the line indexes.
    lines = paragraph.strip().split("\n")
    interface = lines[0].split(" Link ")[0]
    inet_mask = lines[1].split(":")
    # inet_mask[1] looks like "20.200.130.1 Mask"; keep only the address.
    ip, mask = inet_mask[1].split()[0], inet_mask[2]
    return [interface, ip, mask]


def regex(paragraph):
    result = ma.match(paragraph)
    if result:
        result = ma.match(paragraph)
        interface = result.group(1)
        ip = result.group(2)
        mask = result.group(3)
        return [interface, ip, mask]


def test_split():
    c = []
    for paragraph in if_config_output.split('\n\n'):
        c.append(split(paragraph))
    return len(c)


def test_regex():
    c = []
    for paragraph in if_config_output.split('\n\n'):
        c.append(regex(paragraph))
    return len(c)


print("split", timeit.timeit(stmt=test_split, number=100000))
print("regex", timeit.timeit(stmt=test_regex, number=100000))
Results:
$ python --version
Python 2.7.3
$ python test.py
('split', 3.096487045288086)
('regex', 5.066282033920288)
$ python3 --version
Python 3.2.3
$ python3 test.py
split 4.155041933059692
regex 4.875624895095825
$ python3 test.py
split 4.787220001220703
regex 5.695119857788086
Anyone with Python 3.5 care to join?
Huh, strangely inconclusive.
Results from repl.it/languages/python3 (Python 3.4.0):
split 1.2351078800020332
regex 1.3363793969983817
Results from ideone.com (Python 2.7.9):
('split', 0.9004449844360352)
('regex', 0.7017428874969482)
And from ideone.com (Python 3.4.3+):
split 1.2050538789480925
regex 1.7611852046102285
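The question also asks for the RX and TX packet counts, which none of the snippets above capture. One possible approach, sketched here under the assumption that every paragraph contains "RX packets:" and "TX packets:" lines after the inet line, is to extend the same pattern with two more groups. The names `pattern` and `parse_interfaces` are hypothetical, and the sample input is shortened from the question's output.

```python
import re

# Shortened sample based on the question's ifconfig output.
if_config_output = """Mg0_RSP0_CPU0_0 Link encap:Ethernet HWaddr 70:e4:22:32:53:42
inet addr:20.200.130.1 Mask:255.255.0.0
RX packets:147918 errors:0 dropped:0 overruns:0 frame:0
TX packets:119226 errors:0 dropped:0 overruns:0 carrier:3

Tg0_0_0_7_0 Link encap:Ethernet HWaddr 78:ba:f9:35:66:46
inet addr:13.13.13.1 Mask:255.255.255.0
RX packets:26 errors:0 dropped:0 overruns:0 frame:0
TX packets:5058 errors:0 dropped:0 overruns:0 carrier:3
"""

# Original three groups plus the RX and TX packet counters.
pattern = re.compile(
    r"^(\S+).*?inet addr:(\S+).*?Mask:(\S+)"
    r".*?RX packets:(\d+).*?TX packets:(\d+)",
    re.MULTILINE | re.DOTALL,
)


def parse_interfaces(text):
    """Return one [interface, ip, mask, rx, tx] list per paragraph."""
    rows = []
    for paragraph in text.strip().split("\n\n"):
        # Strip so stray leading whitespace does not defeat match().
        m = pattern.match(paragraph.strip())
        if m:
            interface, ip, mask, rx, tx = m.groups()
            rows.append([interface, ip, mask, int(rx), int(tx)])
    return rows


interfaces = parse_interfaces(if_config_output)
print(interfaces)
```

Paragraphs that lack any of the five fields are skipped rather than partially filled, which may or may not be what you want for interfaces without an IPv4 address.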
Upvotes: 1