Reputation: 231
I want to extract the string in the Description column for each line of the following table. Since the string I am after contains spaces and the columns are also delimited by spaces, I am not sure how to parse the right field from each line.
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down         0      Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down         0      Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down         0      Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down         0      Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down         0      Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down         0      Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down         0      Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down         0      Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down         0      Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down         0      Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up           1000   Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
Upvotes: 1
Views: 72
Reputation: 57
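Assuming you have each line as a string and the columns are separated by at least two spaces, split on runs of two or more whitespace characters and take the last field:
>>> import re
>>> s = "vmnic0   0000:3d:00.0  i40en   Up            Down         0      Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+"
>>> row = re.split(r"\s{2,}", s)
>>> description = row[-1]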
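To apply the same split to the whole table rather than a single line, something like the following sketch may help. It assumes the first two non-blank lines are the header and the dashed separator, and that the columns are still separated by at least two spaces; `extract_descriptions` and `sample` are illustrative names, not part of the question.

```python
import re

def extract_descriptions(table_text):
    """Return the last column of every data row in the table."""
    # Drop blank lines, then skip the header row and the dashed separator row.
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    descriptions = []
    for line in lines[2:]:
        # Columns are separated by two or more whitespace characters,
        # so single spaces inside the description are left alone.
        descriptions.append(re.split(r"\s{2,}", line.strip())[-1])
    return descriptions

sample = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down         0      Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic9   0000:d9:00.0  ixgben  Up            Up           1000   Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
"""

for description in extract_descriptions(sample):
    print(description)
```

This only works while the original spacing is preserved; if the text has been collapsed to single spaces, the split returns the whole line as one field.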
Upvotes: 1
Reputation: 519
Using pandas:
from io import StringIO
import pandas as pd
TESTDATA = StringIO("""
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down         0      Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic1   0000:3d:00.1  i40en   Up            Down         0      Half    00:00:00:00:03:15  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic10  0000:d9:00.1  ixgben  Up            Down         0      Half    a0:36:9f:d9:b9:11  1500  Intel(R) Ethernet Controller 10G X550
vmnic11  0000:01:00.0  i40en   Up            Down         0      Half    3c:fd:fe:a9:4e:b8  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
vmnic2   0000:00:1f.6  ne1000  Up            Down         0      Half    88:88:88:88:87:88  1500  Intel Corporation Ethernet Connection (3) I219-LM
vmnic3   0000:3d:00.2  i40en   Up            Down         0      Half    00:00:00:00:03:16  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic4   0000:3d:00.3  i40en   Up            Down         0      Half    00:00:00:00:03:17  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic5   0000:18:00.0  ixgben  Up            Down         0      Half    90:e2:ba:37:50:a8  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic6   0000:18:00.1  ixgben  Up            Down         0      Half    90:e2:ba:37:50:a9  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic7   0000:81:00.0  ixgben  Up            Up           10000  Full    90:e2:ba:1e:b6:24  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic8   0000:81:00.1  ixgben  Up            Down         0      Half    90:e2:ba:1e:b6:25  1500  Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection
vmnic9   0000:d9:00.0  ixgben  Up            Up           1000   Full    a0:36:9f:d9:b9:10  1500  Intel(R) Ethernet Controller 10G X550
""")
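# Split on runs of 2+ whitespace characters; a regex separator needs the python engine.
# .iloc[1:] drops the dashed separator row that follows the header.
df = pd.read_csv(TESTDATA, sep=r"\s{2,}", engine="python").iloc[1:]
descriptions = df['Description'].tolist()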
And the output:
['Intel(R) Ethernet Connection X722 for 10GbE SFP+',
'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
'Intel(R) Ethernet Controller 10G X550',
'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
'Intel(R) Ethernet Controller XXV710 for 25GbE SFP28',
'Intel Corporation Ethernet Connection (3) I219-LM',
'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
'Intel(R) Ethernet Connection X722 for 10GbE SFP+',
'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
'Intel Corporation 82599EB 10-Gigabit SFI/SFP+ Network Connection',
'Intel(R) Ethernet Controller 10G X550']
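If the fixed-width alignment of the table is intact, the dashed row can also tell you where the Description column starts, with no need for pandas. The following is a minimal sketch under that assumption (it will cut text in the wrong place if the alignment has been lost); `description_column` and `sample` are illustrative names.

```python
import re

def description_column(table_text):
    """Slice the last column out of each row using the dashed separator row."""
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    ruler, rows = lines[1], lines[2:]
    # Each run of dashes marks one column; the last run belongs to Description.
    start = [m.start() for m in re.finditer(r"-+", ruler)][-1]
    # Everything from that offset to the end of the line is the description.
    return [row[start:].strip() for row in rows]

sample = """\
Name     PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
-------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ----------------------------------------------------------------
vmnic0   0000:3d:00.0  i40en   Up            Down         0      Half    00:00:00:00:03:14  1500  Intel(R) Ethernet Connection X722 for 10GbE SFP+
vmnic12  0000:01:00.1  i40en   Up            Up           10000  Full    3c:fd:fe:a9:4e:b9  1500  Intel(R) Ethernet Controller XXV710 for 25GbE SFP28
"""

for description in description_column(sample):
    print(description)
```

Slicing by offset does not care how many spaces appear inside a description, which is the main reason to consider it over a regex split.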
Upvotes: 1
Reputation: 141
It seems your delimiter is "two or more spaces". The regular expression for that is \s{2,} (strictly, two or more whitespace characters).
So for each line here, description = re.split(r'\s{2,}', line)[-1]
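If the lines reach you with the column spacing already collapsed to single spaces, the two-space rule no longer applies. A possible fallback, sketched here on the assumption that the nine columns before Description never contain whitespace in a data row (which holds for the rows shown in the question), is to split a fixed number of times and keep the remainder. `description_by_position` is an illustrative name.

```python
def description_by_position(line, leading_fields=9):
    """Return everything after the first `leading_fields` whitespace-separated fields."""
    # str.split with maxsplit stops after `leading_fields` splits, so the
    # description stays in one piece, internal spaces included.
    parts = line.split(maxsplit=leading_fields)
    if len(parts) <= leading_fields:
        return ""  # the row has no description field
    return parts[-1].rstrip()

line = ("vmnic2 0000:00:1f.6 ne1000 Up Down 0 Half 88:88:88:88:87:88 1500 "
        "Intel Corporation Ethernet Connection (3) I219-LM")
print(description_by_position(line))
```

Because `split()` without a separator treats any run of whitespace as one delimiter, the same call also works on the fully aligned rows. It must not be applied to the header row, where "PCI Device", "Admin Status", and similar names contain spaces.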
Upvotes: 1