Reputation: 23
I have a list of strings that I pulled from a text file. I need to read each line and "select" two specific parts. Here is an example line from the text file (firewall report):
2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 duration 0:00:00 bytes 2093 TCP FINs
I need to save the IP address that comes after "Workstations:" as a workstation IP, and likewise save the address that comes after "Servers:" as a server IP.
I imagine the best technique would be to create two lists, one for workstation IPs and one for server IPs, and read each line and write the IPs to their respective lists.
But in order to do that I need to select them, which I might do like this:
workstationIPs = []
serverIPs = []
for line in report:
    workstationIPs.append(line[a:b])
    serverIPs.append(line[c:d])
With 'a' being the start of the workstation IP and 'b' being the end (and 'c' and 'd' relating to server IPs).
However, not all the lines are the same length, so that method of selection won't work. Does anyone have any ideas on how to extract those two strings from the line?
PS: This is my first question, so please let me know of any errors and I can resubmit it. Thanks!
Upvotes: 2
Views: 226
Reputation: 2556
You can use str.partition to split the string up and get the parts you want:
workstation_ip = line.partition('Workstations:')[2].partition('/')[0]
server_ip = line.partition('Servers:')[2].partition('/')[0]
To avoid repetition, make a function:
def between(line, preceding, following):
    return line.partition(preceding)[2].partition(following)[0]
...
workstation_ip = between(line, 'Workstations:', '/')
server_ip = between(line, 'Servers:', '/')
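For completeness, here is a rough sketch of how `between` might be dropped into the loop from the question. The `report` list and its second, non-matching line are made up for illustration. It relies on the fact that `str.partition` returns an empty tail when the separator is not found, so `between` returns an empty string for lines that lack the label.

```python
def between(line, preceding, following):
    # partition returns (head, sep, tail); tail is '' if the separator is missing
    return line.partition(preceding)[2].partition(following)[0]

# Hypothetical report: the sample line from the question plus one made-up
# line that contains neither label.
report = [
    "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
    "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
    "Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 "
    "duration 0:00:00 bytes 2093 TCP FINs",
    "a line with no matching fields",
]

workstation_ips = []
server_ips = []
for line in report:
    workstation_ip = between(line, 'Workstations:', '/')
    server_ip = between(line, 'Servers:', '/')
    # Skip lines where either label was not found (between returned '')
    if workstation_ip and server_ip:
        workstation_ips.append(workstation_ip)
        server_ips.append(server_ip)
```

One caveat: if a label is present but no '/' follows it, `between` returns everything after the label, so this assumes every address is followed by a port.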
Upvotes: 1
Reputation: 8786
This is one way you could do it, using split and a list comprehension:
line = "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs"
# Filter on the label alone so this works with or without the "**" markers
workstationIPs = [item.split(':')[1].replace("**", "").split("/")[0] for item in line.split(' ') if "Workstations:" in item]
serverIPs = [item.split(':')[1].replace("**", "").split("/")[0] for item in line.split(' ') if "Servers:" in item]
print(workstationIPs)
print(serverIPs)
Or with regex and list comp:
import re
line = "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs"
# Filter on the label alone so this works with or without the "**" markers
workstationIPs = [re.findall(r'[0-9]+(?:\.[0-9]+){3}', item)[0] for item in line.split(' ') if "Workstations:" in item]
serverIPs = [re.findall(r'[0-9]+(?:\.[0-9]+){3}', item)[0] for item in line.split(' ') if "Servers:" in item]
print(workstationIPs)
print(serverIPs)
Both yield:
['192.168.2.85']
['192.168.1.6']
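As a variation on the regex version, both addresses could probably be pulled out in a single pass by capturing the label together with the IP. This is only a sketch: the `FIELD` pattern and the `fields` dict are names introduced here for illustration, not part of the code above.

```python
import re

# Illustrative pattern: a label, a colon, then a dotted quad.
FIELD = re.compile(r'(Workstations|Servers):(\d{1,3}(?:\.\d{1,3}){3})')

line = ("2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
        "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
        "**Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** "
        "duration 0:00:00 bytes 2093 TCP FINs")

# With two capture groups, findall returns one (label, ip) tuple per match,
# so dict() turns the result into a label -> ip mapping.
fields = dict(FIELD.findall(line))
workstation_ip = fields.get('Workstations')  # None if the label is absent
server_ip = fields.get('Servers')
```

Because the pattern is searched anywhere in the line, it does not matter whether the "**" markers are present, and the bare 192.168.1.1 near the start is ignored since it has no label in front of it.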
Upvotes: 0
Reputation: 931
If every line has the same number of whitespace-separated fields before the addresses, you could try this, which splits on whitespace, strips the asterisks, and takes the content between the first colon and the slash:
workstationIPs = []
serverIPs = []
for line in report:
    items = line.split()
    # Fields 14 and 16 look like "Workstations:192.168.2.85/1440"
    workstationIPs.append(items[14].strip('*').split(':')[1].split('/')[0])
    serverIPs.append(items[16].strip('*').split(':')[1].split('/')[0])
Upvotes: 0
Reputation: 5515
Use a regex:
import re
workstationIPs = []
serverIPs = []
for line in report:
    workstationIPs.append(re.search(r'Workstations:((?:\d{1,3}\.){3}\d{1,3})', line).group(1))
    serverIPs.append(re.search(r'Servers:((?:\d{1,3}\.){3}\d{1,3})', line).group(1))
Example:
>>> s = '2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: %ASA-session-6-302014: Teardown TCP connection 41997800 for **Workstations:192.168.2.85/1440** to **Servers:192.168.1.6/43032** duration 0:00:00 bytes 2093 TCP FINs'
>>> re.search(r'Workstations:((?:\d{1,3}\.){3}\d{1,3})',s).group(1)
'192.168.2.85'
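Note that `re.search` returns `None` when a line does not match, so calling `.group(1)` directly raises an `AttributeError` on such lines. If the report might contain other message types, a guarded loop along these lines may be safer. The `report` contents and the pattern names are assumptions made for the sake of the example.

```python
import re

# Same dotted-quad pattern as above, compiled once and reused for every line.
IP = r'((?:\d{1,3}\.){3}\d{1,3})'
WORKSTATION_RE = re.compile(r'Workstations:' + IP)
SERVER_RE = re.compile(r'Servers:' + IP)

# Hypothetical report: the sample line plus a made-up line of another type.
report = [
    "2011-04-13 08:52:55 Local4.Info 192.168.1.1 :Apr 13 08:52:55 PDT: "
    "%ASA-session-6-302014: Teardown TCP connection 41997800 for "
    "Workstations:192.168.2.85/1440 to Servers:192.168.1.6/43032 "
    "duration 0:00:00 bytes 2093 TCP FINs",
    "2011-04-13 08:53:01 Local4.Info 192.168.1.1 some other message",
]

workstationIPs = []
serverIPs = []
for line in report:
    workstation = WORKSTATION_RE.search(line)
    server = SERVER_RE.search(line)
    # search returns None when there is no match; skip those lines
    if workstation is None or server is None:
        continue
    workstationIPs.append(workstation.group(1))
    serverIPs.append(server.group(1))
```

Skipping incomplete lines keeps the two lists aligned, so entry i in each list comes from the same report line.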
Upvotes: 1