Reputation: 1
I have some data that I got from analysing a tcpdump file. The result is below.
The first column is the time, followed by src_mac, dest_mac, src_ip, src_port, dest_ip and dest_port.
Traffic from one source IP to a destination IP appears in several rows that are identical except for a small difference in time. Instead of displaying all of these rows, I would like to loop through the file and, where the destination IP is the same, record the start time and the end time, then take the difference and print just one row with that difference.
My result at the moment:
03-23 00:37:28.174515 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 49671 | 180.149.153.11 | 80
03-23 00:37:28.174536 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 49671 | 180.149.153.11 | 80
03-23 00:41:36.422588 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 49672 | 180.149.153.11 | 80
03-23 00:44:18.584080 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 49671 | 180.149.153.11 | 80
03-23 00:44:22.588592 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 35660 | 180.149.134.61 | 80
03-23 00:45:12.636571 | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 35661 | 180.149.134.61 | 80
What I expect:
(00:44:22 - 00:37:28) | 8ca982044d00 | c04a00332142 | 192.168.1.100 | 35661 | 180.149.134.61 | 80
I don't expect you to write the code for me, but a little hint would be very helpful.
Upvotes: 0
Views: 110
Reputation: 12092
So this is how you can structure your data to compute the difference:
Keep a dictionary of the network. This can be a dictionary of dictionaries: the source IP is the key of the outer dictionary, and its value is an inner dictionary keyed by destination IP. The value in the inner dictionary can be a list with the start time as its first element and the end time as its second element for that source IP and destination IP pair, i.e.,
network[src][dest] = [start_time, end_time]
Use the csv module to parse the input file and build the network as above.
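Once you have the network, you have the start and end date/times. You can use datetime's strptime() method to parse them into datetime objects and subtract one from the other to calculate the difference; strftime() turns a datetime back into a string if you need to print it.
Then you use the csv module to write the data to the output file.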
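For the time-difference step, here is a minimal sketch. It assumes the timestamps look like `03-23 00:37:28.174515` (month and day, no year) and that start and end fall in the same year. The helper names `parse` and `duration` and the placeholder year are made up for illustration; a year is prepended only because parsing a day of month without one is ambiguous.

```python
from datetime import datetime

# Assumption: the capture has no year, so a placeholder year is prepended
# before parsing. Any year works as long as start and end share it.
ASSUMED_YEAR = "2000"
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def parse(stamp):
    """Turn '03-23 00:37:28.174515' into a datetime (hypothetical helper)."""
    return datetime.strptime(ASSUMED_YEAR + "-" + stamp, TIME_FORMAT)

def duration(start, end):
    """Return end - start as a timedelta (hypothetical helper)."""
    return parse(end) - parse(start)

delta = duration("03-23 00:37:28.174515", "03-23 00:44:18.584080")
print(delta)                  # 0:06:50.409565
print(delta.total_seconds())  # the same span as a number of seconds
```

Subtracting two datetime objects gives a timedelta, so no manual arithmetic on hours and minutes is needed.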
The code below goes through the data you gave above and constructs the network dictionary of dictionaries that I explained:
from pprint import pprint
import csv

# network[src][dest] = [start_time, end_time]
network = {}
with open('file1') as input_file:
    csv_reader = csv.reader(input_file, delimiter='|')
    for row in csv_reader:
        if len(row) < 7:
            continue  # skip blank or malformed lines
        timestamp = row[0].strip()
        src = row[3].strip()
        dest = row[5].strip()
        if src not in network:
            network[src] = {}
        if dest not in network[src]:
            # first row for this pair: start and end are the same
            network[src][dest] = [timestamp, timestamp]
        else:
            # pair seen before: only the end time changes
            network[src][dest][1] = timestamp
pprint(network)
This gives:
{'192.168.1.100': {'180.149.134.61': ['03-23 00:44:22.588592',
                                      '03-23 00:45:12.636571'],
                   '180.149.153.11': ['03-23 00:37:28.174515',
                                      '03-23 00:44:18.584080']}}
Now that you have the source and destination IPs organized, the other steps are straightforward.
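As a rough sketch of those remaining steps, the snippet below turns a network dictionary into one row per source and destination pair, with the duration in the first column. The dictionary is typed in by hand here, with start and end times taken from the sample rows in the question. The output layout (duration, source IP, destination IP), the placeholder year and the helper names `parse` and `write_summary` are assumptions for illustration, not something the question specifies.

```python
import csv
import io
from datetime import datetime

ASSUMED_YEAR = "2000"  # placeholder: the capture carries no year
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

def parse(stamp):
    """Parse a 'MM-DD HH:MM:SS.ffffff' string (hypothetical helper)."""
    return datetime.strptime(ASSUMED_YEAR + "-" + stamp, TIME_FORMAT)

def write_summary(network, out_file):
    """Write one 'duration|src|dest' row per source/destination pair."""
    writer = csv.writer(out_file, delimiter="|", lineterminator="\n")
    for src, dests in sorted(network.items()):
        for dest, (start, end) in sorted(dests.items()):
            writer.writerow([parse(end) - parse(start), src, dest])

# Start and end times per pair, taken from the sample rows in the question.
network = {
    "192.168.1.100": {
        "180.149.134.61": ["03-23 00:44:22.588592", "03-23 00:45:12.636571"],
        "180.149.153.11": ["03-23 00:37:28.174515", "03-23 00:44:18.584080"],
    }
}

buffer = io.StringIO()  # stands in for a file opened with open(..., 'w')
write_summary(network, buffer)
print(buffer.getvalue(), end="")
# 0:00:50.047979|192.168.1.100|180.149.134.61
# 0:06:50.409565|192.168.1.100|180.149.153.11
```

To keep the MAC addresses and ports in the output as well, you could store them alongside the two times when building the dictionary.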
Hope this helps.
Upvotes: 1