user192314

Parsing text file in python

I am trying to write a Python program that will extract the round trip time from a web server ping stored in a text file. What I basically have is a text file with this:

    PING e11699.b.akamaiedge.net (104.100.153.112) 56(84) bytes of data.
    64 bytes from a104-100-153-112.deploy.static.akamaitechnologies.com (104.100.153.112): icmp_seq=1 ttl=60 time=17.2ms
    64 bytes from a104-100-153-112.deploy.static.akamaitechnologies.com (104.100.153.112): icmp_seq=2 ttl=60 time=12.6ms
    64 bytes from a104-100-153-112.deploy.static.akamaitechnologies.com (104.100.153.112): icmp_seq=3 ttl=60 time=11.7ms
    ... (a bunch more ping responses here)
    --- e11699.b.akamaiedge.net ping statistics ---
    86400 packets transmitted, 86377 received, 0% packet loss, time 86532481ms
    rtt min/avg/max/mdev = 6.281/18.045/1854.971/28.152 ms, pipe 2

I am very new to Python and need help using regex to extract only the times between "time=" and "ms" and write them to another text file that looks like:

11.7
12.6
17.2
...

Any help would be greatly appreciated!

Upvotes: 0

Views: 700

Answers (3)

RFV

Reputation: 839

You specified that your data is already in a text file. So, assuming that your text file is called data.txt:

# We will be using the regular expression library for this example
import re

# Open "data.txt" (available as data_file inside the with block)
with open("data.txt") as data_file:
    # Read the text from data_file into ping_data
    ping_data = data_file.read()
    # Collect everything between "time=" and "ms" on each line
    found_data = re.findall(r'time=(.*)ms', ping_data)

# Write each captured value on its own line
with open('found.txt', 'w') as found_file:
    for pattern in found_data:
        found_file.write(pattern + "\n")

This will output a file called found.txt with the following:

17.2
12.6
11.7

In the example we just open your data.txt file, read the data from it, and then find all occurrences of the regular expression pattern that returns the data you are looking for.

time=(.*)ms means a string of any length between the text time= and ms. Because . does not match a newline by default, each match stays within a single line.

Then, after we have found the matches, we simply write them to another file called found.txt, one line at a time, until it is complete.
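The pattern time=(.*)ms is greedy, and it would also capture a trailing space if ping printed time=17.2 ms instead of time=17.2ms. A stricter variant could look like the sketch below, assuming the times are plain decimal numbers. The names RTT_PATTERN and extract_rtts are hypothetical, and the host names in the sample are shortened.

```python
import re

# Matches "time=17.2ms" as well as "time=17.2 ms"; only the number is captured.
RTT_PATTERN = re.compile(r"time=(\d+(?:\.\d+)?)\s*ms")

def extract_rtts(text):
    """Return every round trip time found in text, as strings, in file order."""
    return RTT_PATTERN.findall(text)

# Sample modeled on the question's data (host names shortened for illustration)
sample = (
    "64 bytes from host (104.100.153.112): icmp_seq=1 ttl=60 time=17.2ms\n"
    "64 bytes from host (104.100.153.112): icmp_seq=2 ttl=60 time=12.6 ms\n"
    "86400 packets transmitted, 86377 received, 0% packet loss, time 86532481ms\n"
)
print(extract_rtts(sample))  # ['17.2', '12.6']
```

The statistics line is skipped because it contains "time " followed by a space rather than "time=". If the output should be in ascending order, as in the question's example, sorted(extract_rtts(sample), key=float) would do that.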

Upvotes: 0

Gilles Quénot

Reputation: 184965

Since this seems to come from a command, you could use something like this:

grep -oP 'ttl=\d+\s+time=\K[\d\.]+' file    

Output :

17.2
12.6
11.7

Note :

It's very simple to search SO and/or Google for how to use this regex in pure Python.

Hint :

Support of \K in regex
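Python's built-in re module does not support \K, so one way to carry the grep pattern over is to capture the part after time= in a group instead. A minimal sketch, using one reply line from the question as sample input:

```python
import re

# No \K in Python's re module, so a capture group stands in for it.
pattern = re.compile(r"ttl=\d+\s+time=([\d.]+)")

line = (
    "64 bytes from a104-100-153-112.deploy.static.akamaitechnologies.com "
    "(104.100.153.112): icmp_seq=3 ttl=60 time=11.7ms"
)
match = pattern.search(line)
rtt = match.group(1) if match else None
print(rtt)  # 11.7
```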

Bonus

Because I still have to play with python :

(in a shell) :

python2 <<< $'import re\nf = open("/tmp/file", "r")\nfor textline in f.readlines():\n\tmatches = re.finditer("ttl=\d+\s+time=([\d\.]+)ms", textline)\n\tresults = [float(match.group(1).strip()) for match in matches if len(match.group(1).strip())]\n\tif results:\n\t\tprint results[0]\nf.close()\n'

Upvotes: 5

Sergiy Kolodyazhnyy

Reputation: 968

Since you asked for Python, here it is:

$ ping -c 4 8.8.8.8 | python -c 'import sys;[ sys.stdout.write(l.split("=")[-1]+"\n") for l in sys.stdin if "time=" in l]'            
10.5 ms

9.22 ms

9.37 ms

9.71 ms

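Note that stdout is buffered here, so you may want to add sys.stdout.flush(). The blank lines in the output appear because each input line already ends with a newline and the code adds another. Feel free to convert this from a one-liner to a script.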
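As a script, the one-liner could be sketched roughly like this. The function name iter_rtts and the sample reply lines are made up for illustration. Stripping each value avoids the blank lines, and flush=True addresses the buffering.

```python
def iter_rtts(lines):
    """Yield the text after the last "=" of every line containing "time="."""
    for line in lines:
        if "time=" in line:
            # strip() removes the trailing newline that caused the blank lines
            yield line.split("=")[-1].strip()

# Made-up sample input standing in for real ping output
sample = [
    "PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.\n",
    "64 bytes from 8.8.8.8: icmp_seq=1 ttl=118 time=10.5 ms\n",
    "64 bytes from 8.8.8.8: icmp_seq=2 ttl=118 time=9.22 ms\n",
]
for rtt in iter_rtts(sample):
    # flush=True pushes each value out immediately instead of buffering
    print(rtt, flush=True)
```

Passing sys.stdin (after import sys) instead of sample would let the script sit at the end of a pipe, the same way the one-liner does.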

Upvotes: 1
