Excalibur
Excalibur

Reputation: 441

Get webpage data every 5 seconds and run a given times

I want to get the source code of a web page for every 5 seconds, and I want to do it for multiple times. Such as 3 times, then the total time is 3*5 = 15 seconds. I wrote the following code:

import urllib2
import threading

def getWebSource(url):
    usock = urllib2.urlopen(url)
    data = usock.read()
    usock.close()

    with open("WebData.txt", "a") as myfile:
        myfile.write(data)
        myfile.write("\n\n")



url = 'http://www.google.com/'
n = 3
while n>0:
    t = threading.Timer(5.0, getWebSource,args = [url]) # set the seconds here
    t.start()
    n = n-1

However, when I run it, what I get is: it only run 5 seconds and read the webpage for 3 times. What is wrong with it? I was expect that it should read the webpage every 5 seconds and repeat it for 3 times.

#

Update:Thanks for @wckd , here is my final code:

    import urllib2
    import time
    from time import gmtime, strftime

    def getWebSource(url,fout,seconds,times):
        while times > 0:
            usock = urllib2.urlopen(url)
            data = usock.read()
            usock.close()

            currentTime = strftime("%Y-%m-%d %H:%M:%S", gmtime())
            fout.write(currentTime +"\n")
            fout.write(data)
            fout.write("\n\n")
            times = times - 1
            time.sleep(seconds)



url = 'http://www.google.com'
fout = open("WebData.txt", "a")
seconds = 5
times = 4

getWebSource(url,fout,seconds,times)

Upvotes: 2

Views: 4301

Answers (1)

wckd
wckd

Reputation: 410

The method threading.Timer() simply creates a thread that starts after the designated amount of time. While that thread is waiting to run the loop continues to run. Basically you have three threads that all will get run after the 5 seconds.

If you want to have the spacing you could make getWebSource a recursive function with a countdown that will kick off a new thread when it gets run. Or if you want to keep doing what your doing you could multiply the 5 by n to get a spacing. I don't recommend this because if you try for 100 times you'll have 100 threads.

UPDATE

The easiest way to do this in one thread is to add a wait call (also known as sleep) to your loop

while n > 0
time.sleep(5)
yourMethodHere()
end

However as your method will take time to run keep the thread creation in there and set it to wait 0 seconds.

while n > 0
time.sleep(5)
threading.Timer(0, yourMethodHere())
n = n - 1
end

This way you won't be confined by a bad connection or something slowing you up.

Upvotes: 1

Related Questions