user2511798

Reputation: 107

Thread memory usage keeps increasing

I am trying to visit web pages and check whether each website's owner allows visitors to contact them.

Here is the full code: http://pastebin.com/12rLXQaz

This is the function that each thread calls:

def getpage():
    try:
        curl = urls.pop(0)
        print "working on " +str(curl)
        thepage1 = requests.get(curl).text
        global ctot
        if "Contact Us" in thepage1:
            slist.write("\n" +curl)
            ctot = ctot + 1
    except:
        pass
    finally:
        if len(urls)>0 :
            getpage()  

But the memory used by the program (pythonw.exe) keeps increasing.

Since each thread only calls the function again while the condition is true, I would expect the program's memory to stay at approximately the same level.

For a list of about 100k URLs, the program uses much more than 3 GB, and usage is still increasing.

Upvotes: 2

Views: 2302

Answers (2)

Lennart Regebro

Reputation: 172229

Your program is recursive for no reason. Each recursive call creates a new stack frame with its own set of local variables, and because the outer calls never return, those variables (including the downloaded page text) stay referenced and cannot be garbage collected. Memory use therefore keeps growing for as long as the program runs.
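To make the effect concrete, here is a minimal sketch that counts how many "page" objects are alive at once under recursion and under a loop. The `Page` class and the `recursive` and `iterative` functions are hypothetical stand-ins, not part of the original program, and the counts rely on CPython's reference counting freeing an object as soon as its last reference disappears.

```python
import weakref


class Page(object):
    """Hypothetical stand-in for the text of one downloaded page."""


# A WeakSet tracks objects without keeping them alive, so its length
# is the number of Page objects that are still referenced somewhere.
live = weakref.WeakSet()


def recursive(n, peak):
    page = Page()          # local variable held by this stack frame
    live.add(page)
    peak[0] = max(peak[0], len(live))
    if n > 1:
        recursive(n - 1, peak)   # this frame, and its page, stay alive meanwhile


def iterative(n, peak):
    for _ in range(n):
        page = Page()      # rebinding drops the previous page
        live.add(page)
        peak[0] = max(peak[0], len(live))
```

Calling `recursive(50, peak)` with `peak = [0]` should leave the peak equal to the recursion depth, while `iterative(50, peak)` should never see more than one live object.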

Read up on the while statement; it is what you want to use here instead of recursion.

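def getpage():
    global ctot  # declare once, at the top of the function
    while len(urls) > 0:
        try:
            curl = urls.pop(0)
            thepage1 = requests.get(curl).text
            if "Contact Us" in thepage1:
                slist.write("\n" + curl)
                ctot = ctot + 1
        except Exception:
            # skip URLs that fail (or a pop from a list another thread just emptied)
            pass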
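Because several threads share the URL list and the counter, a variant that may be worth considering is a loop that pulls from a thread-safe queue. The following is only a sketch under assumptions: `worker`, `find_contact_pages`, and the `fetch` parameter are hypothetical names, with `fetch` standing in for something like `lambda u: requests.get(u).text`. It uses the Python 3 module name `queue` (the module is called `Queue` on Python 2).

```python
import queue
import threading


def worker(url_queue, fetch, matches, lock):
    """Drain url_queue in a loop; record URLs whose page mentions 'Contact Us'."""
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return  # no work left, so the thread ends and its locals are freed
        try:
            page = fetch(url)
        except Exception:
            continue  # skip pages that fail to download
        if "Contact Us" in page:
            with lock:  # protect the shared result list
                matches.append(url)


def find_contact_pages(urls, fetch, num_threads=4):
    """Run num_threads workers over urls and return the matching URLs."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)
    matches = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(url_queue, fetch, matches, lock))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return matches
```

Each worker holds at most one page in memory at a time, and the number of matches is simply `len(matches)`, so no global counter is needed.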

Upvotes: 3

User

Reputation: 14863

I had a look at your code: http://pastebin.com/J4Rd3NhA

I would join older threads as new ones start, so that no more than about 100 threads run at the same time:

for xd in range(0,noofthreads):
    t = threading.Thread(target=getpage)
    t.daemon = True
    t.start()
    tarray.append(t)
    # my additional code
    if len(tarray) >= 100:
        tarray[-100].join()

How does this perform? If something is wrong, tell me.
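The same idea of capping the number of running threads can also be expressed with the standard library's thread pool. This is a sketch under assumptions, not a drop-in replacement for the pastebin code: `has_contact_link`, `scan`, and the `fetch` parameter are hypothetical names, with `fetch` standing in for something like `lambda u: requests.get(u).text`. `concurrent.futures` is part of the standard library in Python 3.

```python
from concurrent.futures import ThreadPoolExecutor


def has_contact_link(url, fetch):
    """Return True if the page at url mentions 'Contact Us'."""
    try:
        return "Contact Us" in fetch(url)
    except Exception:
        return False  # treat pages that fail to download as non-matching


def scan(urls, fetch, max_workers=100):
    """Check every URL using at most max_workers threads at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the input URLs
        flags = list(pool.map(lambda u: has_contact_link(u, fetch), urls))
    return [url for url, ok in zip(urls, flags) if ok]
```

Here `urls` is expected to be a list, and the pool reuses its worker threads, so no thread is created per URL.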

Upvotes: -1
