Igor

Reputation: 426

Got Exception Error “Exception in thread Thread-13 (most likely raised during interpreter shutdown)”

I wrote a simple script which uses threads to retrieve data from a service.

__author__ = 'Igor'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool

ip_list = []
good_ip_list = []
bad_ip_list = []
progress = 0

with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()

def process_request(ip):
    global progress
    progress += 1
    if progress % 10000 == 0:
        print 'Processed ip:', progress, '...'
    r = requests.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        good_ip_list.append(ip)
    elif r.status_code == 400:
        bad_ip_list.append(ip)
    else:
        print 'Unknown http code received, aborting'
        exit(1)

pool = ThreadPool(16)
try:
    pool.map(process_request, ip_list)
except:
    for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
        with open(name, 'w') as f:
            for ip in ip_list:
                print>>f, ip

But after some requests are processed (40k-50k) I receive:

Exception in thread Thread-7 (most likely raised during interpreter shutdown):
Traceback (most recent call last):

Process finished with exit code 0

Tried to change service settings:

        <timeout>999</timeout>
        <connectionlimit>600</connectionlimit>
        <httpthreads>32</httpthreads>
        <workerthreads>128</workerthreads>

but I still get the same error. Can anybody help me figure out what's wrong?

Upvotes: 6

Views: 6140

Answers (2)

Igor

Reputation: 426

Thanks to everybody who helped me solve this problem. I rewrote the whole thing and now it works perfectly:

__author__ = 'kulakov'
import requests
import time
from multiprocessing.dummy import Pool as ThreadPool

ip_list = []
good_ip_list = []
bad_ip_list = []

with open('/tmp/ip.txt') as f:
    ip_list = f.read().split()

s = requests.Session()
def process_request(ip):
    r = s.get('http://*****/?ip='+ip, timeout=None)
    if r.status_code == 200:
        # good_ip_list.append(ip)
        return (ip, True)
    elif r.status_code == 400:
        # bad_ip_list.append(ip)
        return (ip, False)
    else:
        print 'Unknown http code received, aborting'
        exit(1)

pool = ThreadPool(16)
for ip, isOk in pool.imap(process_request, ip_list):
    if isOk:
        good_ip_list.append(ip)
    else:
        bad_ip_list.append(ip)
pool.close()
pool.join()

for name, ip_list in (('/tmp/out_good.txt', good_ip_list), ('/tmp/out_bad.txt', bad_ip_list)):
    with open(name, 'w') as f:
        for ip in ip_list:
            print>>f, ip

Some new useful information:

1) It was a really bad idea to write data to shared lists from different threads inside process_request; now each call just returns a status (True/False) together with the ip, and the main thread collects the results.

2) Keep-alive is fully supported by requests by default, but to use it you must create a Session object and call its get method:

s = requests.Session()
r = s.get('http://*****/?ip='+ip, timeout=None)
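If you still want the progress counter from the original script, note that a bare progress += 1 shared between threads is a non-atomic read-modify-write. A minimal sketch of guarding it with a lock (the bump_progress name and the simulated workload are illustrative, not from the original code):

```python
import threading
from multiprocessing.dummy import Pool as ThreadPool

progress = 0
progress_lock = threading.Lock()

def bump_progress(report_every=10000):
    # Guard the read-modify-write so concurrent threads don't lose updates.
    global progress
    with progress_lock:
        progress += 1
        if progress % report_every == 0:
            print('Processed ip: %d ...' % progress)

# Simulate 50000 processed requests across 16 worker threads.
pool = ThreadPool(16)
pool.map(lambda _: bump_progress(), range(50000))
pool.close()
pool.join()
```

After the pool is joined, progress is exactly 50000; without the lock, some increments could be lost under heavy contention.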

Upvotes: 5

Patrick Collins

Reputation: 10584

This:

good_ip_list = []
bad_ip_list = []

is not safe to mix with Python multiprocessing. The correct approach is to return a tuple (or something) from each call to process_request and then concatenate them all at the end. It's also not safe to modify progress concurrently from multiple processes. I'm not positive what your error is, but I bet it's some synchronization problem that is killing Python as a whole.

Remove the shared state and try again.
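A minimal sketch of that pattern, with a stand-in check function in place of the real HTTP request (the function body and the sample ip_list here are illustrative, not from the question):

```python
from multiprocessing.dummy import Pool as ThreadPool

def check(ip):
    # Stand-in for the real requests.get() call:
    # pretend ips with an even last octet return HTTP 200.
    ok = int(ip.split('.')[-1]) % 2 == 0
    return (ip, ok)  # workers only return data; no shared lists

ip_list = ['10.0.0.%d' % i for i in range(1, 7)]

pool = ThreadPool(4)
results = pool.map(check, ip_list)  # order matches ip_list
pool.close()
pool.join()

# Partition once, in the main thread, after all workers are done.
good_ip_list = [ip for ip, ok in results if ok]
bad_ip_list = [ip for ip, ok in results if not ok]
```

Because every worker only reads its argument and returns a tuple, there is no shared mutable state to synchronize; the main thread does all the aggregation.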

Upvotes: 1
