dman

Reputation: 11064

Python threading and interpreter shutdown: is this fixable, or is it Python Issue #14623?

I have a Python script that uploads files to a cloud account. It was working for a while, but out of nowhere I started getting the error 'Exception in thread Thread-1 (most likely raised during interpreter shutdown)'. After researching, I found Python issue http://bugs.python.org/issue14623, which states the issue will not get fixed.

However, I'm not sure that issue applies to my case, and I am hoping someone can point out a fix. I would like to stay with Python's threading and avoid multiprocessing, since this is I/O bound. This is the stripped-down version (which also has the issue); in the full version, upload.py has a list I'd like to share between workers, so I want everything running in the same process memory.

It only breaks after everything completes and all the files are uploaded. I tried removing 't.daemon = True' and it just hangs (instead of breaking) at that same point (after all the files are uploaded). I also tried removing q.join() along with 't.daemon = True' and it still hangs after completion. Without t.daemon = True and q.join(), my guess is that the workers are blocking at item = q.get() when the script reaches the end of its execution.
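The behavior described above can be reduced to a minimal sketch (shown in Python 3 syntax, where the Queue module is named queue; names here are illustrative, not from the script): a daemon worker blocked on q.get() keeps running only as long as the main thread is alive, and q.join() returns once every item has been marked done.

```python
import queue
import threading

q = queue.Queue()
results = []

def worker():
    # Blocks forever once the queue is drained; as a daemon
    # thread it is simply abandoned when the interpreter exits,
    # which is the window where the shutdown exception can occur.
    while True:
        item = q.get()
        results.append(item)
        q.task_done()

t = threading.Thread(target=worker)
t.daemon = True   # without this, the process would hang on exit
t.start()

for item in ["green", "red", "blue"]:
    q.put(item)

q.join()          # returns after every item has been processed
print(results)
```

After q.join() returns, the worker is still parked inside q.get(); the script then ends and the interpreter tears down while that daemon thread is alive, which is exactly the race the question hits.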

main:

import logging
import os
import sys
import json
from os.path import expanduser
from Queue import Queue
from threading import Thread
from auth import Authenticate
from getinfo import get_containers, get_files, get_link
from upload import upload_file 
from container_util import create_containers
from filter import MyFilter

home = expanduser("~") + '/'

directory = home + "krunchuploader_logs"

if not os.path.exists(directory):
    os.makedirs(directory)

debug = directory + "/krunchuploader__debug_" + str(os.getpid())
error = directory + "/krunchuploader__error_" + str(os.getpid())
info = directory + "/krunchuploader__info_" + str(os.getpid())

os.open(debug, os.O_CREAT | os.O_EXCL)
os.open(error, os.O_CREAT | os.O_EXCL)
os.open(info, os.O_CREAT | os.O_EXCL)

formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')


logging.basicConfig(level=logging.DEBUG,
    format='%(asctime)s - %(levelname)s - %(message)s',
    filename=debug,
    filemode='w')

logger = logging.getLogger("krunch")

fh_error = logging.FileHandler(error)
fh_error.setLevel(logging.ERROR)
fh_error.setFormatter(formatter)
fh_error.addFilter(MyFilter(logging.ERROR))

fh_info = logging.FileHandler(info)
fh_info.setLevel(logging.INFO)
fh_info.setFormatter(formatter)
fh_info.addFilter(MyFilter(logging.INFO))

std_out_error = logging.StreamHandler()
std_out_error.setLevel(logging.ERROR)
std_out_info = logging.StreamHandler()
std_out_info.setLevel(logging.INFO)

logger.addHandler(fh_error)
logger.addHandler(fh_info)
logger.addHandler(std_out_error)
logger.addHandler(std_out_info)



def main():
    sys.stdout.write("\x1b[2J\x1b[H")
    print title
    authenticate = Authenticate()
    cloud_url = get_link(authenticate.jsonresp)
    #per 1 million files the list will take
    #approx 300MB of memory.
    file_container_list, file_list = get_files(authenticate, cloud_url)
    cloud_container_list = get_containers(authenticate, cloud_url)
    create_containers(cloud_container_list,
        file_container_list, authenticate, cloud_url)

    return file_list

def do_the_uploads(file_list):
    def worker():
        while True:
            item = q.get()
            upload_file(item)
            q.task_done()

    q = Queue()
    for i in range(5):
        t = Thread(target=worker)
        t.daemon = True
        t.start()

    for item in file_list:
        q.put(item)

    q.join()


if __name__ == '__main__':
    file_list = main()
    value = raw_input("\nProceed to upload files? Enter [Y/y] for yes: ").upper()
    if value == "Y":
        do_the_uploads(file_list)

upload.py:

import requests

def upload_file(file_obj):
    absolute_path_filename, filename, dir_name, token, url = file_obj
    url = url + dir_name + '/' + filename

    header_collection = {
        "X-Auth-Token": token}

    print "Uploading " + absolute_path_filename
    # open in binary mode so the file body is sent byte-for-byte
    with open(absolute_path_filename, 'rb') as f:
        r = requests.put(url, data=f, headers=header_collection)
    print "done"

Error output:

Fetching Cloud Container List... Got it!
All containers exist, none need to be added

Proceed to upload files? Enter [Y/y] for yes: y
Uploading /home/one/huh/one/green
Uploading /home/one/huh/one/red
Uploading /home/one/huh/two/white
Uploading /home/one/huh/one/blue
 Uploading /home/one/huh/two/yellow
done
Uploading /home/one/huh/two/purple
done
done
done
done
done
Exception in thread Thread-1 (most likely raised during interpreter shutdown):
Traceback (most recent call last):
  File "/usr/lib64/python2.7/threading.py", line 808, in __bootstrap_inner
  File "/usr/lib64/python2.7/threading.py", line 761, in run
  File "krunchuploader.py", line 97, in worker
  File "/usr/lib64/python2.7/Queue.py", line 168, in get
  File "/usr/lib64/python2.7/threading.py", line 332, in wait
<type 'exceptions.TypeError'>: 'NoneType' object is not callable

UPDATE: I placed a time.sleep(2) at the end of the script, which seems to have fixed the issue. I guess the sleep gives the daemon threads time to settle before the script reaches end of life and closes? I would have thought the main process would wait for the daemons to finish.
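A sleep only papers over the race: the main thread exits, interpreter teardown begins, and a daemon thread still inside q.get() can hit torn-down module state (the 'NoneType' object is not callable in the traceback). A deterministic alternative, sketched here in Python 3 syntax with illustrative names, is to make the threads non-daemonic, signal them to stop with a sentinel, and join them:

```python
import queue
import threading

q = queue.Queue()
STOP = object()          # sentinel telling a worker to exit
processed = []

def worker():
    while True:
        item = q.get()
        if item is STOP:
            break        # clean exit instead of blocking forever
        processed.append(item)

threads = [threading.Thread(target=worker) for _ in range(5)]
for t in threads:
    t.start()            # non-daemon: the interpreter waits for them

for item in ["a", "b", "c", "d"]:
    q.put(item)
for _ in threads:
    q.put(STOP)          # one sentinel per worker

for t in threads:
    t.join()             # deterministic shutdown, no sleep needed
print(sorted(processed))
```

Because every worker exits its loop before the main thread finishes, there is no window in which a thread is blocked on the queue during interpreter shutdown.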

Upvotes: 0

Views: 679

Answers (1)

Janne Karila

Reputation: 25207

You can use a "poison pill" to stop the workers gracefully: after putting all the work in the queue, add one special object per worker that the workers recognize as a signal to quit. You can then make the threads non-daemonic, so Python waits for them to finish before shutting down the process.

A concise way to make the worker recognize the poison pill and quit is to use the two-argument form of the iter() builtin in a for loop:

def do_the_uploads(file_list):
    def worker():
        for item in iter(q.get, poison):
            upload_file(item)

    poison = object()
    num_workers = 5
    q = Queue()
    for i in range(num_workers):
        t = Thread(target=worker)
        t.start()

    for item in file_list:
        q.put(item)

    for i in range(num_workers):
        q.put(poison)
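The two-argument iter() used above can be tried in isolation (Python 3 syntax here): it calls its first argument repeatedly and stops as soon as the returned value equals the sentinel, which is exactly the loop the worker runs.

```python
import queue

q = queue.Queue()
poison = object()
for item in [1, 2, 3, poison]:
    q.put(item)

# iter(q.get, poison) calls q.get() repeatedly and stops
# when the returned value equals the sentinel.
items = list(iter(q.get, poison))
print(items)
```

Because object() compares equal only to itself, no legitimate work item can be mistaken for the sentinel.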

Upvotes: 2
