Evan
Evan

Reputation: 1713

Multithreaded access to python dictionary

I am trying to access the same global dictionary from different threads in python simultaneously. Thread safety at the accessing point is not a consern for me since all accesses are reads and dont modify the dictionary. I changed my code to do the accesses from multiple threads but i have noticed no increase in the speed of the execution, after checking arround it seems that the interpreter serializes the accesses in effect making the change in my code null.

Is there an easy way to have a structure like concurrentHashMap of Java in python? The part of the code in question follows:

    class csvThread (threading.Thread):
        def __init__(self, threadID, bizName):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.bizName = bizName
        def run(self):
            thread_function(self.bizName)



    def thread_function(biz):
        first = True
        bizTempImgMap = {}
        for imag in bizMap[biz]:
            if not similar(bizTempImgMap, imgMap[imag]):
                bizTempImgMap[imag] = imgMap[imag]
                if first:
                    a = imgMap[imag]
                    sum = a
                else:
                    c = np.column_stack((a, imgMap[imag]))
                    sum += imgMap[imag]
                    a = c.max(1) #max
                first = False
            else:
                print ("-")
        csvLock.acquire()
        writer.writerow([biz]+a.astype(np.str_).tolist()+(np.true_divide(sum, len(bizTempImgMap.keys()))).tolist())
        csvLock.release()


csvLock = threading.Lock()
...
imgMap = img_vector_load('test_photos.csv')
bizMap = img_busyness_load('csv/test_photo_to_biz_ids.csv')
...
for biz in bizMap.keys():
    if len(threads)<100:
        thread = csvThread(len(threads), biz)
        threads.append(thread)
        thread.start()
    else:
        print("\nWaiting for threads to finish\n")
        for t in threads:
            t.join()
        print("\nThreads Finished\n")
        threads = []

Upvotes: 1

Views: 1061

Answers (1)

Netwave
Netwave

Reputation: 42786

"i have noticed no increase in the speed of the execution"

No speed increase will be done by using threads in python, since they all work on the same core. Take a look to: GIL

Notice this, python threading should be used for concurrent arquitectures not for speed performance.

In case you want to keep this implementation use multiprocessing.

Upvotes: 3

Related Questions