Reputation: 1713
I am trying to access the same global dictionary from different threads in Python simultaneously. Thread safety at the access point is not a concern for me, since all accesses are reads and don't modify the dictionary. I changed my code to do the accesses from multiple threads, but I have noticed no increase in execution speed; after checking around, it seems the interpreter serializes the accesses, in effect making my change useless.
Is there an easy way to get a structure like Java's ConcurrentHashMap in Python? The relevant part of the code follows:
class csvThread(threading.Thread):
    def __init__(self, threadID, bizName):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.bizName = bizName

    def run(self):
        thread_function(self.bizName)

def thread_function(biz):
    first = True
    bizTempImgMap = {}
    for imag in bizMap[biz]:
        if not similar(bizTempImgMap, imgMap[imag]):
            bizTempImgMap[imag] = imgMap[imag]
            if first:
                a = imgMap[imag]
                sum = a
            else:
                c = np.column_stack((a, imgMap[imag]))
                sum += imgMap[imag]
                a = c.max(1)  # max
            first = False
        else:
            print("-")
    csvLock.acquire()
    writer.writerow([biz] + a.astype(np.str_).tolist() + (np.true_divide(sum, len(bizTempImgMap.keys()))).tolist())
    csvLock.release()

csvLock = threading.Lock()
...
imgMap = img_vector_load('test_photos.csv')
bizMap = img_busyness_load('csv/test_photo_to_biz_ids.csv')
...
for biz in bizMap.keys():
    if len(threads) < 100:
        thread = csvThread(len(threads), biz)
        threads.append(thread)
        thread.start()
    else:
        print("\nWaiting for threads to finish\n")
        for t in threads:
            t.join()
        print("\nThreads Finished\n")
        threads = []
Upvotes: 1
Views: 1061
Reputation: 42786
"i have noticed no increase in the speed of the execution"
No speed increase will come from using threads here: CPython's Global Interpreter Lock (GIL) allows only one thread to execute Python bytecode at a time, so CPU-bound threads are effectively serialized even on a multi-core machine. Take a look at the GIL.
Note that Python threading should be used for concurrent architectures (e.g. overlapping blocking I/O), not for raw CPU performance.
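To illustrate the distinction, here is a minimal sketch (the sleep stands in for any blocking call that releases the GIL, such as file or network I/O); the four simulated waits overlap, so the total time is roughly one wait, not four:

```python
import threading
import time

def fake_io(results, i):
    time.sleep(0.2)        # blocking call: releases the GIL while waiting
    results[i] = i * i     # distinct keys, so no lock is needed here

results = {}
threads = [threading.Thread(target=fake_io, args=(results, i)) for i in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

print(results)  # {0: 0, 1: 1, 2: 4, 3: 9}
print(elapsed)  # roughly 0.2 s, not 0.8 s, because the waits overlapped
```

Replace the sleep with a pure-Python computation and the speed-up disappears, which is exactly what the question observed.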
In case you want to keep this implementation, use multiprocessing instead: each worker process gets its own interpreter and its own GIL, so CPU-bound work runs in true parallel.
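A minimal sketch of that approach with multiprocessing.Pool (the names process_biz, init_worker, and the sample data are hypothetical, not taken from the question's code). Each worker process receives its own copy of the read-only dictionary via the pool initializer, so the reads need no locking at all:

```python
from multiprocessing import Pool

# Read-only dictionary; each worker process gets its own copy.
imgMap = {}

def init_worker(shared_map):
    # Runs once in each worker process, before any tasks are handled.
    global imgMap
    imgMap = shared_map

def process_biz(biz_images):
    # Hypothetical stand-in for the per-business work in the question:
    # only *reads* imgMap, so no synchronization is required.
    biz, images = biz_images
    return biz, sum(imgMap[i] for i in images)

if __name__ == '__main__':
    data = {'img1': 1, 'img2': 2, 'img3': 3}
    tasks = [('bizA', ['img1', 'img2']), ('bizB', ['img3'])]
    with Pool(initializer=init_worker, initargs=(data,)) as pool:
        results = dict(pool.map(process_biz, tasks))
    print(results)  # {'bizA': 3, 'bizB': 3}
```

Collect the results in the parent process and do the CSV writing there; that also removes the need for csvLock, since only one process touches the file.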
Upvotes: 3