Reputation: 279
I have a script that takes a text file as input and runs tests on its contents. I want to create two threads, split the input file into two parts, and process the parts concurrently to reduce the execution time. Is there a way to do this?
Thanks
import subprocess
import threading
import time

THREAD_COUNT = 2  # number of worker threads (not defined in the original post)


class myThread(threading.Thread):
    def __init__(self, ip_list):
        threading.Thread.__init__(self)
        self.input_list = ip_list

    def run(self):
        cmd = "python Audit.py " + ",".join(self.input_list)
        # Get lock to synchronize threads
        threadLock.acquire()
        print(cmd)
        p = subprocess.Popen(cmd, shell=True)
        # Free lock to release next thread
        threadLock.release()
        while p.poll() is None:
            print('Test Execution in Progress ....')
            time.sleep(60)
        print('Not sleeping any longer. Exited with returncode %d' % p.returncode)


def split_list(input_list, split_count):
    # Yield consecutive chunks of at most split_count items
    for i in range(0, len(input_list), split_count):
        yield input_list[i:i + split_count]


if __name__ == '__main__':
    threadLock = threading.Lock()
    threads = []
    input_list = []
    with open("inputList.txt", "r") as Ptr:
        for i in Ptr:
            try:
                id = str(i).rstrip('\n').rstrip('\r')
                input_list.append(id)
            except Exception as err:
                print(err)
                print("Exception occurred...")
    try:
        # Ceiling division, so at most THREAD_COUNT chunks are produced
        chunk_size = max(1, -(-len(input_list) // THREAD_COUNT))
        list_of_lists = list(split_list(input_list, chunk_size))
    except Exception as err:
        print(err)
        print("Exception caught in splitting list")
    try:
        # Create Threads & Start (one per chunk, including the last one)
        for i in range(len(list_of_lists)):
            # Create new threads
            threads.append(myThread(list_of_lists[i]))
            threads[i].start()
            time.sleep(1)
        # Wait for all threads to complete
        for thread in threads:
            thread.join()
        print("Exiting Main Thread..!")
    except Exception as err:
        print(err)
        print("Exception caught during THREADING...")
Upvotes: 0
Views: 135
Reputation: 1046
You are trying to do two things at the same time, which is the definition of parallelism. The problem is that in CPython, threads cannot execute Python code in parallel because of the GIL (Global Interpreter Lock). The GIL ensures that only one thread executes Python bytecode at a time, because the interpreter's memory management is not thread-safe.
If you really want to run two operations in parallel, use the multiprocessing module (import multiprocessing).
Read this: Multiprocessing vs Threading Python
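A minimal sketch of that suggestion, assuming the input has already been read into a list. The names `split_in_two` and `audit_chunk` are hypothetical, and `audit_chunk` only prints its chunk as a stand-in for the real call to Audit.py:

```python
import multiprocessing


def split_in_two(items):
    """Return the two halves of a list; the second half gets the extra item."""
    middle = len(items) // 2
    return items[:middle], items[middle:]


def audit_chunk(chunk):
    # Hypothetical stand-in for the real work (for example, launching Audit.py).
    name = multiprocessing.current_process().name
    print("%s handling %s" % (name, ",".join(chunk)))


if __name__ == "__main__":
    ids = ["id1", "id2", "id3", "id4", "id5"]  # made-up input
    workers = [
        multiprocessing.Process(target=audit_chunk, args=(chunk,))
        for chunk in split_in_two(ids)
    ]
    for w in workers:
        w.start()
    # Wait for both halves to finish
    for w in workers:
        w.join()
```

The `if __name__ == "__main__":` guard matters here: on platforms where multiprocessing starts children by re-importing the main module, unguarded code would try to launch workers again in every child.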
Upvotes: 1
Reputation: 794
Some notes, in random order:
In Python, multithreading is not a good fit for computationally intensive tasks. A better approach is multiprocessing; see: Python: what are the differences between the threading and multiprocessing modules?
For resources that are not shared (in your case, each line is used exclusively by a single process) you do not need locks. A better approach would be the map function of a process pool:
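import multiprocessing
import subprocess

def processing_function(chunk):
    # Pass the chunk to Audit.py as one comma-separated argument, as in the question
    return subprocess.call(["python", "Audit.py", ",".join(chunk)])

if __name__ == '__main__':
    with open('file.txt', 'r') as f:
        lines = [line.strip() for line in f]
    to_process = [lines[:len(lines)//2], lines[len(lines)//2:]]
    p = multiprocessing.Pool(2)
    results = p.map(processing_function, to_process)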
If the processing time varies from line to line, using queues to move data between processes, instead of mapping two fixed halves, could help balance the load.
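One way the queue idea could look, as a rough sketch rather than a drop-in solution. The names `process_line`, `worker` and `run_balanced` are hypothetical, and `process_line` just upper-cases its input as a placeholder for the real per-line work:

```python
import multiprocessing


def process_line(line):
    # Hypothetical stand-in for the real per-line work (e.g. calling Audit.py).
    return line.upper()


def worker(tasks, results):
    # Pull lines until the sentinel None arrives, so a worker that finishes
    # early simply takes more lines than a slower one.
    for line in iter(tasks.get, None):
        results.put((line, process_line(line)))


def run_balanced(lines, worker_count=2):
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=worker, args=(tasks, results))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    for line in lines:
        tasks.put(line)
    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    # Collect every result before joining, so no worker blocks on a full queue.
    collected = dict(results.get() for _ in lines)
    for w in workers:
        w.join()
    return collected


if __name__ == "__main__":
    print(run_balanced(["a", "b", "c"]))
```

Compared with mapping two fixed halves, no worker sits idle while the other is still working through a slow half; the cost is a little more bookkeeping (sentinels and a results queue).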
Upvotes: 1