Kandarp Gandhi

Reputation: 279

Correct Approach to Threading in Python

I have a script that takes a text file as input and runs tests on it. I want to create two threads, split the input file into two parts, and process them concurrently to minimize execution time. Is there a way to do this?

Thanks

import subprocess
import threading
import time

class myThread(threading.Thread):
    def __init__(self, ip_list):
        threading.Thread.__init__(self)
        self.input_list = ip_list

    def run(self):
        # Build the command once: all IDs are passed to Audit.py comma-separated
        command = "python Audit.py " + ",".join(self.input_list)
        # Get lock to synchronize threads
        threadLock.acquire()
        print(command)
        p = subprocess.Popen(command, shell=True)
        # Free lock to release next thread
        threadLock.release()
        while p.poll() is None:
            print('Test Execution in Progress ....')
            time.sleep(60)

        print('Not sleeping any longer.  Exited with returncode %d' % p.returncode)


def split_list(input_list, split_count):
    for i in range(0, len(input_list), split_count):
        yield input_list[i:i + split_count]

if __name__ == '__main__':

    THREAD_COUNT = 2  # number of worker threads
    threadLock = threading.Lock()
    threads = []
    input_list = []

    with open("inputList.txt", "r") as Ptr:
        for i in Ptr:
            try:
                item_id = str(i).rstrip('\n').rstrip('\r')
                input_list.append(item_id)
            except Exception as err:
                print(err)
                print("Exception occurred...")
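    try:
        # Round the chunk size up so the list splits into at most THREAD_COUNT parts
        chunk_size = (len(input_list) + THREAD_COUNT - 1) // THREAD_COUNT
        test = split_list(input_list, chunk_size)
        list_of_lists = list(test)
    except Exception as err:
        print(err)
        print("Exception caught in splitting list")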

    try:
        # Create threads and start them (every chunk, including the last one)
        for i in range(len(list_of_lists)):
            threads.append(myThread(list_of_lists[i]))
            threads[i].start()
            time.sleep(1)

        # Wait for all threads to complete
        for thread in threads:
            thread.join()
        print("Exiting Main Thread..!")
    except Exception as err:
        print(err)
        print("Exception caught during THREADING...")

Upvotes: 0

Views: 135

Answers (2)

vincedjango

Reputation: 1046

You are trying to run two things at the same time, which is parallelism. In CPython, threads cannot execute Python code in parallel because of the GIL (Global Interpreter Lock). The GIL ensures that only one thread executes Python bytecode at a time, because the interpreter's internals are not thread-safe.

If you really want to run two operations in parallel, use the multiprocessing module (import multiprocessing).
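As a minimal sketch of that idea, the following starts one process per half of the input and waits for both. The names split_in_half and run_chunk and the placeholder IDs are hypothetical, and run_chunk only prints where the real test run would go.

```python
import multiprocessing


def split_in_half(items):
    # Return the two halves of the list; the second half gets the extra item.
    mid = len(items) // 2
    return [items[:mid], items[mid:]]


def run_chunk(chunk):
    # Hypothetical stand-in for the real test run on one chunk of input lines.
    print("processing %d lines: %s" % (len(chunk), ",".join(chunk)))


if __name__ == "__main__":
    lines = ["id1", "id2", "id3", "id4", "id5"]  # placeholder for the file contents
    processes = [
        multiprocessing.Process(target=run_chunk, args=(chunk,))
        for chunk in split_in_half(lines)
    ]
    for p in processes:
        p.start()
    # Wait for both halves to finish.
    for p in processes:
        p.join()
```

The `if __name__ == "__main__":` guard matters here: on platforms that start child processes by re-importing the main module, it keeps the children from launching processes of their own.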

Read this: Multiprocessing vs Threading Python

Upvotes: 1

user1620443

Reputation: 794

Some notes, in no particular order:

In Python, multithreading is not a good fit for computationally intensive tasks. Multiprocessing is a better approach: Python: what are the differences between the threading and multiprocessing modules?

For resources that are not shared (in your case, each line will be used exclusively by a single process) you do not need locks. A simpler approach is to map a function over the data with a process pool.

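import multiprocessing
import subprocess

def processing_function(chunk):
    # Run Audit.py once per chunk, passing the lines comma-separated as in the question
    return subprocess.call(["python", "Audit.py", ",".join(chunk)])

if __name__ == '__main__':
    with open('file.txt', 'r') as f:
        lines = [line.strip() for line in f if line.strip()]

    # Split the lines into two halves, one per worker process
    to_process = [lines[:len(lines)//2], lines[len(lines)//2:]]
    p = multiprocessing.Pool(2)
    results = p.map(processing_function, to_process)
    p.close()
    p.join()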

If the processing time varies from line to line, using Queues to move data between processes instead of mapping could help balance the load.
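A rough sketch of the queue-based variant, under some assumptions: process_line is a hypothetical stand-in for the real per-line work (such as running Audit.py), and run_with_queue, worker, and SENTINEL are names made up for illustration. Each worker pulls the next line as soon as it is free, so a worker that gets slow lines simply ends up taking fewer of them.

```python
import multiprocessing

SENTINEL = None  # tells a worker that no more lines are coming


def process_line(line):
    # Hypothetical stand-in for the real per-line work.
    return line.upper()


def worker(task_queue, result_queue):
    # Keep pulling lines until the sentinel arrives.
    while True:
        line = task_queue.get()
        if line is SENTINEL:
            break
        result_queue.put((line, process_line(line)))


def run_with_queue(lines, worker_count=2):
    task_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=worker, args=(task_queue, result_queue))
        for _ in range(worker_count)
    ]
    for w in workers:
        w.start()
    for line in lines:
        task_queue.put(line)
    for _ in workers:
        task_queue.put(SENTINEL)  # one sentinel per worker
    # Collect one result per line before joining, so no worker
    # is left blocked writing to the result queue.
    results = dict(result_queue.get() for _ in lines)
    for w in workers:
        w.join()
    return results


if __name__ == "__main__":
    print(run_with_queue(["id1", "id2", "id3", "id4"]))
```

The results come back in completion order, which is why they are keyed by line here instead of relying on position.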

Upvotes: 1
