PYTHON: Multiprocessing quirks (OR: How do I coordinate these threads?)

Question

I have been challenged. I am unsure how to use multiprocessing without Jython or Cython (or some other IronPython whatsahoosie), so I have opted to use Threads for my program, which runs on a multicore CentOS machine. It reads a set of text files and writes word frequencies to a dictionary (defined by hfreq={} outside of the defined functions). If I have the main program sleep, it runs (terribly slowly, seemingly on one core) and works fine.

Additionally, I do not know how to make it wait until both threads are done before writing the output to file (other than the time.sleep call, which completely defeats the purpose of speed).
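For reference, the standard-library way to wait on threads is Thread.join(), which blocks the caller until that thread has finished. The following is only a minimal sketch: the worker count_words, the lock, and the in-memory chunk0/chunk1 data are hypothetical stand-ins for the real file-reading code and the filenames0/filenames1 arrays.

```python
import threading

hfreq = {}                # shared result dictionary, as in the question
lock = threading.Lock()   # assumption: a lock guards the merge into hfreq

def count_words(lines):
    """Hypothetical worker: tally words from in-memory lines into hfreq."""
    local = {}
    for line in lines:
        for word in line.split():
            local[word] = local.get(word, 0) + 1
    # Merge the thread's private tally into the shared dict under the lock.
    with lock:
        for word, n in local.items():
            hfreq[word] = hfreq.get(word, 0) + n

# Stand-in data; the real program would read filenames0 and filenames1.
chunk0 = ["a b a", "c"]
chunk1 = ["b b", "a"]

t1 = threading.Thread(target=count_words, args=(chunk0,))
t2 = threading.Thread(target=count_words, args=(chunk1,))
t1.start()
t2.start()
t1.join()   # blocks until t1 has finished
t2.join()   # blocks until t2 has finished

# Only reached once both threads are done, so no sleep is needed.
result = sorted(hfreq.items())
```

With join() in place the output step starts as soon as the slower thread finishes, rather than after a fixed delay.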

EXAMPLE:

from threading import Thread
import time

hfreq = {}  # shared dictionary of frequencies, filled in by both threads
# [INSERT TEXT FILE ARRAYS HERE, RESPECTIVELY filenames0[] and filenames1[]]

def count():
    pass  # some code here that writes frequency to hfreq

def count1():
    pass  # some code here that writes frequency to hfreq as well, but using filenames1

t1 = Thread(target=count, args=())
t2 = Thread(target=count1, args=())
t1.start()
t2.start()
time.sleep(15)  # No other known way to prevent the following from running immediately
items = sorted(hfreq.items())
output = open('Freq.txt', 'w')
# [for statement that writes to file]
output.close()

And that is where it ends. If I run the program on its own, with no threading classes, it takes about 10-14 seconds. If I try the threading approach (splitting the original array in half between the two threads), I get BOTH THREADS running for 14 seconds, instead of the speedup I expected from using multiple cores. Thank you for reading this wall of text. Please tell me if I can clarify anything.
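Some background that may explain the timing: in CPython, the global interpreter lock lets only one thread execute Python bytecode at a time, so two CPU-bound threads do not run on two cores. The multiprocessing module is part of the standard library and does not require Jython or Cython. What follows is a rough sketch rather than a drop-in fix; it assumes Python 3, and count_chunk, merge, and the in-memory chunks are hypothetical stand-ins for the real counting code and the filenames0/filenames1 arrays.

```python
from collections import Counter
from multiprocessing import Pool

def count_chunk(lines):
    """Hypothetical worker: count words in one chunk; runs in a child process."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def merge(partials):
    """Combine the per-process Counters into one ordinary dict."""
    total = Counter()
    for part in partials:
        total.update(part)
    return dict(total)

def main():
    # Stand-in data; the real program would pass lists of file names
    # and have the worker open and read each file.
    chunks = [["a b a", "c"], ["b b", "a"]]
    with Pool(processes=2) as pool:
        # map() returns only after every chunk has been processed,
        # so no sleep or explicit join is needed here.
        partials = pool.map(count_chunk, chunks)
    hfreq = merge(partials)
    for word, n in sorted(hfreq.items()):
        print(word, n)

if __name__ == "__main__":
    main()
```

Each process builds its own tally and returns it, because child processes do not share a module-level dictionary such as hfreq with the parent; the parent merges the results afterwards.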
