Reputation: 1
I have been challenged. I am unsure how to use multiprocessing without Jython or Cython (or some other IronPython whatsahoosie), so I have opted to use threads for my program, which runs on a multicore CentOS machine. It reads a set of text files and writes frequencies into a dictionary (defined as hfreq={} outside the functions). If I have it sleep, it runs (terribly slowly, seemingly on one core) and works fine.
Additionally, I do not know how to make it wait until both threads are done before actually writing the output to file (other than the time.sleep call, which completely defeats the purpose of speed).
EXAMPLE:
hfreq={}
[INSERT TEXT FILE ARRAYS HERE, RESPECTIVELY filenames0[] and filenames1[]]
def count():
    # some code here that writes frequency to hfreq
def count1():
    # some code here that writes frequency to hfreq as well, but using filenames1
t1=Thread(target=count,args=())
t2=Thread(target=count1,args=())
t1.start()
t2.start()
time.sleep(15) #No other known way to prevent the following from running immediately
list=hfreq.items()
list.sort()
Output=open('Freq.txt', 'w')
[for statement that writes to file]
Output.close()
And that is where it ends. If I run the program on its own, with no threading, the runtime is about 10-14 seconds. If I try the threading approach (splitting the original array in half between the two threads), I get BOTH THREADS running for 14 seconds (instead of the expected multi-core speedup). Thank you for reading this wall of text. Please tell me if I can clarify anything.
Upvotes: 0
Views: 329
Reputation: 227
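If you want to take advantage of multiple cores with CPython, you should use the multiprocessing module: it has many caveats, but this is the sort of problem it's a relatively good fit for. Threads will not help here, because CPython's global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so two CPU-bound threads take about as long as running the same work serially.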
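As a rough sketch of how that could look: the question does not show the counting code, so this assumes the frequencies are counts of whitespace-separated words. The names count_file, merge_counts and count_files are made up for illustration. Each worker process returns its own Counter and the parent merges them, because separate processes do not share a global dictionary like hfreq.

```python
import sys
from collections import Counter
from multiprocessing import Pool


def count_file(path):
    """Return word frequencies for one file (assumes whitespace-separated words)."""
    with open(path) as handle:
        return Counter(handle.read().split())


def merge_counts(partials):
    """Combine the per-file Counters returned by the workers into one."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_files(paths, workers=2):
    """Count each file in a worker process; pool.map blocks until all are done."""
    pool = Pool(processes=workers)
    try:
        partials = pool.map(count_file, paths)
    finally:
        pool.close()
        pool.join()
    return merge_counts(partials)


if __name__ == "__main__":
    # File names are taken from the command line here, standing in for
    # the filenames0 and filenames1 lists in the question.
    paths = sys.argv[1:]
    if paths:
        hfreq = count_files(paths)
        for word, n in sorted(hfreq.items()):
            print("%s %d" % (word, n))
```

Since pool.map only returns once every file has been processed, no sleep is needed before writing the output. Whether this is actually faster depends on how much of the 10-14 seconds is counting rather than disk reads.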
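To wait until a thread is done, use t.join(). In your example, calling t1.join() and t2.join() in place of the time.sleep(15) line makes the main thread block until both threads have finished.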
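A minimal sketch of the join pattern, using made-up word lists in place of the file contents and a hypothetical helper called count_in_threads. The lock around the dictionary update is my addition, since both threads write to the same dictionary. Note that this only fixes the waiting; it will not make the counting use more than one core.

```python
import threading


def count_in_threads(chunks):
    """Count words from each chunk in its own thread and return the merged dict."""
    hfreq = {}
    lock = threading.Lock()  # guards the shared dictionary

    def count(words):
        for word in words:
            with lock:
                hfreq[word] = hfreq.get(word, 0) + 1

    threads = [threading.Thread(target=count, args=(chunk,)) for chunk in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # blocks until this thread has finished
    return hfreq


# Made-up word lists standing in for the contents of filenames0 and filenames1.
result = count_in_threads([["a", "b", "a"], ["b", "c"]])
print(sorted(result.items()))  # [('a', 2), ('b', 2), ('c', 1)]
```

Only after both join() calls have returned is it safe to sort the items and write them to Freq.txt.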
Upvotes: 1