Reputation: 143
My requirement is to run a shell function or script in parallel with multiprocessing. Currently I do it with the script below, which doesn't use multiprocessing. There is also a problem: when I start 10 jobs in parallel, one job might finish early and then sit idle waiting for the other 9 to complete before the next batch starts. I want to eliminate this with the help of multiprocessing in Python.
i=1
total=`wc -l < details.txt`
while [ $i -le $total ]
do
    name=`head -$i details.txt | tail -1 | awk '{print $1}'`
    age=`head -$i details.txt | tail -1 | awk '{print $2}'`
    ./new.sh $name $age &
    # after every 10th job, wait for the whole batch to finish
    if (( $i % 10 == 0 )); then wait; fi
    i=$((i+1))
done
wait
I want to run ./new.sh $name $age inside a Python script with multiprocessing enabled, taking into account the number of CPUs. As you can see, the values of $name and $age change on each execution. Kindly share your thoughts.
Upvotes: 2
Views: 5470
Reputation: 69022
First, your whole shell script could be replaced with:
awk '{ print $1; print $2; }' details.txt | xargs -d'\n' -n 2 -P 10 ./new.sh
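Here awk prints the name and the age on separate lines, and xargs -d'\n' -n 2 -P 10 collects them two at a time and keeps up to 10 copies of ./new.sh running, starting a new one as soon as any running one finishes, so no job has to wait for a whole batch. (The -d option requires GNU xargs.)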
A simple python solution would be:
from subprocess import check_call
from multiprocessing.dummy import Pool

def call_script(args):
    name, age = args  # unpack arguments
    check_call(["./new.sh", name, age])

def main():
    with open('details.txt') as inputfile:
        args = [line.split()[:2] for line in inputfile]
    pool = Pool(10)
    # pool = Pool() would use the number of available processors instead
    pool.map(call_script, args)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
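Since check_call is used, any run of ./new.sh that exits with a non-zero status will raise subprocess.CalledProcessError, so failing jobs are not silently ignored.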
Note that this uses multiprocessing.dummy.Pool (a thread pool) to call the external script. In this case a thread pool is preferable to a process pool, since all the call_script function does is invoke the script and wait for it to return. Doing that in a worker process instead of a worker thread wouldn't increase performance, because the work is I/O-bound; it would only add overhead for process creation and inter-process communication.
Upvotes: 4