Bill Cheatham

Reputation: 11947

Parallel processing in shell scripting, 'pid is not a child of this shell'

I have a question about parallel processing in shell scripting. I have a program, myProgram, which I wish to run multiple times, in a loop within a loop. The script is basically this:

MYPATHDIR=`ls $MYPATH`
for SUBDIRS in $MYPATHDIR; do
  SUBDIR_FILES=`ls $MYPATH/$SUBDIRS`
  for SUBSUBDIRS in $SUBDIR_FILES; do
    find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat
  done
done

What I wish to do is take advantage of parallel processing. So I replaced the innermost line with this, to start all the myProgram instances at once:

(find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat &)

However, this began all 300 or so calls to myProgram simultaneously, causing RAM issues etc.

What I would like to do is to run each occurrence of myProgram in the inner loop in parallel, but wait for all of these to finish before moving on to the next outer loop iteration. Based on the answers to this question, I tried the following:

for SUBDIRS in $MYPATHDIR; do
  SUBDIR_FILES=`ls $MYPATH/$SUBDIRS`
  for SUBSUBDIRS in $SUBDIR_FILES; do
    (find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat &)
  done
  wait $(pgrep myProgram)   
done

But I got the following warning/error, repeated multiple times:

./myScript.sh: line 30: wait: pid 1133 is not a child of this shell

...and all the myPrograms were started at once, as before.

What am I doing wrong? What can I do to achieve my aims? Thanks.

Upvotes: 2

Views: 4621

Answers (3)

milahu

Reputation: 3609

To wait for a non-child process, you can watch the proc filesystem:

while [ -e "/proc/$pid" ]; do sleep 1; done

This can produce a false positive if the process terminates and another process immediately takes the same pid: the loop then keeps waiting on the wrong process.

Fix: also check the process start time:

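_wait() {
  # wait for a non-child process
  local pid=$1
  # access timestamp of /proc/$pid, used here as a stand-in for the process start time
  local pst
  pst=$(stat -c%X "/proc/$pid" 2>/dev/null || true)
  # nothing to wait for if the process is already gone
  [ -z "$pst" ] && return
  # keep polling while the same /proc entry (same timestamp) still exists
  while [ "$(stat -c%X "/proc/$pid" 2>/dev/null || true)" = "$pst" ]; do sleep 1; done
}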

_wait 12345

Upvotes: 1

ephemient

Reputation: 205014

You may find GNU Parallel useful.

parallel -j+0 ./myProgram ::: $MYPATH/$SUBDIRS/*

This will run as many ./myProgram jobs in parallel as there are CPU cores.
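
Where GNU Parallel is not installed, a similar cap on concurrency can be sketched with xargs -P instead. This swaps in a different tool, not the command above: -P and -0 are extensions found in GNU and BSD xargs, not POSIX. The directory layout, MAX_JOBS, OUT, and the find | wc -l pipeline standing in for ./myProgram are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical layout: four input directories with one file each.
WORK=$(mktemp -d)
for n in 1 2 3 4; do
  mkdir -p "$WORK/in/$n"
  touch "$WORK/in/$n/file.txt"
done
mkdir -p "$WORK/out"

# Assumed limit on simultaneous jobs; exported OUT is read by the sh -c wrapper.
MAX_JOBS=2
export OUT="$WORK/out"

# Feed NUL-separated directory names to xargs, which runs at most MAX_JOBS
# wrappers at a time and returns only after all of them have finished.
printf '%s\0' "$WORK"/in/*/ |
  xargs -0 -n 1 -P "$MAX_JOBS" sh -c 'find "$1" | wc -l > "$OUT/$(basename "$1").count"' sh

ls "$OUT"
```

Each job writes to its own file here, so concurrent jobs never share an output file.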

Upvotes: 2

Marc B

Reputation: 360872

() invokes a subshell, which then invokes find/myprogram, so you're dealing with "grandchildren" processes. You can't wait on grandchildren, only direct descendants (aka children).
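
A minimal sketch of the fix this implies, assuming bash: drop the parentheses so each pipeline is backgrounded by the script's own shell, then call wait with no arguments, which blocks until all of that shell's background children have exited. The temporary directory layout and the my_program function are hypothetical stand-ins for $MYPATH and ./myProgram.

```shell
#!/usr/bin/env bash
# Hypothetical layout: two outer directories, each with three inner directories.
MYPATH=$(mktemp -d)
for d in a b; do
  for s in 1 2 3; do
    mkdir -p "$MYPATH/$d/$s"
    touch "$MYPATH/$d/$s/file.txt"
  done
done

# Stand-in for ./myProgram: reads paths on stdin and appends
# their count as one line to the file named in $1.
my_program() {
  wc -l >> "$1"
}

for SUBDIR in "$MYPATH"/*/; do
  for SUBSUBDIR in "$SUBDIR"*/; do
    # No parentheses: the backgrounded pipeline is a direct child of this shell.
    find "$SUBSUBDIR" | my_program "${SUBDIR}outputfile.dat" &
  done
  # With no arguments, wait blocks until every background child has exited,
  # so the next outer iteration starts only after this batch is done.
  wait
  echo "finished $(basename "$SUBDIR")"
done
```

This runs one outer directory's jobs at a time, so the number of simultaneous processes is bounded by the number of inner directories, not by the total across the whole tree.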

Upvotes: 4
