Peter F

Reputation: 61

Using GNU Parallel in a bash script to process grouped input files

I have a bash script that processes each file in a directory:

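for (( index=0; index<COUNT; index++ ))
do
    srcFile=${INCOMING_FILES[$index]}
    # Started in the background, but waited for immediately, so files are still handled one at a time
    "${SCRIPT_PATH}/control.pl" "${srcFile}" >> "${SCRIPT_PATH}/${LOG_FILE}" &
    wait ${!}
    removeIncomingFile "${srcFile}"
done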

This works fine for a few files, but it is too slow when the number of files is large. I want to run this script in parallel, with the input files processed in groups.

Example files:

server_1_1 | server_2_1 | server_3_1
server_1_2 | server_2_2 | server_3_2
server_1_3 | server_2_3 | server_3_3

The script should process the files belonging to each server in parallel:
First instance - server_1*
Second instance - server_2*
Third instance - server_3*

Is this possible with GNU Parallel, and if so, how can it be done? Many thanks for any solution!

Upvotes: 0

Views: 56

Answers (2)

Ole Tange

Reputation: 33685

The grouping part confuses me.

I have the feeling you want them grouped because you do not want to overload the server.

Normally you would simply do:

parallel "control.pl {}; removeIncomingFile {}" ::: incoming/files* > my.log

This will run one job per CPU thread.
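One detail worth checking: removeIncomingFile in the question looks like a bash function rather than a program (an assumption, since its definition is not shown). GNU Parallel runs each job in a new shell, so a bash function has to be exported with export -f before the jobs can call it. A minimal sketch, where process_file is a hypothetical stand-in for control.pl and the files are created in a temporary directory only for demonstration:

```shell
#!/bin/bash
# Hypothetical stand-ins for the asker's control.pl and removeIncomingFile.
process_file() { echo "processed $(basename "$1")"; }
removeIncomingFile() { rm -f -- "$1"; }

# Jobs started by GNU Parallel only see functions that have been exported.
export -f process_file removeIncomingFile

# Demo data in a temporary directory (assumed naming: server_<id>_<n>).
incoming=$(mktemp -d)
touch "$incoming"/server_{1..3}_{1..3}
log="$incoming/my.log"

# Only run the jobs if the installed "parallel" really is GNU Parallel.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Remove a file only if its processing succeeded.
    parallel 'process_file {} && removeIncomingFile {}' ::: "$incoming"/server_* > "$log"
else
    echo "GNU Parallel not found" >&2
fi
```

Using && instead of ; is a design choice in this sketch: a file whose processing failed stays in place and can be retried.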

Consider spending 20 minutes reading chapters 1 and 2 of "GNU Parallel 2018" (available in print and online). I think it will help you understand the basic uses of GNU Parallel.

Upvotes: 1

Mark Setchell

Reputation: 207445

I can't make head nor tail of what your question is trying to say, but I suspect the following will make a reasonable starting point. Put your actual code inside the '...' in place of the dummy actions I have used:

#!/bin/bash

# Do stuff for server 1
parallel -k 'echo server_1_{} ; date >> log_1_{}' ::: {1..3}

# Do stuff for server 2
parallel -k 'echo server_2_{} ; date >> log_2_{}' ::: {1..3}

# Do stuff for server 3
parallel -k 'echo server_3_{} ; date >> log_3_{}' ::: {1..3}

Sample Output

server_1_1
server_1_2
server_1_3
server_2_1
server_2_2
server_2_3
server_3_1
server_3_2
server_3_3

Log files created

-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_1_1
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_1_2
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_1_3
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_2_1
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_2_2
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_2_3
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_3_1
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_3_2
-rw-r--r--  1 mark  staff     29 30 Oct 21:04 log_3_3
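If the intent is instead one worker per server, with the servers running concurrently and each server's files handled one after another (an assumption, since the question is ambiguous), that can be sketched with plain bash background jobs rather than GNU Parallel. Here process_file is a hypothetical stand-in for control.pl, rm stands in for removeIncomingFile, and the per-server log names are made up for the demo:

```shell
#!/bin/bash
# Hypothetical stand-in for the asker's control.pl.
process_file() {
    echo "processed $(basename "$1")"
}

# Process all files of one server sequentially, in sorted (glob) order.
process_server() {
    local dir=$1 server=$2 f
    for f in "$dir"/server_"${server}"_*; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        process_file "$f" >> "$dir/log_${server}"
        rm -f -- "$f"                    # stands in for removeIncomingFile
    done
}

# Demo data in a temporary directory (assumed naming: server_<id>_<n>).
incoming=$(mktemp -d)
touch "$incoming"/server_{1..3}_{1..3}

# One background worker per server; the three servers run concurrently.
for server in 1 2 3; do
    process_server "$incoming" "$server" &
done
wait    # block until every worker has finished

cat "$incoming"/log_*
```

Writing one log per server keeps the output of concurrent workers from interleaving in a single file.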

Upvotes: 1
