CuriousDude

Reputation: 1117

Executing bash script with parallel for many directories

I have a bash script (chunks.sh) that executes several mini scripts in parallel, and I am wondering how to properly run chunks.sh so that it processes many folders in parallel. I have about 1000 folders with files that need to be processed. Here is my script:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=16:00:00
#SBATCH --output=mpi_output_%j.txt
#SBATCH --mail-type=FAIL

cd $SLURM_SUBMIT_DIR

module load gcc
module load gnu-parallel
module load bwa
module load samtools

parallel -j 10 < ../1convertfiles.sh
parallel -j 10 < ../2sortfiles.sh
parallel -j 10 < ../3indexfiles.sh
parallel -j 10 < ../4converttopile.sh
parallel -j 10 < ../5createconsensus.sh
parallel -j 10 < ../6concatenateconsensus.sh

Each folder has a name such as THAKID0001_dir, THAKID0010_dir, etc. How can I make this script loop through the current directory, find all the directories ending in _dir, and then execute all of the mini scripts inside each one?

I tried putting my parallel commands into for loops, but that reran the mini scripts many times over. I think I can use:

parallel -j 10 < 1convertfiles.sh ::: *_dir/*  
parallel -j 10 < 2sortfiles.sh ::: *_dir/*
etc.

But with this logic it seems that the parallel command blocks will not all be running on the SAME directory at once. Each parallel line will find its own directory to work in, and these mini scripts have to run in order, which is why I tried writing a for loop, but that created a huge mess.
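For reference, the loop I tried looked roughly like this (a reconstruction, not my exact code). Since each numbered file already lists the commands for every folder, every pass through the loop launched all of them again:

# Reconstructed attempt: the command lists are not scoped to $d,
# so each iteration re-runs the commands for ALL folders.
for d in *_dir; do
    parallel -j 10 < ../1convertfiles.sh
    parallel -j 10 < ../2sortfiles.sh
    # ...same for the remaining four steps
done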

Expected Results:

 $ ./chunks.sh
 ### Should run the list of commands per folder ###
 ### For example, it will execute all the parallel commands in THAKID0001_dir, then all the parallel commands in THAKID0002_dir, etc. ###

TL;DR: How do I make chunks.sh execute these parallel command blocks for all directories matching a certain pattern (i.e. THAK*_dir), where each line runs only once the previous line has completed? Hope this made sense. Thank you!

Upvotes: 2

Views: 1338

Answers (1)

dash-o

Reputation: 14452

On the surface, the problem requires a helper script that performs the sequential processing:

process-dir.sh, placed in $SLURM_SUBMIT_DIR:

#! /bin/bash
# Process all jobs for one folder, sequentially.
# Input: folder name, e.g. THAKID0001_dir
cd "$1" || exit 1
../1convertfiles.sh
../2sortfiles.sh
../3indexfiles.sh
../4converttopile.sh
../5createconsensus.sh
../6concatenateconsensus.sh
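Note: chunks.sh in the question feeds each numbered file to parallel on stdin (parallel -j 10 < ../1convertfiles.sh), which suggests those files are lists of commands rather than standalone scripts. If that is the case, a variant of the helper would keep the inner parallel calls (the -j 4 job counts here are illustrative):

#! /bin/bash
# Variant helper, assuming each numbered file is a list of commands
# (one per line) meant to be consumed by parallel, as the stdin
# redirection in the question implies.
cd "$1" || exit 1
parallel -j 4 < ../1convertfiles.sh
parallel -j 4 < ../2sortfiles.sh
parallel -j 4 < ../3indexfiles.sh
parallel -j 4 < ../4converttopile.sh
parallel -j 4 < ../5createconsensus.sh
parallel -j 4 < ../6concatenateconsensus.sh

Each parallel call still finishes completely before the next one starts, so the six steps stay in order within a folder.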

And then run it in parallel:

#! /bin/bash
cd "$SLURM_SUBMIT_DIR"

module load gcc
module load gnu-parallel
module load bwa
module load samtools

# Run one process-dir.sh per *_dir folder, 10 folders at a time.
parallel -j10 ./process-dir.sh ::: *_dir
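Make the helper executable first (chmod +x process-dir.sh). With a batch this long, a job log also makes it easy to see which folders finished and to resume after hitting the time limit; --joblog and --resume are standard GNU parallel options (the log file name here is just an example):

# Optional: record one line per completed folder and resume a partial run.
parallel -j10 --joblog chunks.joblog --resume ./process-dir.sh ::: *_dir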

Or avoid the separate process-dir.sh file by defining a bash function directly:

#! /bin/bash
cd "$SLURM_SUBMIT_DIR"

module load gcc
module load gnu-parallel
module load bwa
module load samtools

process-dir() {
  # Process all jobs for one folder, sequentially.
  # Input: folder name, e.g. THAKID0001_dir
  cd "$1" || return 1
  ../1convertfiles.sh
  ../2sortfiles.sh
  ../3indexfiles.sh
  ../4converttopile.sh
  ../5createconsensus.sh
  ../6concatenateconsensus.sh
}
# Export the function so the bash processes started by parallel can call it.
export -f process-dir

parallel -j10 process-dir ::: *_dir
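To sanity-check the wiring before committing to a 16-hour run, GNU parallel's --dryrun prints each command it would execute without running anything:

# Prints one process-dir invocation per *_dir folder; nothing is executed.
parallel -j10 --dryrun process-dir ::: *_dir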

Upvotes: 1
