Reputation: 1
I am new to GNU Parallel and am trying to run a few simulations. I have a bash script which I submit to a cluster via SLURM; the script is given below. Essentially, parallel calls a function run_simulation, which in turn calls other bash scripts. These scripts generate output in the current working directory, which is different for each job.
#!/bin/bash
# Job name:
#SBATCH --job-name=Run_MD_Sim
#
# Account:
#SBATCH --account=fc_mllam
#
# Partition:
#SBATCH --partition=savio3
#
# Request one node:
#SBATCH --nodes=1
#
# Specify number of tasks for use case (example):
#
#
# Processors per task:
#SBATCH --cpus-per-task=2
#
# Wall clock limit:
#SBATCH --time=5:00:30
#
## Command(s) to run (example):
module load intel
module load openmpi
module load gcc
module load cmake
module load gnu-parallel/2019.03.22
energy_list=("90")
fluence_list=("1000")
len_energy=${#energy_list[@]}
len_fluence=${#fluence_list[@]}
# Change this line if number of nodes requested is changed
val="ALE_Cycle_Run_2.sh"
# Function to run MD simulation for a single combination of energy and fluence
run_simulation() {
enval="$1"
flval="$2"
counter="$3"
val="$4"
# Create a directory to carry out computations. If node=1, then we are in fcmd_bondorder
mkdir "Temp_Directory_$counter"
# Check if using more than one node. If more than one node is used, the working directory will be the home directory and the lines below will need to change
cp ../temp_000588-322.cfg "Temp_Directory_$counter/temp_000000-000.cfg"
# Copy simulation files into this folder
cp *.o "Temp_Directory_$counter/"
cp *.cpp "Temp_Directory_$counter/"
cp *.h "Temp_Directory_$counter/"
cp Makefile "Temp_Directory_$counter/"
cp "$val" "Temp_Directory_$counter/"
cp Bond_Param_Gen.sh "Temp_Directory_$counter/Bond_Param_Gen.sh"
# Change into the temporary directory; bail out if that fails so the commands below don't run in the wrong directory
cd "Temp_Directory_$counter" || exit 1
# Run the main MD simulation. The output will be stored in the current directory
bash "$val" "$enval" "$flval"
# Make a directory to store the bond-order files
mkdir Data/
# Discard the text output of the main simulation run
mv *.txt Data/
rm Data/*.txt
bash Bond_Param_Gen.sh
mv *.txt Data/
mv *.cfg Data/
# Home directory or scratch directory
directory="/global/home/users/shoubhaniknath"
new_filename="Data ${flval} impacts energy ${enval} number ${counter}"
# Rename and move the data folder
mv "Data" "$directory/$new_filename"
}
# Export the function so that GNU Parallel can access it
export -f run_simulation
# Set number of concurrent jobs from the cores available on the node and the CPUs requested per task
export JOBS_PER_NODE=$(( $SLURM_CPUS_ON_NODE / $SLURM_CPUS_PER_TASK ))
# Run simulations in parallel
for enval in "${energy_list[@]}"; do
for flval in "${fluence_list[@]}"; do
# Use GNU Parallel to parallelize the loop over 'counter'
# Use below line for multiple nodes
# parallel --dry-run --jobs $JOBS_PER_NODE --slf hostfile run_simulation "$enval" "$flval" {} "$val" ::: {1..3}
# For single node, use below line
echo $JOBS_PER_NODE
parallel --jobs $JOBS_PER_NODE --joblog task.log --resume --bar run_simulation "$enval" "$flval" {} "$val" ::: {1..3}
done
done
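For reference, JOBS_PER_NODE is just integer division of the node's core count by --cpus-per-task. A standalone sketch with made-up values (SLURM_CPUS_ON_NODE and SLURM_CPUS_PER_TASK are only set by SLURM inside a running job, so they are hard-coded here):

```shell
# Stand-in values: SLURM would export these inside the job
SLURM_CPUS_ON_NODE=32
SLURM_CPUS_PER_TASK=2
# One parallel job per task's worth of CPUs
JOBS_PER_NODE=$(( SLURM_CPUS_ON_NODE / SLURM_CPUS_PER_TASK ))
echo "$JOBS_PER_NODE"   # prints 16
```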
My issue is that the progress bar from parallel --bar is not printed, and I have no idea why. Simple parallel commands executed in the current working directory do show the progress bar. What am I doing wrong here?
Upvotes: 0
Views: 83
Reputation: 1
Figured it out later on. The progress bar is only displayed on the compute node, so to see it one should run the script interactively with srun instead of submitting it with sbatch.
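For example, something like the following (partition, account, and resource flags taken from the script in the question) gives an interactive shell on a compute node, where --bar can draw on a real terminal:

```
srun --partition=savio3 --account=fc_mllam --nodes=1 --cpus-per-task=2 --pty bash
```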
Upvotes: 0
Reputation: 33740
Try something like this:
parallel --jobs $JOBS_PER_NODE --joblog task.log --resume --bar run_simulation {2} {3} {1} "$val" ::: {1..3} ::: "${energy_list[@]}" ::: "${fluence_list[@]}"
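With three ::: sources, parallel runs the command once for every combination of the inputs, and {1}, {2}, {3} refer to an item from the first, second, and third source respectively, so the two shell loops are no longer needed. A pure-bash sketch of the combinations this generates (using the same lists as the question):

```shell
energy_list=("90")
fluence_list=("1000")
# Same Cartesian product parallel builds from
# ::: {1..3} ::: "${energy_list[@]}" ::: "${fluence_list[@]}"
out=$(
  for counter in 1 2 3; do                  # {1}
    for enval in "${energy_list[@]}"; do    # {2}
      for flval in "${fluence_list[@]}"; do # {3}
        echo "run_simulation $enval $flval $counter"
      done
    done
  done
)
echo "$out"
```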
Upvotes: 0