Ori Kovacsi-Katz

Reputation: 21

With the SLURM workload manager, how can I disable the prolog and epilog just for the specific job I am launching now?

I have a Slurm script (sbatch) / command line (srun) / MPI job (--mpi=mpix) that stopped working after some prolog and epilog hooks were activated.

I want to rule out the possibility that environment variables set by the hooks are causing the failure. To do that, I need to prevent any hooks from launching, and I need to do it as a non-root user.

How can this be done under Slurm?

Upvotes: 0

Views: 43

Answers (1)

j23

Reputation: 3530

Unfortunately, you cannot disable prolog or epilog scripts as a non-root user. You can ask your admin to create a separate partition for you (with a minimal number of nodes, for fairness) that has both the prolog and epilog scripts disabled.

As a non-root user, you can take a brute-force approach: check whether the prolog script is interfering with your execution by clearing the environment at the beginning of your sbatch script and starting a fresh environment inside your job script.

env -i /bin/bash --noprofile --norc

This might require you to set PATH and other environment variables yourself. Alternatively, you can keep selected environment variables when creating the clean environment, as below:

env -i PATH="$PATH" SLURM_JOBID="$SLURM_JOBID" SLURM_NTASKS="$SLURM_NTASKS" /usr/bin/env bash --noprofile --norc

The bash --noprofile --norc command starts a fresh shell without loading any profile or rc files.
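Note that lines following an env -i ... bash call in a batch script still run in the original shell, so the commands you want to test have to be handed to the clean shell explicitly. A minimal sketch of one way to do that, assuming bash is on PATH; run_clean is a hypothetical helper name, and the list of variables to keep is only an example:

```shell
#!/usr/bin/env bash
# run_clean is a hypothetical helper name, not a Slurm feature.
# It runs one command string in a fresh bash whose environment has been
# emptied, keeping only the variables listed on the env -i line.
run_clean() {
    env -i \
        PATH="$PATH" \
        HOME="${HOME:-}" \
        SLURM_JOBID="${SLURM_JOBID:-}" \
        SLURM_NTASKS="${SLURM_NTASKS:-}" \
        bash --noprofile --norc -c "$1"
}

# Example (commented out): list what the clean shell actually sees.
# run_clean 'env | sort'
```

If the job works when launched through such a wrapper but fails without it, an inherited environment variable is a likely suspect. Add further variables to the env -i line one at a time to narrow it down.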

You could also capture the Slurm environment variables and write them to a file, in case you need to read them back later in the script.

env | grep ^SLURM_ > slurm_env.sh

and later source it with source slurm_env.sh.

Also, run set -a before source slurm_env.sh and set +a after it, so that every variable defined in the file is automatically exported to the environment.
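The plain env | grep dump can break when a value contains spaces or shell metacharacters, because the file is later parsed as shell code. A slightly more defensive sketch, assuming bash (it relies on the ${!prefix@} expansion and printf %q); save_slurm_env and restore_slurm_env are hypothetical helper names:

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Slurm.

# Write every variable whose name starts with SLURM_ to the given file,
# one NAME=value line each, with the value shell-quoted via printf %q.
save_slurm_env() {
    local out=$1 name
    : > "$out"
    for name in "${!SLURM_@}"; do
        printf '%s=%q\n' "$name" "${!name}" >> "$out"
    done
}

# Source the file with allexport switched on, so that each assignment
# is exported to child processes, then switch allexport off again.
restore_slurm_env() {
    set -a
    . "$1"
    set +a
}
```

A typical use would be to call save_slurm_env slurm_env.sh at the top of the job script, before the environment is cleared, and restore_slurm_env slurm_env.sh inside the clean shell.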

If you have admin privileges, you can tackle it in two ways.

1. Create a partition for a user

2. Create a QOS for a user

Both techniques require editing the Slurm configuration file (slurm.conf) and reloading it afterwards.

1. Create a partition for a user

Add a custom partition for the user in the slurm.conf file:

PartitionName=user_partition Nodes=ALL Default=NO State=UP
Prolog=
Epilog=

Then, use this partition when submitting jobs by adding the --partition=user_partition flag.

2. Create a QOS for a user

Edit slurm.conf and add a custom QoS:

QoS=user_qos Priority=100000
Prolog= 
Epilog=

Then assign the QoS to the user as follows

sacctmgr add user <username> qos=user_qos

After editing slurm.conf, reload the Slurm configuration to apply changes.

scontrol reconfigure

Upvotes: 1
