aaaaa

Reputation: 183

Slurm: how to make an HPC node run multiple job scripts at the same time

Suppose I have an HPC cluster with one node (node_1), and I want to submit three job scripts and have them run on node_1 at the same time.

So far, when I submit a job to node_1, the whole node stays busy until that job ends.

How can I do this? Do I need to pass a specific argument in the job's bash script?

thanks


Update

Below is an example of a bash script I use to submit a job to the cluster:

#!/bin/bash
#SBATCH --job-name=my_job
#SBATCH --partition=test
#SBATCH --nodelist=node_1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=8000
#SBATCH --output=1.out
#SBATCH --error=1.err

python /my/HPC/folder/script.py

Update 2: output of scontrol show node for node_1

(base) [id@login_node ~]$ scontrol show node=node_1
NodeName=node_1 Arch=x86_64 CoresPerSocket=32 
   CPUAlloc=0 CPUTot=64 CPULoad=2.94
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=node_1 NodeHostName=node_1 Version=18.08
   OS=Linux 4.20.0-1.el7.elrepo.x86_64 #1 SMP Sun Dec 23 20:11:51 EST 2018 
   RealMemory=128757 AllocMem=0 FreeMem=111815 Sockets=1 Boards=1
   State=IDLE ThreadsPerCore=2 TmpDisk=945178 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=test 
   BootTime=2019-12-09T14:09:25 SlurmdStartTime=2020-02-18T03:45:14
   CfgTRES=cpu=64,mem=128757M,billing=64
   AllocTRES=
   CapWatts=n/a
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s

Upvotes: 0

Views: 624

Answers (1)

j23

Reputation: 3530

You need to change the consumable resource type from whole nodes to cores in Slurm. As long as whole nodes are the allocation unit, a one-CPU job still occupies all of node_1.

Add this to your slurm.conf file:

SelectType=select/cons_res
SelectTypeParameters=CR_Core

SelectType: Controls whether CPU resources are allocated to jobs and job steps in units of whole nodes or as consumable resources (sockets, cores or threads).

SelectTypeParameters: Defines the consumable resource type and controls other aspects of CPU resource allocation by the select plugin. Both definitions are taken from the slurm.conf documentation.
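To confirm which values are actually set, the file can be inspected directly. The sketch below uses a hypothetical helper, conf_value, which is not a Slurm command. It assumes uncommented Key=Value lines, and the location of slurm.conf depends on the installation. As far as I know, a changed SelectType only takes effect after slurmctld is restarted, not after scontrol reconfigure alone.

```shell
# Hypothetical helper (not part of Slurm): print the value of KEY
# from a slurm.conf-style file. Assumes uncommented "Key=Value" lines
# and prints the last matching entry. Key matching is case-insensitive.
conf_value() {
  key=$1
  file=$2
  grep -iE "^[[:space:]]*${key}=" "$file" | tail -n 1 | cut -d= -f2-
}
```

For example, conf_value SelectType /etc/slurm/slurm.conf (the path is an assumption) should print select/cons_res once the change above is in the file.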

The node definition should also allow for that:

NodeName=<somename> NodeAddr=<someaddress> CPUs=16 Sockets=2 CoresPerSocket=4 ThreadsPerCore=2 RealMemory=12005 State=UNKNOWN
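In this example the CPUs value matches Sockets x CoresPerSocket x ThreadsPerCore, and the same holds for the CPUTot reported for node_1 in the question. A small arithmetic check, with logical_cpus as a hypothetical helper name:

```shell
# Hypothetical helper: logical CPU count =
# sockets * cores per socket * threads per core.
logical_cpus() {
  echo $(( $1 * $2 * $3 ))
}

logical_cpus 2 4 2    # example node definition above: prints 16
logical_cpus 1 32 2   # node_1 in the question: prints 64
```

My understanding is that with CR_Core, jobs do not share the threads of one core, so a one-CPU job would occupy a full core. If that holds, node_1 could run up to 32 such jobs as far as cores are concerned.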

See also the related question on Server Fault.
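With core-level allocation in place, the three jobs from the question can be submitted one after another and should run side by side if enough cores are free. A minimal sketch, assuming the job script from the question is saved as job.sh (a hypothetical file name). The submit_copies function and the SBATCH override are illustrative, not part of Slurm. Options given on the sbatch command line take precedence over the #SBATCH lines in the script, so each copy gets its own name and log files.

```shell
# Illustrative helper: submit the same job script N times,
# each copy with its own job name and output/error files.
# Set SBATCH=echo for a dry run that only prints the commands.
submit_copies() {
  script=$1
  n=$2
  i=1
  while [ "$i" -le "$n" ]; do
    "${SBATCH:-sbatch}" --job-name="my_job_$i" \
      --output="$i.out" --error="$i.err" "$script"
    i=$((i + 1))
  done
}
```

Running submit_copies job.sh 3 and then squeue --nodelist=node_1 should list the three jobs. While they run, scontrol show node=node_1 should report a CPUAlloc value above 0 instead of the whole node being taken by one job.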

Upvotes: 1
