Reputation: 73
I have a small cluster with nodes A, B, C and D. Each node has 80GB RAM and 32 CPUs. I am using Slurm 17.11.7.
I performed the following benchmark tests:
I already tried SelectTypeParameters=CR_CPU_Memory and SelectTypeParameters=CR_Core with the same result.
Why is my array job 4 times slower? Thanks for your help!
The header of the array job I submit looks like this:
#!/bin/bash -l
#SBATCH --array=1-42
#SBATCH --job-name exp
#SBATCH --output logs/output_%A_%a.txt
#SBATCH --error logs/error_%A_%a.txt
#SBATCH --time=20:00
#SBATCH --mem=2048
#SBATCH --cpus-per-task=1
#SBATCH -w <NodeA>
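For reference, this is the kind of check I can put at the top of the script to see what each task actually gets (it only uses Slurm's standard environment variables and common Linux tools, nothing specific to my code):

echo "Array task:   ${SLURM_ARRAY_TASK_ID}"
echo "CPUs granted: ${SLURM_CPUS_PER_TASK:-1}"
echo "CPUs visible: $(nproc)"
taskset -cp $$   # which cores this task is actually bound to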
The slurm.conf file looks like this:
ControlMachine=<NodeA>
ControlAddr=<IPNodeA>
MpiDefault=none
ProctrackType=proctrack/cgroup
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=<test_user_123>
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/affinity
MaxJobCount=100000
MaxArraySize=15000
MinJobAge=300
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=Cluster
JobAcctGatherType=jobacct_gather/none
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdLogFile=/var/log/slurmd.log
# COMPUTE NODES
#NodeName=<NameA-D> State=UNKNOWN
NodeName=<NameA> NodeAddr=<IPNodeA> State=UNKNOWN CPUs=32 RealMemory=70363
NodeName=<NameB> NodeAddr=<IPNodeB> State=UNKNOWN CPUs=32 RealMemory=70363
NodeName=<NameC> NodeAddr=<IPNodeC> State=UNKNOWN CPUs=32 RealMemory=70363
NodeName=<NameD> NodeAddr=<IPNodeD> State=UNKNOWN CPUs=32 RealMemory=70363
PartitionName=debug Nodes=<NodeA-D> Default=YES MaxTime=INFINITE State=UP
Upvotes: 1
Views: 2220
Reputation: 59260
If the running time does not depend on the value of the parameter in the Java application, there are two possible explanations:
Either your cgroup configuration does not confine your jobs and your Java code is multithreaded. In that case, if you run only one job, or if you run directly on the node, your single task uses several CPUs in parallel; if you run a job array that saturates the node, each task can only use a single CPU.
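If it is the first case, confining tasks usually requires the cgroup task plugin in addition to proctrack/cgroup; a minimal sketch for Slurm 17.11 (adapt to your installation) would be:

# slurm.conf
TaskPlugin=task/affinity,task/cgroup
# cgroup.conf
CgroupAutomount=yes
ConstrainCores=yes      # bind each task to its allocated cores
ConstrainRAMSpace=yes   # enforce the requested memory

Alternatively, you can cap the Java threads themselves, for instance with -XX:ActiveProcessorCount=1 on a recent JVM.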
Or, your node is configured with hyperthreading. In that case, if you run only one job, or if you run directly on the node, your single task can use a full physical CPU; if you run a job array that saturates the node, each task must share a physical CPU with another one.
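A quick way to tell the two cases apart from inside one of the array tasks (a sketch relying only on standard Linux tools and Slurm's environment variables):

echo "allocated by Slurm: ${SLURM_CPUS_PER_TASK:-1}"
taskset -cp $$                                   # cores the task is bound to
lscpu | grep 'Thread(s) per core'                # 2 means hyperthreading is on
ps -o nlwp= -p "$(pgrep -f java | head -n 1)"    # threads in the Java process

If lscpu reports two threads per core, then CPUs=32 in your slurm.conf is counting hardware threads, and describing the nodes with Sockets, CoresPerSocket and ThreadsPerCore (matching the real topology) makes that explicit.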
Upvotes: 2