SLURM more advanced usage

Bad job practices

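Submitting jobs in a loop (takes a long time and puts load on the scheduler):

for i in {1..1000}
do
    sbatch job.sh $i
done

Looping inside the job script (the mpirun commands run one after another):

for i in {1..1000}
do
    mpirun my_program $i
done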

Array jobs

#!/bin/sh
#SBATCH -J array
#SBATCH -N 1
#SBATCH --array=1-10

echo "Hi, this is array job number"  $SLURM_ARRAY_TASK_ID
sleep $SLURM_ARRAY_TASK_ID

VSC-4 >  squeue  -u $USER
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
     406846_[7-10]  mem_0096    array       sh PD       0:00      1 (Resources)
          406846_4  mem_0096    array       sh  R    INVALID      1 n403-062
          406846_5  mem_0096    array       sh  R    INVALID      1 n403-072
          406846_6  mem_0096    array       sh  R    INVALID      1 n404-031
VSC-4 >  ls slurm-*
slurm-406846_10.out  slurm-406846_3.out  slurm-406846_6.out  slurm-406846_9.out
slurm-406846_1.out   slurm-406846_4.out  slurm-406846_7.out
slurm-406846_2.out   slurm-406846_5.out  slurm-406846_8.out
VSC-4 >  cat slurm-406846_8.out
Hi, this is array job number  8
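
A common use of the array index is to pick one input per array task. The following is a minimal sketch of that idea, not part of the original example: the file inputs.txt (one input name per line) and the helper nth_line are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: select the N-th line of a list file for array task N.
# nth_line and inputs.txt are hypothetical names, not part of SLURM.
nth_line() {
    # $1 = 1-based line number, $2 = list file
    sed -n "${1}p" "$2"
}

# Inside an array job SLURM sets SLURM_ARRAY_TASK_ID; fall back to 1
# so the script can also be tried outside the batch system.
task_id=${SLURM_ARRAY_TASK_ID:-1}

# Only act if the (hypothetical) list file is present.
if [ -f inputs.txt ]; then
    input=$(nth_line "$task_id" inputs.txt)
    echo "task $task_id processes $input"
fi
```

With --array=1-10 and a ten-line inputs.txt, each array task would then work on a different line of the file.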

Array jobs cont.

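Step size of 5, i.e. task IDs 1, 6, 11, 16:
#SBATCH --array=1-20:5

Same task IDs, but at most 2 array tasks running at the same time:
#SBATCH --array=1-20:5%2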

Single core jobs

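# start 48 independent single-core processes in the background,
# e.g. one per core on a 48-core node
for ((i=1; i<=48; i++))
do
   stress --cpu 1 --timeout $i  &
done
wait   # keep the job alive until all background processes have finished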
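
A plain wait returns once all background processes are done but does not tell which of them failed. The sketch below shows one possible way to collect exit statuses by waiting on each process ID. It is an illustration under assumptions: run_all is a hypothetical helper, and sleep 0 stands in for the real single-core program.

```shell
#!/bin/sh
# Sketch: start several background processes, then wait for each one
# individually and count how many exited with a non-zero status.
# "sleep 0" is a stand-in for the real single-core program.
run_all() {
    # $1 = number of processes to start
    n=$1
    pids=""
    i=1
    while [ "$i" -le "$n" ]; do
        sleep 0 &
        pids="$pids $!"      # remember the PID of the process just started
        i=$((i + 1))
    done

    failed=0
    for pid in $pids; do
        # wait <pid> returns the exit status of that process
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed"
}

run_all 4
```

The function prints the number of failed processes, so a job script could exit with an error when that number is not zero.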

Combination of array & single core job

...
#SBATCH --array=1-144:48

# each array task handles 48 consecutive indices, starting at its task ID
j=$SLURM_ARRAY_TASK_ID
((j+=47))

for ((i=$SLURM_ARRAY_TASK_ID; i<=$j; i++))
do
   stress --cpu 1 --timeout $i  &
done
wait
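
To make the index arithmetic explicit, the following sketch prints which indices each array task of --array=1-144:48 covers. chunk_range is a hypothetical helper introduced only for this illustration.

```shell
#!/bin/sh
# Sketch: print the index range handled by each array task when the
# array is declared as --array=1-144:48 (task IDs 1, 49, 97).
# chunk_range is a hypothetical helper, not a SLURM command.
chunk_range() {
    # $1 = array task ID (first index of the chunk), $2 = chunk size
    first=$1
    last=$((first + $2 - 1))
    echo "$first-$last"
}

for id in 1 49 97; do
    echo "array task $id runs indices $(chunk_range "$id" 48)"
done
# prints:
# array task 1 runs indices 1-48
# array task 49 runs indices 49-96
# array task 97 runs indices 97-144
```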

Exercises

Job/process setup

#SBATCH option       job environment variable
-N                   SLURM_JOB_NUM_NODES
--ntasks-per-core    SLURM_NTASKS_PER_CORE
--ntasks-per-node    SLURM_NTASKS_PER_NODE
--ntasks, -n         SLURM_NTASKS

Email notification when the job begins and ends:

#SBATCH --mail-user=yourmail@example.com
#SBATCH --mail-type=BEGIN,END
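
The environment variables that correspond to the #SBATCH options above can be read inside the job script. The sketch below prints them. show_geometry is a hypothetical helper, and the fallback text "unset" is an assumption of this example, used when a variable is not defined (for instance outside a job, or when the matching option was not given).

```shell
#!/bin/sh
# Sketch: report the job geometry from the SLURM environment variables.
# show_geometry is a hypothetical helper; "unset" is printed for any
# variable that SLURM did not define.
show_geometry() {
    echo "nodes=${SLURM_JOB_NUM_NODES:-unset}"
    echo "tasks=${SLURM_NTASKS:-unset}"
    echo "tasks_per_node=${SLURM_NTASKS_PER_NODE:-unset}"
    echo "tasks_per_core=${SLURM_NTASKS_PER_CORE:-unset}"
}

show_geometry
```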

Submit scripts tuning

#SBATCH -t, --time=<time>

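time format: minutes, minutes:seconds, hours:minutes:seconds, days-hours, days-hours:minutes or days-hours:minutes:seconds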
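
A few example values for the time limit, given as a sketch that assumes the formats documented for sbatch. The values themselves are arbitrary, and only one of these lines would be used in a real submit script.

```shell
# Alternatives, use only one --time line per script:
#SBATCH --time=30            # 30 minutes
#SBATCH --time=01:30:00      # 1 hour 30 minutes
#SBATCH --time=2-12          # 2 days and 12 hours
```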

Licenses

VSC-3 >  slic

Within the SLURM submit script, add the flags as shown by ‘slic’, e.g. when both Matlab and Mathematica are required:

#SBATCH -L matlab@vsc,mathematica@vsc

Intel licenses are needed only for compilation, not for running the resulting executables.

Reservation of compute nodes

VSC-4 >  scontrol show reservations
#SBATCH --reservation=

Exercises

echo "2+2" | matlab

MPI + pinning

Example: Two nodes with two MPI processes each:

srun

#SBATCH -N 2
#SBATCH --ntasks-per-node=2

srun --cpu_bind=map_cpu:0,24 ./my_mpi_program

mpirun

#SBATCH -N 2
#SBATCH --ntasks-per-node=2

export I_MPI_PIN_PROCESSOR_LIST=0,24   # Intel MPI syntax 
mpirun ./my_mpi_program

Job dependencies

  1. Submit the first job and note its <job_id>
  2. Submit the dependent job (and note its <job_id>):
#!/bin/bash
#SBATCH -J jobname
#SBATCH -N 2
#SBATCH -d afterany:<job_id>
srun  ./my_program
  3. Continue at 2. for further dependent jobs
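
The steps above can be scripted. The following sketch assumes that sbatch supports --parsable (which prints only the job ID, optionally followed by ";cluster"). submit_chain, the SBATCH variable and the script names in the usage note are hypothetical and only serve the illustration.

```shell
#!/bin/sh
# Sketch: submit a chain of jobs in which each job starts after the
# previous one has ended (afterany). submit_chain is a hypothetical
# helper; SBATCH may be overridden, e.g. with a stub for a dry run.
SBATCH=${SBATCH:-sbatch}

submit_chain() {
    # $1 = first job script, remaining arguments = dependent job scripts
    job_id=$($SBATCH --parsable "$1") || return 1
    job_id=${job_id%%;*}          # strip ";cluster" if present
    echo "$job_id"
    shift
    for script in "$@"; do
        # each further job depends on the job submitted just before it
        job_id=$($SBATCH --parsable -d "afterany:$job_id" "$script") || return 1
        job_id=${job_id%%;*}
        echo "$job_id"
    done
}
```

A call such as submit_chain first.sh next.sh next.sh would submit three jobs and print their job IDs, one per line.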
