March 07, 2021
Rank | Nation | Machine | Performance | Accelerators |
---|---|---|---|---|
1. | ![]() | Fugaku | 442 PFLOPs/s | |
2. | ![]() | Summit | 149 PFLOPs/s | NVIDIA V100 |
3. | ![]() | Sierra | 95 PFLOPs/s | NVIDIA V100 |
4. | ![]() | Sunway TaihuLight | 93 PFLOPs/s | |
5. | ![]() | Selene | 64 PFLOPs/s | NVIDIA A100 |
6. | ![]() | Tianhe-2A | 62 PFLOPs/s | |
7. | ![]() | Juwels Booster | 44 PFLOPs/s | NVIDIA A100 |
8. | ![]() | HPC5 | 36 PFLOPs/s | NVIDIA V100 |
9. | ![]() | Frontera | 24 PFLOPs/s | NVIDIA RTX5000/V100 |
10. | ![]() | Dammam-7 | 21 PFLOPs/s | NVIDIA V100 |

Model | #cores | Clock Freq (GHz) | Memory (GB) | Bandwidth (GB/s) | TDP (Watt) | FP32/FP64 (GFLOPs/s) |
---|---|---|---|---|---|---|
36+50x GeForce GTX-1080, nodes n37[1,2,3]-[001-004,001-022,001-028] | 2560 | 1.61 | 8 | 320 | 180 | 8228/257 |
4x Tesla k20m, nodes n372-02[4,5] | 2496 | 0.71 | 5 | 208 | 195 | 3520/1175 |
1x Tesla V100, node n372-023 | 5120/644 | 1.31 | 32 | 900 | 250 | 14000/7000 |
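
As a rough cross-check of the FP32 column, peak single-precision throughput can be estimated as 2 × cores × clock, counting one fused multiply-add per core per cycle as two floating-point operations. The following is a minimal sketch using `awk`; the function name `peak_fp32` is hypothetical, and for the V100 it assumes that the first of the two core counts listed is the one that applies.

```shell
#!/bin/bash
# peak_fp32 CORES CLOCK_GHZ -> estimated peak FP32 rate in GFLOPs/s
# Assumption: one fused multiply-add (2 floating-point operations) per core per cycle.
peak_fp32() {
    awk -v cores="$1" -v ghz="$2" 'BEGIN { printf "%.0f\n", 2 * cores * ghz }'
}

peak_fp32 2560 1.61   # GeForce GTX-1080
peak_fp32 2496 0.71   # Tesla k20m
peak_fp32 5120 1.31   # Tesla V100
```

The estimates come out within roughly 5% of the tabulated FP32 values. The remaining gap is presumably due to rounded clock frequencies or to base versus boost clocks.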
Interactive mode
1. VSC-3 > salloc -N 1 -p gpu_gtx1080single --qos gpu_gtx1080single
2. VSC-3 > squeue -u $USER
3. VSC-3 > srun -n 1 hostname (... while still on the login node!)
4. VSC-3 > ssh n372-012 (... or whichever node was assigned)
5. VSC-3 > module load cuda/9.1.85
cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMul
nvcc ./matrixMul.cu
./a.out
cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMulCUBLAS
nvcc matrixMulCUBLAS.cu -lcublas
./a.out
6. VSC-3 > nvidia-smi
7. VSC-3 > /opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
SLURM submission gpu_test.scrpt
#!/bin/bash
#
# usage: sbatch ./gpu_test.scrpt
#
#SBATCH -J gtx1080
#SBATCH -N 1
#SBATCH --partition gpu_gtx1080single
#SBATCH --qos gpu_gtx1080single
module purge
module load cuda/9.1.85
nvidia-smi
/opt/sw/x86_64/glibc-2.17/ivybridge-ep/cuda/9.1.85/NVIDIA_CUDA-9.1_Samples/1_Utilities/deviceQuery/deviceQuery
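
The interactive compile-and-run steps could also be folded into a batch job. The sketch below reuses only the module, paths, partition and QOS names that appear in this section; the file name `gpu_matmul.scrpt` and the job name `matmul` are hypothetical. It writes the job script to disk so that it can then be submitted with `sbatch`.

```shell
# Write the job script (hypothetical file name), then submit it with:
#   sbatch ./gpu_matmul.scrpt
cat > gpu_matmul.scrpt <<'EOF'
#!/bin/bash
#
# usage: sbatch ./gpu_matmul.scrpt
#
#SBATCH -J matmul
#SBATCH -N 1
#SBATCH --partition gpu_gtx1080single
#SBATCH --qos gpu_gtx1080single

module purge
module load cuda/9.1.85

# Build and run the matrixMul example on the allocated GPU node
cd ~/examples/09_special_hardware/gpu_gtx1080/matrixMul || exit 1
nvcc ./matrixMul.cu
./a.out
EOF
```

The quoted `'EOF'` delimiter keeps the shell from expanding anything while the file is written, so the script arrives on disk exactly as shown.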
Exercise/Example/Problem:
Using interactive mode or batch submission, find out whether ECC is enabled on the GPUs of type gtx1080.
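
One possible starting point, assuming `nvidia-smi` is on the PATH of the GPU node, is to query the ECC mode directly. The wrapper name `ecc_report` is hypothetical; the `--query-gpu` field `ecc.mode.current` and the `-q -d ECC` display filter are standard `nvidia-smi` options.

```shell
# ecc_report: print the current ECC mode of every GPU visible on this node.
# Hypothetical helper; it only wraps nvidia-smi and must run on a GPU node.
ecc_report() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: run this on a GPU node" >&2
        return 1
    fi
    # One CSV line per GPU: name and current ECC mode
    nvidia-smi --query-gpu=name,ecc.mode.current --format=csv
}
# usage (interactively on the node, or as a line in the batch script):
#   ecc_report
# A more verbose alternative is: nvidia-smi -q -d ECC
```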
Interactive mode
1. VSC-3 > salloc -N 1 -p binf --qos normal_binf -C binf -L intel@vsc
(... add --nodelist binf-13 to request a specific node)
2. VSC-3 > squeue -u $USER
3. VSC-3 > srun -n 4 hostname (... while still on the login node!)
4. VSC-3 > ssh binf-11 (... or whichever node was assigned)
5. VSC-3 > module purge
6. VSC-3 > module load intel/17
cd examples/09_special_hardware/binf
icc -xHost -qopenmp sample.c
export OMP_NUM_THREADS=8
./a.out
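
To see how the OpenMP example responds to the thread count, the last two commands can be wrapped in a loop. This is a minimal sketch: the helper name `scan_threads` is hypothetical, and it assumes the compiled binary honours `OMP_NUM_THREADS`.

```shell
# scan_threads EXE N1 [N2 ...]: run EXE once for each given thread count.
scan_threads() {
    local exe="$1"
    shift
    local t
    for t in "$@"; do
        echo "--- OMP_NUM_THREADS=$t ---"
        # Set the variable for this single invocation only
        OMP_NUM_THREADS="$t" "$exe"
    done
}
# usage: scan_threads ./a.out 1 2 4 8
```

Setting the variable as a prefix to the command leaves the value exported earlier in the session untouched.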
SLURM submission slrm.sbmt.scrpt
#!/bin/bash
#
# usage: sbatch ./slrm.sbmt.scrpt
#
#SBATCH -J gmxbinfs
#SBATCH -N 2
#SBATCH --partition binf
#SBATCH --qos normal_binf
#SBATCH -C binf
#SBATCH --ntasks-per-node 24
#SBATCH --ntasks-per-core 1
module purge
module load intel/17 intel-mkl/2017 intel-mpi/2017 gromacs/5.1.4_binf
export I_MPI_PIN=1
export I_MPI_PIN_PROCESSOR_LIST=0-23
export I_MPI_FABRICS=shm:tmi
export I_MPI_TMI_PROVIDER=psm2
export OMP_NUM_THREADS=1
export MDRUN_ARGS=" -dd 0 0 0 -rdd 0 -rcon 0 -dlb yes -dds 0.8 -tunepme -v -nsteps 10000 "
mpirun -np $SLURM_NTASKS gmx_mpi mdrun ${MDRUN_ARGS} -s hSERT_5HT_PROD.0.tpr -deffnm hSERT_5HT_PROD.0 -px hSERT_5HT_PROD.0_px.xvg -pf hSERT_5HT_PROD.0_pf.xvg -swap hSERT_5HT_PROD.0.xvg
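
The script passes `$SLURM_NTASKS` to `mpirun`. With `-N 2` and `--ntasks-per-node 24` this should come to 2 × 24 = 48 MPI ranks, 24 per node, which matches the 24 entries of `I_MPI_PIN_PROCESSOR_LIST=0-23`. A small sketch of a sanity check that could precede the `mpirun` line; the function name `check_ntasks` is hypothetical.

```shell
# check_ntasks NODES TASKS_PER_NODE ACTUAL
# Succeeds (and prints the expected count) only if ACTUAL = NODES * TASKS_PER_NODE.
check_ntasks() {
    local expected=$(( $1 * $2 ))
    if [ "$3" -ne "$expected" ]; then
        echo "expected $expected tasks, got $3" >&2
        return 1
    fi
    echo "$expected"
}
# usage inside the job script, before mpirun:
#   check_ntasks "$SLURM_NNODES" 24 "$SLURM_NTASKS" || exit 1
```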
Performance | Power Efficiency |
---|---|
![]() | ![]() |