OpenMPI Job Submission Example
This page provides an example of submitting a simple MPI job to the
cluster using the OpenMPI library. It is based on the
Basic_MPI_Job (a.k.a. Basic_OpenMPI, a.k.a. HelloUMD-MPI_gcc_openmpi)
job template in the OnDemand portal.
This job makes use of a simple
Hello World! program called hello-umd,
available in the UMD HPC cluster software library, which supports
sequential, multithreaded, and MPI modes of operation. The code simply
prints an identifying message from each thread of each task --- for this
pure MPI case each task consists of a single thread, so it will print one
message from each MPI task.
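If you want a quick interactive check of the MPI mode before submitting a
batch job, the same binary can be launched by hand under mpirun (for
example, from an interactive allocation on a compute node). This is just a
sketch --- the module versions match those used in the job script below,
and the task count of 4 is arbitrary:

module load gcc/8.4.0 openmpi/3.1.5 hello-umd/1.5   # same toolchain as the job script below
mpirun -n 4 hello-umd                               # each of the 4 MPI tasks prints its own message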
Overview
This example basically consists of a single file, the job script
submit.sh (see below for a listing and explanation of the script),
which gets submitted to the cluster via the sbatch command.
The script is designed to show many good practices, including:
- setting standard sbatch options within the script
- loading the needed modules within the script
- printing some useful diagnostic information at start of the script
- creating a job specific work directory
- running the code and saving the exit code
- exiting with the exit code from the main application
Many of the practices above are overkill for such a simple job ---
indeed, the vast majority of lines are there for these "good practices"
rather than for running the intended code --- but they are included for
educational purposes.
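For instance, the last two practices in the list boil down to the
following pattern (sketched here with a hypothetical placeholder command;
in the actual script the command is the mpirun invocation of hello-umd):

my_application           # hypothetical placeholder for the real command
ECODE=$?                 # save the application's exit code immediately
# ... copy results, create symlinks, other cleanup ...
exit $ECODE              # make the job's exit status reflect the application's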
This code runs hello-umd in MPI mode, saving the output
to a file in a temporary work directory and then linking that directory
back to the submission directory. We could have forgone all that and simply
let the output of hello-umd go to standard output, which would be available
in the slurm-JOBNUMBER.out file (or whatever file you instructed Slurm to
use instead). Doing so is acceptable as long as the code is not producing
an excessive amount (many MBs) of output --- if the code produces a lot of
output, having it all sent to the Slurm output file can cause problems,
and it is better to redirect it to a file.
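In other words, the choice inside the job script is between something like
the following two forms, where MYEXE holds the full path to hello-umd as in
the script below:

# Output goes to the slurm-JOBNUMBER.out file --- fine for modest amounts of output
mpirun ${MYEXE}

# Output (stdout and stderr) redirected to a file --- preferred when output is large
mpirun ${MYEXE} > hello.out 2>&1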
The submission script
The submission script submit.sh can be downloaded as plain text. A copy
with line numbers is presented below for reference in the discussion of
the script:
Source of submit.sh
hello-umd MPI (OpenMPI) job submission script:

  1  #!/bin/bash
  2  # The line above this is the "shebang" line. It must be first line in script
  3  #-----------------------------------------------------
  4  # OnDemand Job Template for Hello-UMD, MPI version
  5  # Runs a simple MPI enabled hello-world code
  6  #-----------------------------------------------------
  7  #
  8  # Slurm sbatch parameters section:
  9  # Request 60 MPI tasks with 1 CPU core each
 10  #SBATCH -n 60
 11  #SBATCH -c 1
 12  # Request 5 minutes of walltime
 13  #SBATCH -t 5
 14  # Request 1 GB of memory per CPU core
 15  #SBATCH --mem-per-cpu=1024
 16  # Do not allow other jobs to run on same node
 17  #SBATCH --exclusive
 18  # Run on debug partition for rapid turnaround. You will need
 19  # to change this (remove the line) if walltime > 15 minutes
 20  #SBATCH --partition=debug
 21  # Do not inherit the environment of the process running the
 22  # sbatch command. This requires you to explicitly set up the
 23  # environment for the job in this script, improving reproducibility
 24  #SBATCH --export=NONE
 25  #
 26
 27  # This job will run the MPI enabled version of hello-umd
 28  # We create a directory on a parallel filesystem from where we actually
 29  # will run the job.
 30
 31  # Section to ensure we have the "module" command defined
 32  unalias tap >& /dev/null
 33  if [ -f ~/.bash_profile ]; then
 34      source ~/.bash_profile
 35  elif [ -f ~/.profile ]; then
 36      source ~/.profile
 37  fi
 38
 39  # Set SLURM_EXPORT_ENV to ALL. This prevents the --export=NONE flag
 40  # from being passed to mpirun/srun/etc, which can cause issues.
 41  # We want the environment of the job script to be passed to all
 42  # tasks/processes of the job
 43  export SLURM_EXPORT_ENV=ALL
 44
 45  # Module load section
 46  # First clear our module list
 47  module purge
 48  # and reload the standard modules
 49  module load hpcc/deepthought2
 50  # Load the desired compiler, MPI, and package modules
 51  # NOTE: You need to use the same compiler and MPI module used
 52  # when compiling the MPI-enabled code you wish to run (in this
 53  # case hello-umd). The values listed below are correct for the
 54  # version of hello-umd we will be using, but you may need to
 55  # change them if you wish to run a different package.
 56  module load gcc/8.4.0
 57  module load openmpi/3.1.5
 58  module load hello-umd/1.5
 59
 60  # Section to make a scratch directory for this job
 61  # Because different MPI tasks, which might be on different nodes, will need
 62  # access to it, we put it in a parallel file system.
 63  # We include the SLURM jobid in the directory name to avoid interference
 64  # if multiple jobs are running at the same time.
 65  TMPWORKDIR="/lustre/$USER/ood-job.${SLURM_JOBID}"
 66  mkdir $TMPWORKDIR
 67  cd $TMPWORKDIR
 68
 69  # Section to output information identifying the job, etc.
 70  echo "Slurm job ${SLURM_JOBID} running on"
 71  hostname
 72  echo "To run on ${SLURM_NTASKS} CPU cores across ${SLURM_JOB_NUM_NODES} nodes"
 73  echo "All nodes: ${SLURM_JOB_NODELIST}"
 74  date
 75  pwd
 76  echo "Loaded modules are:"
 77  module list
 78  echo "Job will be started out of $TMPWORKDIR"
 79
 80
 81  # Setting this variable will suppress the warnings
 82  # about lack of CUDA support on non-GPU enabled nodes. We
 83  # are not using CUDA, so the warning is harmless.
 84  export OMPI_MCA_mpi_cuda_support=0
 85
 86  # Get the full path to our hello-umd executable. It is best
 87  # to provide the full path of our executable to mpirun, etc.
 88  MYEXE=`which hello-umd`
 89  echo "Using executable $MYEXE"
 90
 91  # Run our code using mpirun
 92  # We do not specify the number of tasks here, and instead rely on
 93  # it defaulting to the number of tasks requested of Slurm
 94  mpirun ${MYEXE} > hello.out 2>&1
 95  # Save the exit code from the previous command
 96  ECODE=$?
 97
 98  # Output from the above command was placed in a work directory in a parallel
 99  # filesystem. That parallel filesystem does _not_ get cleaned up automatically.
100  # And it is not normally visible from the Job Composer.
101  # To deal with this, we make a symlink from the job submit directory to
102  # the work directory for the job.
103  #
104  # NOTE: The work directory will continue to exist until you delete it. It will
105  # not get deleted when you delete the job in Job Composer.
106
107  ln -s ${TMPWORKDIR} ${SLURM_SUBMIT_DIR}/work-dir
108
109  echo "Job finished with exit code $ECODE. Work dir is $TMPWORKDIR"
110  date
111
112  # Exit with the cached exit code
113  exit $ECODE
Discussion of submit.sh
Running the example
The easiest way to run this example is with the
Job Composer of the OnDemand portal, using
the HelloUMD-MPI_gcc_openmpi template.
To submit from the command line, just:
- Download the submit.sh script to the HPC login node.
- Run the command sbatch submit.sh. This will submit the job to the
scheduler and should return a message like Submitted batch job 23767 ---
the number will vary and is the job number for this job. The job number
can be used to reference the job in Slurm, etc., as shown in the example
after this list. (Please always give the job number(s) when requesting
help about a job you submitted.)
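For example, assuming the returned job number was 23767, a command-line
session might look like the following sketch (squeue shows the job while it
is pending or running; sacct shows its state and exit code afterwards):

sbatch submit.sh          # returns: Submitted batch job 23767
squeue -j 23767           # check the job while it is queued or running
sacct -j 23767            # check state and exit code after it finishes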
Whichever method you used for submission, the job will be queued for the
debug partition and should run within 15 minutes or so. When it finishes
running, the slurm-JOBNUMBER.out file should contain the output from our
diagnostic commands (time the job started and finished, module list, etc.).
The output of hello-umd will be in the file hello.out in the job-specific
work directory created in your lustre directory. For the convenience of
users of the OnDemand portal, a symlink to this directory is created in
the submission directory, so if you used OnDemand, a symlink to the work
directory will appear in the Folder contents section on the right.
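Once the job has completed, you can examine the results from the command
line and, since the work directory is not cleaned up automatically, delete
it yourself when you no longer need it. A sketch, assuming the job number
was 23767 and you are in the submission directory:

cat slurm-23767.out                 # diagnostic output from the job script
cat work-dir/hello.out              # hello-umd output, via the symlink to the work directory
ls /lustre/$USER/ood-job.23767      # or inspect the work directory directly
rm -rf /lustre/$USER/ood-job.23767  # remove the work directory when done (the work-dir symlink will then dangle)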