The ‘mixed mode’ (MPI+OpenMP) programming model is supported on ARC2, ARC3 and Polaris (the N8 shared cluster). This typically involves MPI processes running across nodes and OpenMP threads on each node, with the total number of threads (MPI processes * OpenMP threads per process) equalling the number of physical processor cores.
Your code will need to call MPI_Init and make use of OpenMP directives. Compile it using an MPI wrapper with OpenMP support enabled, for example:
mpif90 -openmp example.f90 -o mixed.exe
You will need to determine ppn, the number of MPI processes per node, and tpp, the number of OpenMP threads per MPI process.
Additionally, you can ask either for a given number of nodes, nodes, or for the total number of MPI processes, np. Note that ppn is related to np, since ppn = np/nodes.
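Your submission script would then need to contain either of the following, where $nodes, $np, $ppn and $tpp stand for the values you have chosen. When requesting a number of nodes:
#$ -V
#$ -l h_rt=01:00:00
#$ -l nodes=$nodes,ppn=$ppn,tpp=$tpp
mpirun ./a.out
Or, when requesting a total number of MPI processes:
#$ -V
#$ -l h_rt=01:00:00
#$ -l np=$np,ppn=$ppn,tpp=$tpp
mpirun ./a.out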
Given that there are 16 cores per node, you would typically ensure that ppn*tpp = 16, so that each core runs exactly one thread.
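The constraint above leaves only a handful of valid layouts. As a minimal sketch, the following shell function lists every (ppn, tpp) pair that fills a node exactly. The function name list_layouts is hypothetical and used only for illustration; the core count of 16 is taken from the text and should be changed for nodes with a different number of cores.

```shell
#!/bin/sh
# Hypothetical helper: list every (ppn, tpp) pair whose product equals
# the number of cores on a node, so that each core runs one thread.
list_layouts() {
    cores_per_node=$1
    ppn=1
    while [ "$ppn" -le "$cores_per_node" ]; do
        # Keep only values of ppn that divide the core count evenly.
        if [ $((cores_per_node % ppn)) -eq 0 ]; then
            echo "ppn=$ppn tpp=$((cores_per_node / ppn))"
        fi
        ppn=$((ppn + 1))
    done
}

# 16 cores per node, as stated in the text.
list_layouts 16
```

Which of these layouts performs best depends on the application, so it is usually worth timing a short run with more than one of them.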
To run an MPI+OpenMP executable mixed.exe with 64 MPI processes each launching 4 OpenMP threads, the following submission script would be needed:
#$ -V
#$ -cwd
#$ -b y
#$ -l h_rt=01:00:00
#$ -l np=64,ppn=4,tpp=4
mpirun ./mixed.exe
This will allocate 16 nodes (64 MPI processes / 4 per node), giving 16*16 = 256 cores.
Each node will have 4 MPI processes, each of which will have 4 OpenMP threads, so there are 4*4 = 16 threads per node and 16*16 = 256 threads (64 MPI processes * 4 OpenMP threads) in total.
Alternatively, the same effect can be achieved by:
#$ -V
#$ -cwd
#$ -b y
#$ -l h_rt=01:00:00
#$ -l nodes=16,ppn=4,tpp=4
mpirun ./mixed.exe
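The arithmetic linking the np and nodes forms of the request can be checked before submitting. The sketch below is illustrative only: job_layout is a hypothetical helper, not part of the batch system, and the check that np is a multiple of ppn is an assumption made here so that every node carries the same number of processes.

```shell
#!/bin/sh
# Hypothetical helper: derive the node count and thread totals
# from np (total MPI processes), ppn and tpp.
job_layout() {
    np=$1; ppn=$2; tpp=$3
    # Assumption: np should divide evenly by ppn so all nodes are filled alike.
    if [ $((np % ppn)) -ne 0 ]; then
        echo "np must be a multiple of ppn" >&2
        return 1
    fi
    nodes=$((np / ppn))   # rearranged from ppn = np/nodes
    echo "nodes=$nodes threads_per_node=$((ppn * tpp)) total_threads=$((np * tpp))"
}

# The example from the text: 64 MPI processes, 4 per node, 4 threads each.
job_layout 64 4 4
```

For the example values this reports 16 nodes, 16 threads per node and 256 threads in total, matching the figures given above.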
Note that the OMP_NUM_THREADS environment variable is automatically set by the batch system and so you do not need to set this in your environment.
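If you want to confirm the value the batch system has chosen, a few lines placed before the mpirun command in the submission script will record it in the job output. This is a minimal sketch; the function name report_threads is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the OpenMP thread count exported to the job.
report_threads() {
    if [ -n "${OMP_NUM_THREADS:-}" ]; then
        echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
    else
        echo "OMP_NUM_THREADS is not set"
    fi
}

report_threads
```

Inside a job requested with tpp=4, the value reported would be expected to be 4.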