GROMACS 2019.3

Basic information

Tested on (Requirements)

  • OS base: CentOS (x86_64) ≥ 6.6 (Rocks 6.2)
  • Compiler: Intel MPI Library ≥ 17.0.1 (Apolo)
  • Math Library: FFTW 3.3.8 (built in) and OpenBLAS 0.2.19

Installation

The following procedure describes how to compile GROMACS 2019.3 for parallel computing using the GROMACS built-in thread-MPI and CUDA. [1]

Note

For this build, the Intel 2017 compiler was used because CUDA 9.0 supports Intel as its host (backend) compiler only up to the 2017 version.
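After loading the modules in step 3 below, you can quickly confirm which compiler and CUDA toolkit will be picked up. This is only a sanity-check sketch; the exact version strings depend on your environment:

    $ icc --version     # should report an Intel 17.0.x compiler
    $ nvcc --version    # should report the CUDA 9.0 toolkit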

  1. Download the latest version of GROMACS

    $ wget http://ftp.gromacs.org/pub/gromacs/gromacs-2019.3.tar.gz
    $ tar xf gromacs-2019.3.tar.gz
    
  2. Inside the top level of the extracted folder, create a build directory where cmake will generate the build files.

    $ cd gromacs-2019.3
    $ mkdir build
    $ cd build
    
  3. Load the modules needed for the build.

    $ module load cmake/3.7.1 \
                  cuda/9.0 \
                  openblas/0.2.19_gcc-5.4.0 \
                  intel/2017_update-1 \
                  python/2.7.15_miniconda-4.5.4
    
  4. Execute the cmake command with the desired options.

    $ cmake .. -DGMX_GPU=on -DCUDA_TOOLKIT_ROOT_DIR=/share/apps/cuda/9.0/ -DGMX_CUDA_TARGET_SM="30;37;70" \
                -DGMX_SIMD=AVX2_256 -DCMAKE_INSTALL_PREFIX=/share/apps/gromacs/2019.3_intel-17_cuda-9.0 \
                -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_EXTERNAL_BLAS=on -DREGRESSIONTEST_DOWNLOAD=on
    

    Note

    The above command enables GPU usage with CUDA for the specified architectures: sm_30 and sm_37 for the Tesla K80 and sm_70 for the Tesla V100, since these are the GPUs used in Apolo. [2]

    Note

    For GMX_FFT_LIBRARY there are other options, such as Intel MKL. In general, FFTW is recommended because there is no advantage in using MKL with GROMACS, and FFTW is often faster. [1]
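    If you nevertheless want to try MKL with the Intel compiler, only the FFT-related flags change. The following is a minimal sketch, not the configuration used for this installation, and it assumes MKL is provided by the loaded Intel module:

    $ cmake .. -DGMX_GPU=on -DCUDA_TOOLKIT_ROOT_DIR=/share/apps/cuda/9.0/ -DGMX_CUDA_TARGET_SM="30;37;70" \
               -DGMX_SIMD=AVX2_256 -DCMAKE_INSTALL_PREFIX=/share/apps/gromacs/2019.3_intel-17_cuda-9.0 \
               -DGMX_FFT_LIBRARY=mkl -DGMX_EXTERNAL_BLAS=on -DREGRESSIONTEST_DOWNLOAD=on

    Note that -DGMX_BUILD_OWN_FFTW=ON is dropped, since it only applies when FFTW is the selected FFT library.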

    To build the distributed (multi-node) GROMACS version, you have to use an external MPI library. The GROMACS team recommends OpenMPI version 1.6 (or higher) or MPICH version 1.4.1 (or higher).

    $ module load cmake/3.7.1 \
                  cuda/9.0 \
                  openblas/0.2.19_gcc-5.4.0 \
                  openmpi/1.10.7_gcc-5.4.0 \
                  python/2.7.15_miniconda-4.5.4
    
    $ cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DGMX_MPI=on -DGMX_GPU=on \
               -DCUDA_TOOLKIT_ROOT_DIR=/share/apps/cuda/9.0/ -DGMX_CUDA_TARGET_SM="30;37;70" \
               -DGMX_SIMD=AVX2_256 -DCMAKE_INSTALL_PREFIX=/share/apps/gromacs/2019.3_intel-17_cuda-9.0 \
               -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_EXTERNAL_BLAS=on -DREGRESSIONTEST_DOWNLOAD=on
    

    For more information about the compile options, refer to the GROMACS documentation. [1]
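    In either case, once cmake finishes you can double-check the configuration it cached before compiling. An optional sanity check from the build directory (the variable names below are the cache entries created by the options used above):

    $ grep -E "GMX_GPU|GMX_SIMD|GMX_FFT_LIBRARY|GMX_MPI" CMakeCache.txt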

  5. Execute the make command sequence.

    $ make -j <N>
    $ make check
    $ make -j <N> install
    

    Warning

    Some tests may fail; depending on how many and which tests fail, the installation can still proceed, but review the test output before using the build in production.
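    If some checks fail, you can re-run only the failing tests with more verbose output using ctest, the test driver behind make check; a quick sketch from the build directory:

    $ ctest --rerun-failed --output-on-failure

    After make install, a simple way to verify the resulting installation, before a modulefile exists, is to source the GMXRC script that GROMACS installs under its prefix (the path below follows the install prefix used above; the MPI build provides gmx_mpi instead of gmx):

    $ source /share/apps/gromacs/2019.3_intel-17_cuda-9.0/bin/GMXRC
    $ gmx --version    # reports the SIMD level, FFT library, GPU support and compilers used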

Usage

This section describes a way to submit jobs with the resource manager SLURM.

  1. Load the necessary environment.

    # Apolo
    module load gromacs/2019.3_intel-17_cuda-9.0
    
    # Cronos
    module load gromacs/2016.4_gcc-5.5.0
    
  2. Run Gromacs with SLURM.

    1. An example with GPU (Apolo), given by one of our users:
    #!/bin/bash
    
    #SBATCH --job-name=gmx-GPU
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=8
    #SBATCH --cpus-per-task=4
    #SBATCH --time=10:00:00
    #SBATCH --partition=accel-2
    #SBATCH --gres=gpu:2
    #SBATCH --output=gmx-GPU.%j.out
    #SBATCH --error=gmx-GPU.%j.err
    
    module load gromacs/2019.3_intel-17_cuda-9.0
    
    export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
    
    gmx grompp -f step6.0_minimization.mdp -o step6.0_minimization.tpr -c step5_charmm2gmx.pdb -r step5_charmm2gmx.pdb -p topol.top
    gmx mdrun -v -deffnm step6.0_minimization -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
    
    # Equilibration
    cnt=1
    cntmax=6
    
    while [ $cnt -le $cntmax ]; do
        pcnt=$((cnt-1))
        if [ $cnt -eq 1 ]; then
            gmx grompp -f step6.${cnt}_equilibration.mdp -o step6.${cnt}_equilibration.tpr -c step6.${pcnt}_minimization.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
            gmx mdrun -v -deffnm step6.${cnt}_equilibration -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
        else
            gmx grompp -f step6.${cnt}_equilibration.mdp -o step6.${cnt}_equilibration.tpr -c step6.${pcnt}_equilibration.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
            gmx mdrun -v -deffnm step6.${cnt}_equilibration -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
        fi
        ((cnt++))
    done
    
    # Production
    cnt=1
    cntmax=10
    
    while [ $cnt -le $cntmax ]; do
        if [ $cnt -eq 1 ]; then
            gmx grompp -f step7_production.mdp -o step7_${cnt}.tpr -c step6.6_equilibration.gro -n index.ndx -p topol.top
            gmx mdrun -v -deffnm step7_${cnt} -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
        else
            pcnt=$((cnt-1))
            gmx grompp -f step7_production.mdp -o step7_${cnt}.tpr -c step7_${pcnt}.gro -t step7_${pcnt}.cpt -n index.ndx -p topol.top
            gmx mdrun -v -deffnm step7_${cnt} -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
        fi
        ((cnt++))
    done
    

    Note the use of gmx mdrun with the flag -gpu_id 01 in the minimization, equilibration, and production steps:

    • If GROMACS was compiled with CUDA, it will use all the available GPUs by default.
    • The flag -gpu_id 01 tells GROMACS which GPUs it can use. The 01 means use the GPU with device ID 0 and the GPU with device ID 1.
    • Note the use of #SBATCH --gres=gpu:2. gres stands for generic resource scheduling; gpu requests GPUs from SLURM, and :2 specifies the quantity.
    • Note that accel-2 has 3 GPUs, but only two are requested here. This is useful when another user is already using one or more of the GPUs.
    • Also, note that the number of tasks per node must be a multiple of the number of GPUs that will be used.
    • Setting cpus-per-task to a value between 2 and 6 seems to be more efficient than values greater than 6.
    • The files needed to run the example above are here.
    • For more information see [3]. A sketch of how to submit and monitor these jobs is shown after the CPU-only example below.
    2. An example with CPU only (Cronos):
    #!/bin/bash
    
    ################################################################################
    ################################################################################
    #
    # Find out the density of TIP4PEW water.
    # How to run the simulation was taken from:
    # https://www.svedruziclab.com/tutorials/gromacs/1-tip4pew-water/
    #
    ################################################################################
    ################################################################################
    
    #SBATCH --job-name=gmx-CPU
    #SBATCH --nodes=4
    #SBATCH --ntasks-per-node=16
    #SBATCH --time=03:00:00
    #SBATCH --partition=longjobs
    #SBATCH --output=gmx-CPU.%j.out
    #SBATCH --error=gmx-CPU.%j.err
    #SBATCH --mail-user=example@eafit.edu.co
    #SBATCH --mail-type=END,FAIL
    
    module load gromacs/2016.4_gcc-5.5.0
    
    # Create box of water.
    gmx_mpi solvate -cs tip4p -o conf.gro -box 2.3 2.3 2.3 -p topol.top
    
    # Minimizations.
    gmx_mpi grompp -f mdp/min.mdp -o min -pp min -po min
    srun --mpi=pmi2 gmx_mpi mdrun -deffnm min
    
    gmx_mpi grompp -f mdp/min2.mdp -o min2 -pp min2 -po min2 -c min -t min
    srun --mpi=pmi2 gmx_mpi mdrun -deffnm min2
    
    # Equilibration 1.
    gmx_mpi grompp -f mdp/eql.mdp -o eql -pp eql -po eql -c min2 -t min2
    srun --mpi=pmi2 gmx_mpi mdrun -deffnm eql
    
    # Equilibration 2.
    gmx_mpi grompp -f mdp/eql2.mdp -o eql2 -pp eql2 -po eql2 -c eql -t eql
    srun --mpi=pmi2 gmx_mpi mdrun -deffnm eql2
    
    # Production.
    gmx_mpi grompp -f mdp/prd.mdp -o prd -pp prd -po prd -c eql2 -t eql2
    srun --mpi=pmi2 gmx_mpi mdrun -deffnm prd
    
    • Note the use of gmx_mpi instead of gmx.
    • Also, note the use of srun --mpi=pmi2 instead of mpirun -np <num-tasks>. The command srun --mpi=pmi2 gives gmx_mpi the context of where and how many tasks to run.
    • Note that the script requests 4 nodes with 16 MPI tasks on each node. Recall that each node in Cronos has 16 cores.
    • Also note that srun --mpi=pmi2 is not used for the solvate and grompp commands. Those are preprocessing steps; they do not need to run distributed.
    • The files needed to run the example simulation can be found here.
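    Either script is submitted and monitored in the usual SLURM way; a minimal sketch, assuming the batch script was saved as gmx_job.sh (the file name is illustrative):

    $ sbatch gmx_job.sh
    $ squeue -u $USER                                    # jobs still pending or running
    $ sacct -j <job-id> --format=JobID,State,Elapsed     # summary once the job finishes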

References

[1] GROMACS Documentation. (2019, June 14). GROMACS. Fast, Flexible and Free. Retrieved July 10, 2019, from http://manual.gromacs.org/documentation/current/manual-2019.3.pdf
[2] Matching SM architectures. (2019, November 11). Blame Arnon Blog. Retrieved July 10, 2019, from https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/
[3] Getting good performance from mdrun. (2019). GROMACS Development Team. Retrieved September 3, 2019, from http://manual.gromacs.org/documentation/current/user-guide/mdrun-performance.html#running-mdrun-within-a-single-node

Authors