GROMACS 2019.3
Basic information
Official Website: http://manual.gromacs.org/documentation/
License: GNU Lesser General Public License (LGPL), version 2.1.
Installed on: Apolo II
Tested on (Requirements)
OS base: CentOS (x86_64) ≥ 6.6 (Rocks 6.2)
Compiler: Intel MPI Library ≥ 17.0.1 (Apolo)
Math Library: FFTW 3.3.8 (built in) and OpenBLAS 0.2.19
Installation
The following procedure shows how to compile GROMACS 2019.3 for parallel computing using the GROMACS built-in thread-MPI and CUDA. [1]
Note
The Intel 2017 compiler was used for this build because of compatibility constraints with CUDA, which supports the Intel compiler as a backend (host) compiler only up to the 2017 version.
Download the GROMACS 2019.3 source tarball and extract it.
$ wget http://ftp.gromacs.org/pub/gromacs/gromacs-2019.3.tar.gz
$ tar xf gromacs-2019.3.tar.gz
Inside the top-level source folder, create a build directory where cmake will put the build output.
$ cd gromacs-2019.3
$ mkdir build
$ cd build
Load the necessary modules for the building.
$ module load cmake/3.7.1 \
              cuda/9.0 \
              openblas/0.2.19_gcc-5.4.0 \
              intel/2017_update-1 \
              python/2.7.15_miniconda-4.5.4
Execute the cmake command with the desired directives.
$ cmake .. -DGMX_GPU=on -DCUDA_TOOLKIT_ROOT_DIR=/share/apps/cuda/9.0/ -DGMX_CUDA_TARGET_SM="30;37;70" \
           -DGMX_SIMD=AVX2_256 -DCMAKE_INSTALL_PREFIX=/share/apps/gromacs/2019.3_intel-17_cuda-9.0 \
           -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_EXTERNAL_BLAS=on -DREGRESSIONTEST_DOWNLOAD=on
Note
The command above enables GPU support through CUDA for the specified architectures: sm_30 and sm_37 for the Tesla K80, and sm_70 for the V100, because these are the GPUs used in Apolo. [2]
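The value passed to GMX_CUDA_TARGET_SM is a semicolon-separated list, so it has to be quoted on the command line. As a minimal sketch, the list can be assembled from individual architecture codes with a small helper. The function name cuda_target_sm is hypothetical and used only for illustration; the codes are the ones from the cmake command above.

```shell
# Join architecture codes into the semicolon-separated form used by
# -DGMX_CUDA_TARGET_SM. The function name is illustrative only.
cuda_target_sm() {
    local IFS=';'
    # "$*" joins all arguments with the first character of IFS.
    printf '%s\n' "$*"
}

cuda_target_sm 30 37 70   # prints 30;37;70
```

The result would then be used as -DGMX_CUDA_TARGET_SM="$(cuda_target_sm 30 37 70)"; the double quotes keep the shell from treating the semicolons as command separators.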
Note
The GMX_FFT_LIBRARY option accepts other libraries, such as Intel MKL. In general, FFTW is recommended because there is no advantage in using MKL with GROMACS, and FFTW is often faster. [1]
To build the distributed (multi-node) version of GROMACS, you need an MPI library. The GROMACS team recommends OpenMPI version 1.6 (or higher) or MPICH version 1.4.1 (or higher).
$ module load cmake/3.7.1 \
              cuda/9.0 \
              openblas/0.2.19_gcc-5.4.0 \
              openmpi/1.10.7_gcc-5.4.0 \
              python/2.7.15_miniconda-4.5.4
$ cmake .. -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -DGMX_MPI=on -DGMX_GPU=on \
           -DCUDA_TOOLKIT_ROOT_DIR=/share/apps/cuda/9.0/ -DGMX_CUDA_TARGET_SM="30;37;70" \
           -DGMX_SIMD=AVX2_256 -DCMAKE_INSTALL_PREFIX=/share/apps/gromacs/2019.3_intel-17_cuda-9.0 \
           -DGMX_FFT_LIBRARY=fftw3 -DGMX_BUILD_OWN_FFTW=ON -DGMX_EXTERNAL_BLAS=on -DREGRESSIONTEST_DOWNLOAD=on
For more information about the compile options, refer to the GROMACS documentation. [1]
Execute the make commands sequence.
$ make -j <N>
$ make check
$ make -j <N> install
Warning
Some tests may fail. Whether it is safe to continue with the installation depends on the number of failed tests.
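To decide whether the number of failed tests is acceptable, it helps to read the count from the test log instead of scrolling through it. The following is a minimal sketch, assuming the output of make check was saved to a file (for example with make check 2>&1 | tee check.log) and that it contains the usual CTest summary line of the form "98% tests passed, 1 tests failed out of 45". The function names and the threshold are hypothetical; the acceptable number of failures is a site decision, not a GROMACS rule.

```shell
# Print "<failed> <total>" taken from the CTest summary line of a log file.
# Assumes a line such as: "98% tests passed, 1 tests failed out of 45".
ctest_summary() {
    local log="$1"
    sed -n 's/.*% tests passed, \([0-9][0-9]*\) tests failed out of \([0-9][0-9]*\).*/\1 \2/p' "$log" | tail -n 1
}

# Succeed when the number of failed tests is at or below a chosen threshold
# (second argument, default 0). Fails when no summary line is found.
tests_acceptable() {
    local log="$1"
    local max_failed="${2:-0}"
    local summary failed
    summary=$(ctest_summary "$log")
    failed=${summary%% *}
    [ -n "$failed" ] && [ "$failed" -le "$max_failed" ]
}

# Usage sketch:
#   make check 2>&1 | tee check.log
#   tests_acceptable check.log 0 && make -j <N> install
```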
Usage
This section describes a way to submit jobs with the resource manager SLURM.
Load the necessary environment.
# Apolo
module load gromacs/2019.3_intel-17_cuda-9.0

# Cronos
module load gromacs/2016.4_gcc-5.5.0
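After loading the module, a quick check that the GROMACS binary is reachable can save a failed job submission. This is a minimal sketch; require_cmd is a hypothetical helper name, and the binary names gmx and gmx_mpi are the ones used by the job scripts in this section.

```shell
# Print the resolved path of a command, or fail with a message on stderr.
require_cmd() {
    local cmd="$1"
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "error: '$cmd' not found in PATH; was the module loaded?" >&2
        return 1
    fi
    command -v "$cmd"
}

# Usage sketch on the cluster:
#   require_cmd gmx     && gmx --version       # Apolo (thread-MPI build)
#   require_cmd gmx_mpi && gmx_mpi --version   # Cronos (MPI build)
```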
Run Gromacs with SLURM.
An example with GPU (Apolo), given by one of our users:
 1  #!/bin/bash
 2
 3  #SBATCH --job-name=gmx-GPU
 4  #SBATCH --nodes=1
 5  #SBATCH --ntasks-per-node=8
 6  #SBATCH --cpus-per-task=4
 7  #SBATCH --time=10:00:00
 8  #SBATCH --partition=accel-2
 9  #SBATCH --gres=gpu:2
10  #SBATCH --output=gmx-GPU.%j.out
11  #SBATCH --error=gmx-GPU.%j.err
12
13  module load gromacs/2019.3_intel-17_cuda-9.0
14
15  export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
16
17  gmx grompp -f step6.0_minimization.mdp -o step6.0_minimization.tpr -c step5_charmm2gmx.pdb -r step5_charmm2gmx.pdb -p topol.top
18  gmx mdrun -v -deffnm step6.0_minimization -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
19
20  # Equilibration
21  cnt=1
22  cntmax=6
23
24  while [ $cnt -le $cntmax ]; do
25      pcnt=$((cnt-1))
26      if [ $cnt -eq 1 ]; then
27          gmx grompp -f step6.${cnt}_equilibration.mdp -o step6.${cnt}_equilibration.tpr -c step6.${pcnt}_minimization.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
28          gmx mdrun -v -deffnm step6.${cnt}_equilibration -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
29      else
30          gmx grompp -f step6.${cnt}_equilibration.mdp -o step6.${cnt}_equilibration.tpr -c step6.${pcnt}_equilibration.gro -r step5_charmm2gmx.pdb -n index.ndx -p topol.top
31          gmx mdrun -v -deffnm step6.${cnt}_equilibration -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
32      fi
33      ((cnt++))
34  done
35
36  # Production
37  cnt=1
38  cntmax=10
39
40  while [ $cnt -le $cntmax ]; do
41      if [ $cnt -eq 1 ]; then
42          gmx grompp -f step7_production.mdp -o step7_${cnt}.tpr -c step6.6_equilibration.gro -n index.ndx -p topol.top
43          gmx mdrun -v -deffnm step7_${cnt} -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
44      else
45          pcnt=$((cnt-1))
46          gmx grompp -f step7_production.mdp -o step7_${cnt}.tpr -c step7_${pcnt}.gro -t step7_${pcnt}.cpt -n index.ndx -p topol.top
47          gmx mdrun -v -deffnm step7_${cnt} -ntmpi $SLURM_NTASKS -ntomp $SLURM_CPUS_PER_TASK -gpu_id 01
48      fi
49      ((cnt++))
50  done
Note in lines 18, 28, 31, 43 and 47 the use of gmx mdrun with the flag -gpu_id 01:
If GROMACS was compiled with CUDA, it uses the available GPUs by default.
The flag -gpu_id 01 tells GROMACS which GPUs it may use. The value 01 means: use the GPU with device ID 0 and the GPU with device ID 1.
Note in line 9 the use of #SBATCH --gres=gpu:2. Here gres stands for generic resource scheduling, gpu requests GPUs from Slurm, and :2 specifies the quantity.
Note that Accel-2 has 3 GPUs, but the job requests only two. This is useful when another user is already using one or more GPUs.
Also note that the number of tasks per node must be a multiple of the number of GPUs that will be used.
Setting cpus-per-task to a value between 2 and 6 seems to be more efficient than values greater than 6.
The files needed to run the example above are here.
For more information see [3].
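The two constraints above (a task count that is a multiple of the GPU count, and a -gpu_id string listing the device IDs) can be checked before submitting a job. The following is a minimal sketch with hypothetical helper names; it only covers the simple case of using the first N devices, and the -gpu_id form shown works for fewer than 10 GPUs, where each device ID is a single digit.

```shell
# Build the -gpu_id string for the first N devices, e.g. 2 -> "01".
gpu_id_string() {
    local n="$1"
    local i=0
    local ids=""
    while [ "$i" -lt "$n" ]; do
        ids="${ids}${i}"
        i=$((i + 1))
    done
    printf '%s\n' "$ids"
}

# Succeed only when the task count is a positive multiple of the GPU count.
tasks_match_gpus() {
    local ntasks="$1"
    local ngpus="$2"
    [ "$ngpus" -gt 0 ] && [ "$ntasks" -gt 0 ] && [ $((ntasks % ngpus)) -eq 0 ]
}

# Values from the example job: 8 tasks per node and 2 GPUs.
if tasks_match_gpus 8 2; then
    echo "mdrun flags: -ntmpi 8 -gpu_id $(gpu_id_string 2)"
else
    echo "error: tasks per node must be a multiple of the GPU count" >&2
fi
```

With the values from the example this prints "mdrun flags: -ntmpi 8 -gpu_id 01"; a combination such as 8 tasks and 3 GPUs would be rejected.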
An example with CPU only (Cronos):
 1  #!/bin/bash
 2
 3  ################################################################################
 4  ################################################################################
 5  #
 6  # Find out the density of TIP4PEW water.
 7  # How to run the simulation was taken from:
 8  # https://www.svedruziclab.com/tutorials/gromacs/1-tip4pew-water/
 9  #
10  ################################################################################
11  ################################################################################
12
13  #SBATCH --job-name=gmx-CPU
14  #SBATCH --nodes=4
15  #SBATCH --ntasks-per-node=16
16  #SBATCH --time=03:00:00
17  #SBATCH --partition=longjobs
18  #SBATCH --output=gmx-CPU.%j.out
19  #SBATCH --error=gmx-CPU.%j.err
20  #SBATCH --mail-user=example@eafit.edu.co
21  #SBATCH --mail-type=END,FAIL
22
23  module load gromacs/2016.4_gcc-5.5.0
24
25  # Create box of water.
26  gmx_mpi solvate -cs tip4p -o conf.gro -box 2.3 2.3 2.3 -p topol.top
27
28  # Minimizations.
29  gmx_mpi grompp -f mdp/min.mdp -o min -pp min -po min
30  srun --mpi=pmi2 gmx_mpi mdrun -deffnm min
31
32  gmx_mpi grompp -f mdp/min2.mdp -o min2 -pp min2 -po min2 -c min -t min
33  srun --mpi=pmi2 gmx_mpi mdrun -deffnm min2
34
35  # Equilibration 1.
36  gmx_mpi grompp -f mdp/eql.mdp -o eql -pp eql -po eql -c min2 -t min2
37  srun --mpi=pmi2 gmx_mpi mdrun -deffnm eql
38
39  # Equilibration 2.
40  gmx_mpi grompp -f mdp/eql2.mdp -o eql2 -pp eql2 -po eql2 -c eql -t eql
41  srun --mpi=pmi2 gmx_mpi mdrun -deffnm eql2
42
43  # Production.
44  gmx_mpi grompp -f mdp/prd.mdp -o prd -pp prd -po prd -c eql2 -t eql2
45  srun --mpi=pmi2 gmx_mpi mdrun -deffnm prd
Note the use of gmx_mpi instead of gmx.
Also note the use of srun --mpi=pmi2 instead of mpirun -np <num-tasks>. The command srun --mpi=pmi2 gives gmx_mpi the context of where to run and how many tasks to start.
In lines 14 and 15, note that the job requests 4 nodes and 16 MPI tasks on each node. Recall that each node in Cronos has 16 cores.
In lines 26, 29, 32, 36, 40 and 44, note that srun --mpi=pmi2 is not used. Those are preprocessing steps, so they do not need to run in a distributed way.
The files needed to run the example simulation can be found here.
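As a quick sanity check of the resource request, the number of MPI ranks that srun starts is the number of nodes times the tasks per node. The sketch below assumes the standard Slurm variables SLURM_JOB_NUM_NODES and SLURM_NTASKS_PER_NODE, which are set inside a job; outside a job it falls back to the values of the example (4 nodes, 16 tasks per node). The function name total_ranks is hypothetical.

```shell
# Total MPI ranks launched by srun: nodes x tasks per node.
total_ranks() {
    local nodes="$1"
    local tasks_per_node="$2"
    echo $((nodes * tasks_per_node))
}

# Inside a job Slurm sets these variables; the defaults mirror the example.
nodes="${SLURM_JOB_NUM_NODES:-4}"
tasks="${SLURM_NTASKS_PER_NODE:-16}"
echo "srun will launch $(total_ranks "$nodes" "$tasks") gmx_mpi ranks"
```

For the example job this gives 64 ranks, one per core on the four 16-core Cronos nodes.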