Tested on (Requirements)

  • OS base: CentOS (x86_64) ≥ 6.6
  • Compiler: Intel Parallel Studio XE Compiler Cluster Edition ≥ 17.0.1
  • Optional extensions:
    • Mxscarna (included in the mafft-7.402-with-extensions-src.tgz package)
    • Foldalign ≥ 2.5.1
    • Contrafold ≥ 2.02

Build process

Note

  • On Apolo II, the Intel Compiler 17.0.1 was used.
    • module load intel/2017_update-1
  • On Cronos, the Intel Compiler 18.0.2 was used.
    • module load intel/18.0.2
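
After loading the compiler module, it can help to confirm that the tools this entry relies on are actually on the PATH. The following is a minimal sketch; `require_cmds` is a hypothetical helper, not something provided by MAFFT or the Intel tools.

```shell
# Hypothetical helper: require_cmds is not part of MAFFT or the Intel tools.
# Usage: require_cmds <command>...
# Prints one line per command and returns non-zero if any is missing.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For this build it could be called as `require_cmds icc icpc make`, since those are the commands used in the steps of this entry.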

This entry describes the installation process of MAFFT with extensions.

  1. Download and extract the MAFFT with extensions package.

    wget https://mafft.cbrc.jp/alignment/software/mafft-7.402-with-extensions-src.tgz
    tar xfvz mafft-7.402-with-extensions-src.tgz
    
  2. Edit MAFFT’s Makefile as follows.

    mafft-7.402-with-extensions/core/Makefile

    From:

    PREFIX = /usr/local
    ...
    CC = gcc
    #CC = icc
    CFLAGS = -O3
    ...
    

    To:

    PREFIX = /your/path
    ...
    #CC = gcc
    CC = icc
    CFLAGS = -O3 -fast
    ...
    
  3. Load the necessary environment and build it.

    module load intel/2017_update-1
    make clean
    make
    make install
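
The Makefile edit in step 2 and a sanity check after step 3 can also be scripted. The following is a minimal sketch, not part of the MAFFT distribution: `patch_core_makefile` and `check_mafft_install` are hypothetical helpers. The sketch assumes the Makefile lines appear exactly as in the From: listing (with the optimization flag spelled `-O3`, letter O) and that `make install` places `bin/mafft` and `libexec/mafft` under the prefix, which is the layout the module files in this entry point at.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and not shipped with MAFFT.

# Apply the step 2 edits to core/Makefile.
# Usage: patch_core_makefile <Makefile> <install prefix>
# Assumption: the prefix contains no "|", "&" or backslash characters.
patch_core_makefile() {
    makefile=$1
    prefix=$2
    tmp=$(mktemp) || return 1
    sed \
        -e "s|^PREFIX = .*|PREFIX = ${prefix}|" \
        -e 's|^CC = gcc|#CC = gcc|' \
        -e 's|^#CC = icc|CC = icc|' \
        -e 's|^CFLAGS = -O3$|CFLAGS = -O3 -fast|' \
        "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Confirm that "make install" populated the prefix.
# Usage: check_mafft_install <install prefix>
check_mafft_install() {
    prefix=$1
    [ -x "$prefix/bin/mafft" ] || { echo "missing: $prefix/bin/mafft" >&2; return 1; }
    [ -d "$prefix/libexec/mafft" ] || { echo "missing: $prefix/libexec/mafft" >&2; return 1; }
    echo "ok: MAFFT found under $prefix"
}
```

For example, `patch_core_makefile mafft-7.402-with-extensions/core/Makefile /your/path` could be run before `make`, and `check_mafft_install /your/path` after `make install`. Running the patch twice leaves the file unchanged the second time.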
    

Extensions

This section describes the installation process for each extension.

Mxscarna

MXSCARNA [1] (Multiplex Stem Candidate Aligner for RNAs) is a multiple alignment tool for RNA sequences that uses progressive alignment based on the pairwise structural alignment algorithm of SCARNA. It is fast enough for large-scale analyses, and its alignment accuracy is better than or comparable to that of existing algorithms that are computationally much more expensive in time and memory.

  1. Edit Mxscarna’s Makefile as follows.

    mafft-7.402-with-extensions/extensions/mxscarna_src/Makefile

    Makefile

    From:

    ...
    CXX = g++
    ...
    

    To:

    ...
    CXX = icpc
    ...
    
  2. Load the necessary environment and build it.

    cd ../
    module load intel/2017_update-1
    make clean
    make
    make install
    
  3. Copy the binary to MAFFT’s libexec directory.

    cp mxscarna /your/path/to/mafft/libexec/mafft/
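
The compiler swap in step 1 is a one-variable edit, and the Foldalign Makefile needs the same kind of change for its `cc` variable. A minimal sketch of a reusable helper follows; `set_make_var` is hypothetical and assumes the assignment is written as `NAME = value` with spaces around the equals sign, as in the listings in this entry.

```shell
# Hypothetical helper: set_make_var is not part of any of these packages.
# Usage: set_make_var <Makefile> <variable> <value>
# Rewrites every uncommented "VAR = ..." line; other lines are kept as-is.
set_make_var() {
    makefile=$1
    var=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v var="$var" -v value="$value" '
        $1 == var && $2 == "=" { print var " = " value; next }
        { print }
    ' "$makefile" > "$tmp" && cat "$tmp" > "$makefile"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

For Mxscarna this would be `set_make_var mafft-7.402-with-extensions/extensions/mxscarna_src/Makefile CXX icpc`. Commented lines such as `#CXX = ...` are left alone because their first field does not equal the variable name.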
    

Foldalign

FOLDALIGN [2] is an algorithm for local or global simultaneous folding and alignment of two or more RNA sequences, based on Sankoff’s algorithm (SIAM J. Appl. Math., 45:810-825, 1985). Foldalign makes pairwise local or global alignments and structure predictions. FoldalignM makes multiple global alignments and structure predictions.

  1. Download the Foldalign package and copy it to the MAFFT extensions directory.

    wget https://rth.dk/resources/foldalign/software/foldalign.2.5.1.tgz
    tar xfvz foldalign.2.5.1.tgz
    cp -r foldalign mafft-7.402-with-extensions/extensions/
    cd mafft-7.402-with-extensions/extensions/foldalign
    
  2. Edit Foldalign’s Makefile as follows.

    mafft-7.402-with-extensions/src/mafft-7.402-with-extensions/extensions/foldalign/Makefile

    Makefile

    From:

    ...
    cc = g++
    ...
    

    To:

    ...
    cc = icpc
    ...
    
  3. Load the necessary environment and build it.

    module load intel/2017_update-1
    make clean
    make
    
  4. Copy the binaries to MAFFT’s libexec directory.

    cp bin/* /your/path/to/mafft/libexec/mafft/
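
A plain `cp` reports nothing when a build silently produced no binaries or when the destination path is mistyped. The copy step used for each extension can be wrapped in a small check. This is a minimal sketch; `install_ext_bins` is a hypothetical helper, and it assumes every file passed to it is meant to be an executable.

```shell
# Hypothetical helper: install_ext_bins is not part of MAFFT.
# Usage: install_ext_bins <libexec dir> <binary>...
# Copies each binary into the directory, stopping at the first problem.
install_ext_bins() {
    dest=$1
    shift
    if [ ! -d "$dest" ]; then
        echo "error: $dest does not exist; install MAFFT first" >&2
        return 1
    fi
    for bin in "$@"; do
        if [ ! -x "$bin" ]; then
            echo "error: $bin is missing or not executable" >&2
            return 1
        fi
        cp "$bin" "$dest/" || return 1
        echo "installed: $dest/$(basename "$bin")"
    done
}
```

For Foldalign this could be called as `install_ext_bins /your/path/to/mafft/libexec/mafft bin/*`. If `bin/` also holds non-executable files, the helper stops at the first one, so listing the binaries explicitly may be preferable.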
    

Contrafold

CONTRAfold [3] is a novel secondary structure prediction method based on conditional log-linear models (CLLMs), a flexible class of probabilistic models that generalize upon SCFGs by using discriminative training and feature-rich scoring. By incorporating most of the features found in typical thermodynamic models, CONTRAfold achieves the highest single sequence prediction accuracies to date, outperforming currently available probabilistic and physics-based techniques.

  1. Download the Contrafold package and copy it to the MAFFT extensions directory.

    wget http://contra.stanford.edu/contrafold/contrafold_v2_02.tar.gz
    tar xfvz contrafold_v2_02.tar.gz
    cp -r contrafold_v2_02/contrafold mafft-7.402-with-extensions/extensions/
    cd mafft-7.402-with-extensions/extensions
    
  2. Load the necessary environment and build it.

    cd contrafold/src/
    module load intel/2017_update-1
    module load openmpi/1.8.8-x86_64_intel-2017_update-1
    make clean
    make intelmulti
    
  3. Copy the binary to MAFFT’s libexec directory.

    cp contrafold /your/path/to/mafft/libexec/mafft/
    

Troubleshooting

Compiling Contrafold may fail with output like the following:

perl MakeDefaults.pl contrafold.params.complementary contrafold.params.noncomplementary contrafold.params.profile
g++ -O3 -DNDEBUG -W -pipe -Wundef -Winline --param large-function-growth=100000 -Wall  -c Contrafold.cpp
In file included from LBFGS.hpp:52,
              from InnerOptimizationWrapper.hpp:12,
              from OptimizationWrapper.hpp:12,
              from Contrafold.cpp:16:
LBFGS.ipp: In instantiation of ‘Real LBFGS<Real>::Minimize(std::vector<_Tp>&) [with Real = double]’:
OptimizationWrapper.ipp:260:9:   required from ‘void OptimizationWrapper<RealT>::LearnHyperparameters(std::vector<int>, std::vector<_Tp>&) [with RealT = double]’
Contrafold.cpp:451:9:   required from ‘void RunTrainingMode(const Options&, const std::vector<FileDescription>&) [with RealT = double]’
Contrafold.cpp:68:54:   required from here
LBFGS.ipp:110:33: error: ‘DoLineSearch’ was not declared in this scope, and no declarations were found by argument-dependent lookup at the point of instantiation [-fpermissive]
      Real step = DoLineSearch(x[k%2], f[k%2], g[k%2], d,
                  ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
                               x[(k+1)%2], f[(k+1)%2], g[(k+1)%2],
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                               Real(0), std::min(Real(10), MAX_STEP_NORM / std::max(Real(1), Norm(d))));
                               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LBFGS.ipp:110:33: note: declarations in dependent base ‘LineSearch<double>’ are not found by unqualified lookup
LBFGS.ipp:110:33: note: use ‘this->DoLineSearch’ instead
make: *** [Makefile:47: Contrafold.o] Error 1

This error, or a similar compilation error, appears because Utilities.hpp is missing an include.

  1. Edit Utilities.hpp and add the limits.h header.

    mafft-7.402-with-extensions/extensions/contrafold/src/Utilities.hpp

    Utilities.hpp

    From:

    #define UTILITIES_HPP
    
    #include <algorithm>
    ...
    

    To:

    #define UTILITIES_HPP
    
    #include <limits.h>
    #include <algorithm>
    ...
    
  2. Repeat step 2 of the Contrafold installation to rebuild.
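
The header edit in step 1 can be applied non-interactively. The following is a minimal sketch; `add_limits_include` is a hypothetical helper, and it assumes Utilities.hpp contains a line starting with `#include <algorithm>`, as in the listing above. If that line is absent, the file is left unchanged.

```shell
# Hypothetical helper: add_limits_include is not part of Contrafold.
# Usage: add_limits_include <path to Utilities.hpp>
add_limits_include() {
    header=$1
    # Already patched: leave the file untouched.
    if grep -q '^#include <limits.h>' "$header"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    # Print the new include just before the first "#include <algorithm>".
    awk '
        !done && /^#include <algorithm>/ { print "#include <limits.h>"; done = 1 }
        { print }
    ' "$header" > "$tmp" && cat "$tmp" > "$header"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

It would be called as `add_limits_include mafft-7.402-with-extensions/extensions/contrafold/src/Utilities.hpp`, and is safe to run more than once.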

Module Files

Apolo II

#%Module1.0####################################################################
##
## module load mafft/7.402-with-extensions_intel-17.0.1
##
## /share/apps/modules/mafft/7.402-with-extensions_intel-17.0.1
## Written by Manuela Carrasco Pinzon
##

proc ModulesHelp {} {
     global version modroot
     puts stderr "Sets the environment for using mafft 7.402-with-extensions\
		  \nin the shared directory /share/apps/mafft/7.402-with-extensions/intel-17.0.1/\
		  \nbuilt with Intel Parallel Studio XE Cluster Edition 2017 Update 1."
}

module-whatis "(Name________) mafft"
module-whatis "(Version_____) 7.402-with-extensions"
module-whatis "(Compilers___) intel-17.0.1"
module-whatis "(System______) x86_64-redhat-linux"
module-whatis "(Libraries___) "

# for Tcl script use only
set         topdir        /share/apps/mafft/7.402-with-extensions/intel-17.0.1
set         version       7.402-with-extensions
set         sys           x86_64-redhat-linux

conflict mafft
module load intel/2017_update-1
module load openmpi/1.8.8-x86_64_intel-2017_update-1
 

prepend-path	PATH			$topdir/bin
prepend-path	PATH			$topdir/libexec

prepend-path	MANPATH			$topdir/share/man
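
For readers unfamiliar with module files, the effect of the three `prepend-path` lines on the environment can be approximated in plain shell. This is only an illustrative sketch: `prepend_path` is a hypothetical function, the paths are copied from the Apolo II module file above, and the real `module` command additionally records the change so that it can be undone on unload.

```shell
# Hypothetical helper mirroring what "prepend-path" does to a variable.
# Usage: prepend_path <variable name> <directory>
prepend_path() {
    name=$1
    dir=$2
    # Read the current value of the named variable (empty if unset).
    eval "current=\${$name:-}"
    if [ -n "$current" ]; then
        eval "$name=\$dir:\$current"
    else
        eval "$name=\$dir"
    fi
    export "$name"
}

# Values copied from the Apolo II module file.
topdir=/share/apps/mafft/7.402-with-extensions/intel-17.0.1
prepend_path PATH "$topdir/bin"
prepend_path PATH "$topdir/libexec"
prepend_path MANPATH "$topdir/share/man"
```

Because each call prepends, the directory added last (`$topdir/libexec`) ends up first in PATH.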

Cronos

#%Module1.0####################################################################
##
## module load mafft/7.402-with-extensions_intel-18.0.2
##
## /share/apps/modules/mafft/7.402-with-extensions_intel-18.0.2
## Written by Manuela Carrasco Pinzon
##

proc ModulesHelp {} {
     global version modroot
     puts stderr "Sets the environment for using mafft 7.402-with-extensions\
                  \nin the shared directory /share/apps/mafft/7.402-with-extensions/intel-18.0.2/\
                  \nbuilt with Intel Parallel Studio XE Cluster Edition 2018."
}

module-whatis "(Name________) mafft"
module-whatis "(Version_____) 7.402-with-extensions"
module-whatis "(Compilers___) intel-18.0.2"
module-whatis "(System______) x86_64-redhat-linux"
module-whatis "(Libraries___) "

# for Tcl script use only
set         topdir        /share/apps/mafft/7.402-with-extensions/intel-18.0.2
set         version       7.402-with-extensions
set         sys           x86_64-redhat-linux

conflict mafft
module load intel/18.0.2
module load openmpi/3.1.1_intel-18.0.2 


prepend-path    PATH                    $topdir/bin
prepend-path    PATH                    $topdir/libexec

prepend-path    MANPATH                 $topdir/share/man

[1] MXSCARNA. (n.d.). Retrieved August 10, 2018, from https://www.ncrna.org/softwares/mxscarna/
[2] Foldalign: RNA Structure and Sequence Alignment. (n.d.). Retrieved from https://rth.dk/resources/foldalign/
[3] Do, C., & Marina, S. (n.d.). Contrafold: CONditional TRAining for RNA Secondary Structure Prediction. Retrieved from http://contra.stanford.edu/contrafold/