GATK4-4.1.0.0

Basic Information

Installation

This entry covers the entire process performed for the installation of GATK4 on a cluster.

Usage

This section describes how to submit jobs with the SLURM resource manager.

  1. Load the necessary environment.

    module load gatk4/4.1.0.0
    
  2. Run GATK4 with SLURM.

    An example:

    This example follows the GATK tutorial "Run the Pathseq pipeline".

    #!/bin/sh

    #SBATCH --partition=longjobs
    #SBATCH --nodes=1
    #SBATCH --ntasks-per-node=1
    #SBATCH --time=05:00
    #SBATCH --job-name=gatk_example
    #SBATCH -o gatk4_%j.out
    #SBATCH -e gatk4_%j.err
    #SBATCH --mail-type=END,FAIL
    #SBATCH --mail-user=youremail@email.com

    # Don't share environment variables
    
    module load gatk4/4.1.0.0
    
    gatk PathSeqPipelineSpark \
        --input test_sample.bam \
        --filter-bwa-image hg19mini.fasta.img \
        --kmer-file hg19mini.hss \
        --min-clipped-read-length 70 \
        --microbe-fasta e_coli_k12.fasta \
        --microbe-bwa-image e_coli_k12.fasta.img \
        --taxonomy-file e_coli_k12.db \
        --output output.pathseq.bam \
        --scores-output output.pathseq.txt
    

    Note

    To run further tests, see the tutorials on the GATK4 page.
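
The job script in step 2 fails only after it leaves the queue if one of its input files is missing. As a minimal sketch, a small pre-flight check can catch that before submission. It assumes the tutorial files sit in the directory you submit from; the `check_inputs` helper and the script name `gatk4_example.sh` are hypothetical and not part of GATK or SLURM.

```shell
#!/bin/sh
# Pre-flight check (illustrative helper, not part of GATK or SLURM).
# Prints every missing file to stderr and returns 1 if any file is missing,
# 0 if all of the given files exist.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

One possible use, assuming the job script was saved as `gatk4_example.sh`, is to chain the check with the submission so that `sbatch` runs only when every file is present: `check_inputs test_sample.bam hg19mini.fasta.img hg19mini.hss e_coli_k12.fasta e_coli_k12.fasta.img e_coli_k12.db && sbatch gatk4_example.sh`.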

Authors