
NGSeasy (beta): A Dockerized NGS pipeline and tool-box

With NGSeasy you can now have a full suite of NGS tools up and running on any high-end workstation in an afternoon.

Authors: Stephen J Newhouse and Amos Folarin
Release Version: 1.0-r001
Release: dirty_tango
Publication: pending

NGSeasy-1.0 Full Production release will be available Late 2015
NGSeasy-1.0-r001 (dirty_tango) contains most of the core functionality to go from raw fastq to raw vcf calls
NGSeasy will update every 12 months
GUI in development
Let us know if you want other tools added to NGSeasy

NGSeasy is completely open source and we encourage interested folks to jump in and get involved in the dev with us.

NGSeasy Genome Comparison & Analytic Testing (GCAT) Reports

Here we provide a quick look at basic NGSeasy performance (more results coming soon).

GCAT Report	Test Data	Pipeline
[NGSEASY-NTRIM-BWA-FREEBAYES-D](http://www.bioplanet.com/gcat/reports/6167-seeirhwtfp/variant-calls/illumina-100bp-pe-exome-150x/ngseasy-ntrim-bwa-freebayes-d/compare-570-270-181/group-read-depth)	illumina-100bp-pe-exome-150x	fastq > bwa > freebayes

Author Contact Details

Please contact us for help/guidance on using the beta release.

Author	email	Twitter	LinkedIn
Dr Stephen J Newhouse	stephen.j.newhouse@gmail.com	@s_j_newhouse	View Steve's profile on LinkedIn
Dr Amos Folarin	amosfolarin@gmail.com	@amosfolarin	View Amos's profile on LinkedIn

Issues, Questions and Queries

Please direct all queries to https://github.com/KHP-Informatics/ngseasy/issues

When sending bug reports, please provide:

Date of Download
OS and version
Basic Machine Specs (CPU, RAM)
Network Speed (Testing Internet Connection Speed)
The code you ran, e.g. ngseasy -c my.config.tsv -d /My/Dir
The exact error as printed to screen
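
Most of the machine details requested above can be gathered in one go. The following is a minimal sketch, assuming a Linux host where `uname`, `getconf` and `/proc/meminfo` are available; `ngseasy_bug_info` is a hypothetical helper name and is not part of NGSeasy. The date of download, network speed, command and error text still need to be added by hand.

```shell
# Hypothetical helper (not shipped with NGSeasy): print basic details for a bug report.
ngseasy_bug_info() {
    echo "Report date: $(date +%Y-%m-%d)"
    echo "OS: $(uname -sr)"
    # Number of online CPU cores; falls back to "unknown" if getconf cannot report it.
    echo "CPU cores: $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"
    # Total RAM is read from /proc/meminfo, which only exists on Linux.
    if [ -r /proc/meminfo ]; then
        echo "RAM: $(awk '/^MemTotal:/ {printf "%.0f GB", $2 / 1024 / 1024}' /proc/meminfo)"
    else
        echo "RAM: unknown"
    fi
}

ngseasy_bug_info
```

Paste the output into the issue alongside the command you ran and the exact error.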

WARNING! NGSeasy is not numpty or bad data proof!

Please read the docs, stay calm, take your time and think about what you are doing... and if www.google.com doesn't help, then please direct all queries to https://github.com/KHP-Informatics/ngseasy/issues.

Docker Security...

This post reviews the various security implications of using Docker to run applications within containers, and how to address them: How Secure are Containers?

Docker containers are, by default, quite secure; especially if you take care of running your processes inside the containers as non-privileged users (i.e. non root).

NGSeasy Security

All NGSeasy applications are run as the non-root user pipeman within each container

Install Docker

Full instructions at https://docs.docker.com/.

Some fixes to make life easy: these allow you to run docker without sudo.

This may differ for your OS, and mostly applies to flavours of Linux. Check with your sys admin or just Google https://www.google.com.

Mac/Windows users using http://boot2docker.io/ should be fine. Read the docs or just Google https://www.google.com.

Create a docker group

sudo addgroup docker

Add user to docker group

Here user is ec2-user

sudo usermod -aG docker ec2-user

Log out and log back in.

This ensures your user is running with the correct permissions.
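
To confirm that the group change is active in your current login session, a small check like the one below can help. This is a sketch only: `in_docker_group` is a hypothetical helper name, and it relies on nothing beyond the standard `id` and `grep` commands.

```shell
# in_docker_group [USER]: hypothetical helper, not part of Docker or NGSeasy.
# With no argument it inspects the groups of the current login session;
# with a user name it looks that user up in the group database instead.
in_docker_group() {
    if [ $# -gt 0 ]; then
        groups_list=$(id -nG "$1" 2>/dev/null)
    else
        groups_list=$(id -nG)
    fi
    # One group name per line, then look for an exact match on "docker".
    printf '%s\n' $groups_list | grep -qx docker
}

if in_docker_group; then
    echo "docker group membership OK"
else
    echo "not in the docker group yet (did you log out and back in?)"
fi
```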

Verify your work by running docker without sudo.

docker run hello-world

This is what you should get:

Unable to find image 'hello-world:latest' locally
Pulling repository hello-world
91c95931e552: Download complete
a8219747be10: Download complete
Status: Downloaded newer image for hello-world:latest
Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (Assuming it was not already locally available.)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

For more examples and ideas, visit:
 http://docs.docker.com/userguide/

Get NGSeasy


#############################################
## Get NGSeasy                             ##
#############################################

cd /home/${USER}

git clone https://github.com/KHP-Informatics/ngseasy.git

Install NGSeasy

Default install directory is /home/${USER}
In this example the user's home is /home/ec2-user
make INSTALLDIR="/home/ec2-user" all
- sets up top level directory structure
- gets all docker images
- gets indexed hg19 and b37 genomes
- gets GATK resources for hg19 and b37 genomes
- gets whole genome and exome test data
Always set your INSTALLDIR: if you run sudo make all, the install path will be /home/root. Please don't do this!
sudo make install installs scripts to /usr/local/bin/

#############################################
## install NGSeasy                         ##
#############################################

cd ngseasy

## 1.
make INSTALLDIR="/home/ec2-user" all

## 2. 
sudo make install

Installation can take a while (1-2 hours), so go get a coffee or just chill. If your network is bad, then who knows how long... still, just chill, or go get fast internet!

Recommended Network Speed

> 500 Mbit/s : anything less will add a lot of time to set up (days - weeks).

Testing Internet Connection Speed

source : http://askubuntu.com/questions/104755/how-to-check-internet-speed-via-terminal

wget -O speedtest-cli https://raw.github.com/sivel/speedtest-cli/master/speedtest_cli.py
chmod +x speedtest-cli
./speedtest-cli

Retrieving speedtest.net configuration...
Retrieving speedtest.net server list...
Testing from Comcast Cable (x.x.x.x)...
Selecting best server based on ping...
Hosted by FiberCloud, Inc (Seattle, WA) [12.03 km]: 44.028 ms
Testing download speed........................................
Download: 32.29 Mbit/s
Testing upload speed..................................................
Upload: 5.18 Mbit/s

Install time on Amazon EC2

Connection Speed: ~ 800 Mbit/s

real    94m54.237s
user    12m26.960s
sys     28m46.648s

Note: We have only tested NGSeasy installation on Amazon EC2, OpenStack and UK University Networks. These are all fairly fast networks with speeds exceeding 800 Mbit/s on average.

Running NGSeasy for the first time on the test data

Important! NGSeasy is controlled from a single config file. See ngseasy_test.config.tsv for a basic template. It is important that the user sets this up properly before running NGSeasy.

#############################################
## 0. Move to config file dir

cd /home/ec2-user/ngs_projects/config_files/

#############################################
## 1. Run basic test

ngseasy -c ngseasy_test.config.tsv -d /home/ec2-user/ngs_projects

What should happen...

This runs the following basic pipeline on Whole Exome PE 30x Illumina data, aligning to b37 (in theory...give it a try).

FastQC > Trimmomatic > BWA > Platypus

Some notes and pointers

Edit NCPU in ngseasy_test.config.tsv to suit your system
Edit PROJECT_DIR in ngseasy_test.config.tsv to suit your install path
We expect the user to place all raw fastq files in raw_fastq. NGSeasy uses this as a staging area for new project and sample data.
Right now, always run ngseasy from the directory that contains the config file
Each component of ngseasy can be run as a standalone script

NGSeasy (Easy Analysis of Next Generation Sequencing)

We present NGSeasy (Easy Analysis of Next Generation Sequencing), a flexible and easy-to-use NGS pipeline for automated alignment, quality control, variant calling and annotation. The pipeline allows users with minimal computational/bioinformatic skills to set up and run an NGS analysis on their own samples, in less than an afternoon, on any operating system (Windows, OS X or Linux) or infrastructure (workstation, cluster or cloud).

NGS pipelines typically utilize a large and varied range of software components and incur a substantial configuration burden during deployment, which limits their portability to different computational environments. NGSeasy simplifies this by providing the pipeline components encapsulated in Docker™ containers and by bundling a wide choice of tools for each module. Each module of the pipeline represents one functional grouping of tools (e.g. sequence alignment, variant calling etc.).

Deploying the pipeline is as simple as pulling the container images from the public repository into any host running Docker. NGSeasy can be deployed on any medium to high-end workstation, high-performance computer cluster or compute cloud (public or private). This gives instant access to elastic scalability without investment overheads for additional compute hardware, and makes open and reproducible research straightforward for the greater scientific community.

Advantages

Easy to use for non-informaticians.
All run from a single config file that can be made in Excel.
User can select from multiple aligners, variant callers and variant annotators
No scary python, .yaml or .json files...just one simple Excel workbook saved as a text file.
Just follow our simple set of instructions and NGS away!
Choice of aligners, variant callers and annotators
Allows reproducible research
Version controlled for auditing
Customisable
Easy to add new tools
If it's broke...we will fix it..
Enforced naming convention and directory structures
Allows users to run "Bake Offs" between tools with ease

We have adapted the current best practices from the Genome Analysis Toolkit (GATK, http://www.broadinstitute.org/gatk/guide/best-practices) for processing raw alignments in SAM/BAM format and variant calling. The current workflow has been optimised for Illumina platforms, but can easily be adapted for other sequencing platforms with minimal effort.

As the containers themselves can be run as executables with pre-specified cpu and RAM resources, the orchestration of the pipeline can be placed under the control of conventional load balancers if this mode is required.

Overview of the NGSeasy Pipeline Components

The basic pipeline contains all the basic tools needed for manipulation and quality control of raw fastq files (ILLUMINA focused), SAM/BAM manipulation, alignment, cleaning (based on GATK best practices, http://www.broadinstitute.org/gatk/guide/best-practices) and first pass variant discovery. Separate containers are provided for in-depth variant annotation, structural variant calling, basic reporting and visualisations.


A Special note on the NGSeasy base image.

We include the following - what we think of as - NGS Powertools in the compbio/ngseasy-base image. These are all tools that allow the user to slice and dice BED/SAM/BAM/VCF files in multiple ways.

samtools
bcftools
vcftools
vcflib
bamUtil
bedtools2
ogap
samblaster
sambamba
bamleftalign
seqtk
parallel

This image is used as the base of all our compbio/ngseasy-* tools.

Why not separate containers per application? The more docker-esque approach would be to have separate containers for each NGS tool. However, this belies the fact that many of these tools interact in a deep way. Therefore, we built these into a single development environment for ngseasy, to allow pipes and streamlined system calls for manipulating the output of NGS pipelines (BED/SAM/BAM/VCF files).

The Full NGSeasy pipeline

The NGSeasy pipelines implement the following :-

Quality control of raw fastq files using FASTQC
Read trimming using TRIMMOMATIC.
Alignment using one of
- BWA
- STAMPY
- NOVOALIGN
- BOWTIE2
- SNAP
SAM/BAM sorting and indexing with SAMBAMBA.
Read Group information added using PICARDTOOLS:AddOrReplaceReadGroups
Duplicate marking with SAMBLASTER.

For academic users and/or commercial/clinical groups who have paid for GATK licensing, the next steps are to perform

Indel realignment and base quality score recalibration using GATK built-in tools:
- GATK:RealignerTargetCreator
- GATK:IndelRealigner
- GATK:BaseRecalibrator

For the non-GATK version

Base quality score recalibration using BamUtil
- BamUtil:recab
Post alignment quality control and reporting is performed using a number of tools and custom scripts:
- SAMTOOLS:flagstat
- BEDTOOLS:genomecov
- BEDTOOLS:bamtobed
- PICARDTOOLS:CollectMultipleMetrics
- PICARDTOOLS:CollectAlignmentSummaryMetrics
- PICARDTOOLS:CollectWgsMetrics
- PICARDTOOLS:CollectTargetedPcrMetrics (coming soon)
SNP and small INDEL calling using one of the following, or a combination of these tools if the ensemble method is called using bcbio.variation variant-ensemble
- FREEBAYES
- PLATYPUS
- GATK:UnifiedGenotyper
- GATK:HaplotypeCaller
Structural Variant (CNV) calling using one of the following, or a combination of these if the ensemble methods are called:
- DELLY : in Dev
- LUMPY: in Dev
- cn.MOPS: in Dev
- m-HMM: in Dev
- ExomeDepth: in Dev
- SLOPE: in Dev and not tested
- cnvnator: in Dev
Variant annotation using one of the following, or a combination of these if the ensemble methods are called:
- SnpEff: in Dev
- ANNOVAR: in Dev
- VEP: in Dev
Variant reporting using custom scripts

Note: Some of the later functions, i.e. variant annotation and QC reporting, are still in dev.

We highly recommend read trimming prior to alignment. We have noticed considerable speed-ups in alignment time and increased quality of SNP/INDEL calls using trimmed vs raw fastq.

Base quality score recalibration is also recommended.
As an alternative to GATK, we have added functionality for use of BamUtil:recab for base quality score recalibration.

Non-GATK users are encouraged to:
- use aligners such as stampy and novoalign that perform base quality score recalibration on the fly.
- use variant callers that perform local re-alignment around candidate sites, such as freebayes and platypus, to mitigate the need for the indel realignment stages.

Dockerised NGS Tools

All NGSeasy Docker images can be pulled down from compbio Docker Hub or using the Makefile.
We provide an Amazon EBS data volume with indexed genomes: XXXXXX

Table 1. NGSeasy Tools

Docker Image	Version	NGS Tool (version)	Short Description	URL
compbio/ngseasy-base	1.0-r001	VCFtools (v0.1.12b)	manipulate vcf	link
-	-	vt (latest)	manipulate vcf	link
-	-	bcftools (1.2-5-g7fa0d25)	manipulate vcf	link
-	-	vcflib (v1.0.0)	manipulate vcf	link
-	-	samtools (1.2-17-ge91985a)	manipulate sam/bam	link
-	-	samblaster (0.1.21)	manipulate sam/bam	link
-	-	sambamba (v0.5.1)	manipulate sam/bam	link
-	-	bamUtil (1.0.13)	manipulate sam/bam	link
-	-	bedtools (v2.23.0-10-g447cb97)	manipulate bed files	link
-	-	seqtk (1.0-r77-dirty)	manipulate fastq	link
-	-	vawk (0.0.2)	manipulate vcf	link
-	-	bioawk (latest)	manipulate sam/bam/vcf	link
compbio/ngseasy-fastqc	1.0-r001	fastqc (v0.11.2)	FASTQ Quality Control Plots	link
compbio/ngseasy-trimmomatic	1.0-r001	trimmomatic (0.32)	FASTQ Quality Trimming	link
compbio/ngseasy-bwa	1.0-r001	bwa (0.7.12-r1039)	Aligner	link
compbio/ngseasy-stampy	1.0-r001	stampy (stampy-1.0.27)	Aligner	link
compbio/ngseasy-snap	1.0-r001	snap-aligner (1.0beta.18)	Aligner	link
compbio/ngseasy-bowtie2	1.0-r001	bowtie2 (2.2.4)	Aligner	link
compbio/ngseasy-novoalign	1.0-r001	novoalign (3.02.13)	Aligner	link
compbio/ngseasy-gatk	1.0-r001	gatk (3.4-0)	NGS PowerTools	link
compbio/ngseasy-picardtools	1.0-r001	picardtools (1.128)	NGS PowerTools	link
compbio/ngseasy-glia	1.0-r001	glia (latest)	NGS local realignment	link
compbio/ngseasy-platypus	1.0-r001	platypus (0.8.1)	Variant Caller	link
compbio/ngseasy-freebayes	1.0-r001	freebayes (v0.9.21-19-gc003c1e)	Variant Caller	link

Running an NGSeasy Tool Interactively

Run as non-root user pipeman.

-v /media/Data:/home/pipeman : Mounts local directory /media/Data to container directory /home/pipeman

TOOL="bwa"

docker run \
-P \
-w /home/pipeman \
-e HOME=/home/pipeman \
-e USER=pipeman \
--user pipeman \
-v /media/Data:/home/pipeman \
-it compbio/ngseasy-${TOOL}:1.0 /bin/bash

Dockerised NGSeasy


The following section describes getting the Dockerised NGSeasy Pipeline(s) and Resources, project set up and running NGSeasy.

Getting all resources and building required tools will take a few hours depending on network connections and any random "ghosts in the machine"; allow half a day in reality. But once you're set up, that's it: you are good to go.

System Requirements

See the System Requirements table for our recommended system requirements. NGSeasy will run on any modern computer/workstation or cloud infrastructure. The hard disk requirements are based on our experience and result from the fact that the pipeline/tools produce a range of intermediary and temporary files for each sample.

The full NGSeasy install includes indexed genomes for hg19 and b37 for all aligners, annotation files from GATK resource, and all of the NGSeasy docker images. Additional disk space is needed if the user wishes to install the databases associated with the variant annotators, Annovar, VEP and snpEff.

Based on our experience, a functional basic NGS compute system for a small lab would consist of at least 4TB disk space, 60GB RAM and at least 32 CPU cores. Internet speed and network connectivity are a major bottleneck when dealing with NGS-sized data, and groups are encouraged to think about these issues before embarking on multi-sample or population-level studies, where compute requirements can very quickly escalate.

System Requirements

Component	Minimum	Recommended
RAM	16GB	48-60GB
CPU	8 cores	16-36 cores
Hard Disk (per sample)	50-100GB	200-500GB
NGSeasy Install	200GB	500GB
Annotation Databases	500GB	>1TB
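
As a quick sanity check against the minimum column of the table above, something like the following could be run before installing. It is a sketch under assumptions: it reads core count via `getconf` and RAM from `/proc/meminfo` (Linux only), and `meets_minimum` is a hypothetical helper name. Disk space is not checked here.

```shell
# Minimums taken from the System Requirements table: 8 cores and 16GB RAM.
MIN_CORES=8
MIN_RAM_GB=16

# meets_minimum CORES RAM_GB: hypothetical helper, succeeds when both values
# reach the minimums above.
meets_minimum() {
    [ "$1" -ge "$MIN_CORES" ] && [ "$2" -ge "$MIN_RAM_GB" ]
}

cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 0)
# MemTotal is reported in kB; round to the nearest whole GB so that a
# nominal 16GB machine (which reports slightly less) is not rejected.
ram_gb=$(awk '/^MemTotal:/ {printf "%.0f", $2 / 1024 / 1024}' /proc/meminfo 2>/dev/null)
ram_gb=${ram_gb:-0}

if meets_minimum "$cores" "$ram_gb"; then
    echo "OK: ${cores} cores, ${ram_gb}GB RAM"
else
    echo "Below minimum: ${cores} cores, ${ram_gb}GB RAM (need ${MIN_CORES} cores, ${MIN_RAM_GB}GB RAM)"
fi
```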

Installing Docker

Follow the simple instructions in the links provided below

A full set of instructions for multiple operating systems are available on the Docker website.

Getting NGSeasy

We provide a simple Makefile to pull all of the public ngseasy components and scripts, and to set up the correct project directory structure on your local machine.

Setting up the initial project can take up to a day, depending on your local network connections and speeds.

The default install dir is the user's ${HOME} directory. The Makefile provides options to install to any user-defined directory and to select the NGSeasy version, e.g.:

## EG. Installing to /media/scratch
make INSTALLDIR="/media/scratch" VERSION="1.0" all

The Makefile also allows installation of selected components (check out its insides!).

Set up NGSeasy Project configuration file

Using Excel or similar, make a config.file.tsv file and save it as a tab-delimited file with a .tsv extension. This sets up information related to: Project Name, Sample Name, Library Type, Pipeline to call, NCPU.

We provide a template that can be used with NGSeasy, see ngseasy_test.config.tsv.

The config.file.tsv should contain the following 23 columns for each sample to be run through a pipeline:

Variable	type	Description	Options(Examples)
PROJECT_ID	STRING	Project ID	Cancer
SAMPLE_ID	STRING	Sample ID	SAMPLE_I
FASTQ1	STRING	Read 1 Fastq	foo_R1.fq.gz
FASTQ2	STRING	Read 2 Fastq	foo_R2.fq.gz
PROJECT_DIR	STRING	ngseasy project dir	/media/scratch/ngs_projects
DNA_PREP_LIBRARY_ID	STRING	NGS Library
NGS_PLATFORM	STRING	NGS Platform	ILLUMINA
NGS_TYPE	STRING	NGS Type	WEX (exome), WGS (genome), TGS (targeted)
BAIT	STRING	bait bed file	FOO.bed
CAPTURE	STRING	Capture bed file	BAR.bed
GENOMEBUILD	STRING	genome version	hg19, b37, b38 (coming soon)
FASTQC	STRING	Select fastqc	no-fastqc, qc-fastqc
TRIM	STRING	Select trimming	no-trimm, atrimm, btrimm
BSQR	STRING	Select BSQR	no-bsqr, bam-bsqr, gatk-bsqr
REALN	STRING	Select Realignment	no-realn, bam-realn, gatk-realn
ALIGNER	STRING	Select Aligner	no-aln, bwa, stampy, snap, novoalign, bowtie2
VARCALLER	STRING	Select Variant Caller	no-varcall, freebayes, platypus, UnifiedGenotyper, HaplotypeCaller, ensemble
CNV	STRING	Select CNV caller	no-sv,all-sv,lumpy,delly,slope,exomedepth,mhmm,cnvnator
ANNOTATOR	STRING	Select variant annotator	no-anno,snpeff,annovar,vep
CLEANUP	STRING	clean up temp files	TRUE, FALSE
NCPU	NUMBER	number of cores	1 .. N
VERSION	NUMBER	NGSeasy version	1.0
NGSUSER	STRING	user email	stephen.j.newhouse@gmail.com
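
Because spreadsheet exports can silently drop or add columns, it may be worth checking the column count before a run. The sketch below uses plain `awk`; `check_config_columns` is a hypothetical helper name, not an NGSeasy command, and it simply counts tab-separated fields on every non-empty line.

```shell
# check_config_columns FILE: hypothetical helper that reports any non-empty
# line of a tab-delimited config file that does not have exactly 23 fields.
# Returns non-zero if at least one such line is found.
check_config_columns() {
    awk -F '\t' '
        NF == 0 { next }   # ignore completely empty lines
        NF != 23 { printf "line %d: %d columns (expected 23)\n", NR, NF; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

For example, `check_config_columns ngseasy_test.config.tsv && echo "config looks OK"` prints the message only when every line has 23 columns.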

The NGSeasy project directory

The user needs to make the relevant directory structures on their local machine before starting an NGS run.

On our system we typically set up a top-level directory called ngs_projects within which we store output from all our individual NGS projects.

Here we are working from a local top-level directory called media/, but this can really be any folder on your local system, e.g. your home directory /home/${USER}.

Within this directory media we make the following folders:

ngs_projects  
|  
|__raw_fastq  
|__config_files  
|__ngseasy_resources  
   |  
   |__reference_genomes_b37  
   |__reference_genomes_hg19

Running the script make XXXX ensures that all relevant directories are set up, and also enforces a clean structure to the NGS project.
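
For reference, the top-level layout shown above can also be created by hand. This is a minimal sketch, not the project's own set-up script: `make_ngs_tree` is a hypothetical helper name and the base directory is whatever you choose.

```shell
# make_ngs_tree BASE: hypothetical helper that creates the top-level
# ngs_projects layout under BASE. Existing directories are left untouched.
make_ngs_tree() {
    base=$1
    for dir in raw_fastq config_files \
               ngseasy_resources/reference_genomes_b37 \
               ngseasy_resources/reference_genomes_hg19; do
        mkdir -p "$base/ngs_projects/$dir"
    done
}
```

Calling `make_ngs_tree /media` would reproduce the example tree under /media/ngs_projects.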

Within this we make a raw_fastq folder, where we temporarily store all the raw fastq files for each project. This folder acts as an initial staging area for the raw fastq files. During the project set up, we copy/move project/sample related fastq files to their own specific directories. Fastq files must be gzipped and have the suffix _1.fq.gz or _2.fq.gz.
Future versions will allow any format.
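
The naming rule can be checked before files are staged. The following sketch applies only the suffix rule stated here; `has_valid_fastq_suffix` is a hypothetical helper name and the file names in the loop are made-up examples.

```shell
# has_valid_fastq_suffix NAME: hypothetical helper, succeeds only for names
# ending in _1.fq.gz or _2.fq.gz. It checks the name only, not the contents.
has_valid_fastq_suffix() {
    case "$1" in
        *_1.fq.gz|*_2.fq.gz) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative file names only.
for name in sample_1.fq.gz sample_2.fq.gz sample_1.fastq; do
    if has_valid_fastq_suffix "$name"; then
        echo "$name: ok"
    else
        echo "$name: rename and gzip before staging"
    fi
done
```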

Running ngseasy with the relevant configuration file will set up the following directory structure for every project and sample within a project:

.
ngs_projects  
|  
|__raw_fastq  
|__config_files  
|__run_logs
|__ngseasy_resources 
|
|__ project_id  
    |  
    |__run_logs  
    |__config_files  
    |
    |__sample_id_1  
    |   |  
    |   |__fastq  
    |   |__tmp  
    |   |__alignments  
    |   |__vcf  
    |   |__reports  
    |   |__config_files  
    |
    |
    |__sample_id_n  
        |  
        |__fastq  
        |__tmp  
        |__alignments  
        |__vcf  
        |__reports  
        |__config_files

The raw_fastq Directory

The raw_fastq directory is a very special directory indeed. This is where the user should copy and/or move ALL NEW RAW FASTQ files. It is to be used as an initial staging area for all fastq files. NGSeasy expects all raw fastq data to be placed here for all new samples or runs. NGSeasy inspects this folder and looks for the fastq file names specified in your config file. If NGSeasy doesn't find them, then it exits. We do this to force the user to get organised.
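
The same check can be approximated by hand before launching a run. This sketch is not NGSeasy's own code: `check_raw_fastq` is a hypothetical helper, and it assumes that FASTQ1 and FASTQ2 are columns 3 and 4 of the tab-delimited config (their position in the table of variables) and that the first line of the file is a header. Adjust both assumptions if your config differs.

```shell
# check_raw_fastq CONFIG RAW_DIR: hypothetical helper that lists fastq files
# named in CONFIG but absent from RAW_DIR. Returns non-zero if any are missing.
# Assumes a header on line 1 and FASTQ1/FASTQ2 in columns 3 and 4.
# File names containing spaces are not handled.
check_raw_fastq() {
    config=$1
    raw_dir=$2
    missing=0
    for fq in $(awk -F '\t' 'NR > 1 { print $3; print $4 }' "$config"); do
        if [ ! -f "$raw_dir/$fq" ]; then
            echo "missing: $raw_dir/$fq" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_raw_fastq ngseasy_test.config.tsv /home/ec2-user/ngs_projects/raw_fastq` prints one line per missing file.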

Manually Build required NGSeasy Container Images

Work In Progress...

Currently we are not able to automatically build some of the tools in pre-built docker containers due to licensing restrictions.

Some of the software has restrictions on use, particularly for commercial purposes. Therefore, if you wish to use this for commercial purposes, you legally have to approach the owners of the various components yourself!

Software composing the pipeline requiring registration:-

novoalign http://www.novocraft.com/
GATK https://www.broadinstitute.org/gatk/
ANNOVAR http://www.openbioinformatics.org/annovar/

These tools require manual download and registration with the provider. Non-academic/commercial groups will need to pay for some of these tools.

Once you have paid/registered and downloaded the tool, we provide scripts and guidance for building these tools on your system.

It's as easy as:

docker build -t compbio/ngseasy-${TOOL} .

Building NOVOALIGN

Download Novoalign from http://www.novocraft.com/ into the local build directory ngseasy/containerized/ngs_docker_debian/ngs_aligners/ngseasy_novoalign. Edit the Dockerfile to reflect the correct version of novoalign.

To use all novoalign functionality, you will need to pay for a license.

Once you have obtained your novoalign.lic, download this to the build directory ngseasy/containerized/ngs_docker_debian/ngs_aligners/ngseasy_novoalign, which now should contain your updated Dockerfile.

# move to ngseasy_novoalign folder
cd ngseasy/containerized/ngs_docker_debian/ngs_aligners/ngseasy_novoalign
ls

The directory should contain the following:

Dockerfile
novoalign.lic
README.md
novosortV1.03.01.Linux3.0.tar.gz
novocraftV3.02.08.Linux3.0.tar.gz

build novoalign

# build
docker build -t compbio/ngseasy-novoalign:v1.0 .

Building GATK

You need to register and accept the GATK license agreement at https://www.broadinstitute.org/gatk/.

Once done, download GATK and place it in the GATK build directory ngseasy/containerized/ngs_docker_debian/ngs_utils/ngseasy_gatk.

Edit the Dockerfile to reflect the correct version of GATK.

# move to ngseasy_gatk folder
cd ngseasy/containerized/ngs_docker_debian/ngs_utils/ngseasy_gatk
ls

The directory should contain the following:

Dockerfile
README.md
GenomeAnalysisTK-3.3-0.tar.bz2

build gatk

# build
docker build -t compbio/ngseasy-gatk:v1.0 .

Manually Build NGSeasy Variant Annotation Container Images

The tools used for variant annotation use large databases and the docker images exceed 10GB. Therefore, the user should manually build these container images prior to running the NGS pipelines. Docker build files (Dockerfile) are available for:
- Annovar
- VEP
- snpEff

Note: Annovar requires user registration.

Once built on the user system, these container images can persist for as long as the user wants.

Large Variant Annotation Container Images

It's as easy as:

docker build -t compbio/ngseasy-${TOOL} .

Build VEP


cd /media/ngs_projects/nsgeasy/ngs/containerized/ngs_docker_debian/ngseasy_vep

sudo docker build -t compbio/ngseasy-vep:${VERSION} .

Build Annovar

cd /media/ngs_projects/nsgeasy/ngs/containerized/ngs_docker_debian/ngseasy_annovar

sudo docker build -t compbio/ngseasy-annovar:${VERSION} .

Build snpEff

cd /media/ngs_projects/nsgeasy/ngs/containerized/ngs_docker_debian/ngseasy_snpeff

sudo docker build -t compbio/ngseasy-snpeff:${VERSION} .

Coming Soon

New Aligners: SNAP, GSNAP, mr- and mrs-Fast, gem
https://github.com/amplab/snap
SLOPE (CNV for targeted NGS): http://www.biomedcentral.com/1471-2164/12/184
Cancer Pipelines
Annotation Pipelines and Databases
Visualisation Pipelines
Var Callers:- VarScan2
SGE scripts and basic BASH scripts for running outside of Docker
biobambam https://github.com/gt1/biobambam
bamaddrg https://github.com/ekg/bamaddrg
bamtools https://github.com/ekg/bamtools

Useful Links

https://bcbio.wordpress.com/
https://basecallbio.wordpress.com/2013/04/23/base-quality-score-rebinning/
https://github.com/statgen/bamUtil
http://genome.sph.umich.edu/wiki/BamUtil:_recab
https://github.com/chapmanb/bcbio.variation
http://plagnol-lab.blogspot.co.uk/2013/11/faq-and-clarifications-for-exomedepth-r.html