Bowtie 2
Memory-efficient tool for aligning sequencing reads to long reference sequences
Bowtie 2 requires an environment module
In order to use Bowtie 2, you must first load the appropriate environment module:
module load gnu
Bowtie 2 is a bioinformatics program that aligns sequencing reads ranging from about 50 to thousands of characters in length. It is particularly well suited to aligning such reads to relatively long genomes, such as mammalian genomes.
Using Bowtie 2 on RCC Resources
Running Bowtie 2 on the HPC
The following example shows basic Bowtie 2 usage. Download the example data files e_coli_1000.fa and e_coli_1000.fq. Then run the commands:
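A minimal sketch of the index-building step, assuming both example files sit in the current directory and the environment module is already loaded. The variable names REF and INDEX_PREFIX are illustrative only; the prefix e_coli is taken from the index file names described in this section.

```shell
# File names come from the example data; e_coli is the index prefix (basename).
REF="e_coli_1000.fa"
INDEX_PREFIX="e_coli"

if command -v bowtie2-build >/dev/null 2>&1 && [ -f "$REF" ]; then
    # Build the Bowtie 2 index files from the reference FASTA.
    bowtie2-build "$REF" "$INDEX_PREFIX"
else
    echo "bowtie2-build or $REF is missing: load the module and download the data first." >&2
fi
```

The check around the command is only a convenience: it prints a hint instead of failing when the module is not loaded or the reference file has not been downloaded.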
This should print many lines of output and then quit. When the command completes, the current directory will contain six new files that all start with e_coli and end with .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2, and .rev.2.bt2. These files constitute the index. To run the Bowtie 2 aligner, which aligns a set of unpaired reads to the E. coli reference genome using the index generated in the previous step, use the command:
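A minimal sketch of the alignment step, assuming the index with prefix e_coli already exists in the current directory. The variable names are illustrative only; the options used are -x (index prefix), -U (file of unpaired reads), and -S (SAM output file).

```shell
INDEX_PREFIX="e_coli"      # prefix of the index files built earlier
READS="e_coli_1000.fq"     # unpaired reads in FASTQ format
OUT_SAM="eg1.sam"          # alignment output in SAM format

if command -v bowtie2 >/dev/null 2>&1 && [ -f "$READS" ]; then
    # Align the unpaired reads against the index and write SAM output.
    bowtie2 -x "$INDEX_PREFIX" -U "$READS" -S "$OUT_SAM"
else
    echo "bowtie2 or $READS is missing: load the module and download the data first." >&2
fi
```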
The alignment results in SAM format are written to the file eg1.sam, and a short alignment summary is written to the console.
Running Bowtie 2 in Parallel on the HPC
Below is a script to run the above example on the HPC using the Slurm job scheduler. The script must be saved with the .sh extension.
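One way such a script might look is sketched here. The job name, core count, time limit, and output file name are assumptions chosen for illustration, not site requirements, and any partition or account settings your cluster needs are omitted. The thread count is read from SLURM_CPUS_PER_TASK, which Slurm sets when --cpus-per-task is requested, and is passed to bowtie2-build with --threads and to bowtie2 with -p.

```shell
#!/bin/bash
#SBATCH --job-name=bowtie2_example    # assumed job name
#SBATCH --ntasks=1                    # a single task...
#SBATCH --cpus-per-task=4             # ...with 4 cores (assumed value)
#SBATCH --time=00:10:00               # assumed time limit
#SBATCH --output=bowtie2_%j.out       # %j expands to the Slurm job ID

# Load the environment module if the module command is available.
if command -v module >/dev/null 2>&1; then
    module load gnu
fi

# Use the cores Slurm allocated; fall back to 1 outside a job.
THREADS="${SLURM_CPUS_PER_TASK:-1}"

if command -v bowtie2 >/dev/null 2>&1; then
    # Build the index, then align the unpaired reads, both multithreaded.
    bowtie2-build --threads "$THREADS" e_coli_1000.fa e_coli
    bowtie2 -p "$THREADS" -x e_coli -U e_coli_1000.fq -S eg1.sam
else
    echo "bowtie2 not found on PATH; check that the module loaded correctly." >&2
fi
```

Requesting cores with --cpus-per-task rather than additional tasks keeps all threads on one node, which suits Bowtie 2 because its parallelism is thread-based.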
Then submit your script using the following command, replacing YOURSCRIPT with the name of your script file:
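A minimal sketch of the submission step, assuming the script was saved as YOURSCRIPT.sh in the current directory. The SCRIPT variable is illustrative only; sbatch is the standard Slurm command for submitting batch scripts.

```shell
# YOURSCRIPT is a placeholder: replace it with the name of your script file.
SCRIPT="YOURSCRIPT.sh"

if command -v sbatch >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
    # Submit the batch script to the Slurm scheduler.
    sbatch "$SCRIPT"
else
    echo "Would run: sbatch $SCRIPT"
fi
```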