FLASh (Fast Length Adjustment of SHort reads) is an accurate and fast tool to merge paired-end reads that were generated from DNA fragments whose lengths are shorter than twice the length of reads. Merged read pairs result in unpaired longer reads, which are generally more desired in genome assembly and genome analysis processes.
Briefly, the FLASH algorithm considers all possible overlaps at or above a minimum length between the reads in a pair and chooses the overlap that results in the lowest mismatch density (proportion of mismatched bases in the overlapped region). Ties between multiple overlaps are broken by considering quality scores at mismatch sites. When building the merged sequence, FLASH computes a consensus sequence in the overlapped region. More details can be found in the original publication (http://bioinformatics.oxfordjournals.org/content/27/21/2957.full).
- FLASh web site
- Original publication: FLASH: Fast length adjustment of short reads to improve genome assemblies. T. Magoc and S. Salzberg. Bioinformatics 27:21 (2011), 2957-63
To see what versions of FLASh are available type
module avail flash
To see what other modules are needed, what commands are available and how to get additional help type
module help flash
To use FLASh, include a command like this in your batch script to load the FLASh module:
module load flash
Be sure you also load any other modules needed, as listed by the
module help flash command.