HISAT2
HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome. Based on an extension of BWT for graphs (Sirén et al. 2014), we designed and implemented a graph FM index (GFM), an original approach and its first implementation. In addition to using one global GFM index that represents a population of human genomes, HISAT2 uses a large set of small GFM indexes that collectively cover the whole genome. These small indexes (called local indexes), combined with several alignment strategies, enable rapid and accurate alignment of sequencing reads. This new indexing scheme is called a Hierarchical Graph FM index (HGFM).
Papers
- 2015: HISAT: a fast spliced aligner with low memory requirements
- 2016: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown
- 2019: Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype
- 2021: Rapid and accurate alignment of nucleotide conversion sequencing reads with HISAT-3N
Parameter details
--rna-strandness
- Specify strand-specific information: the default is unstranded.
- For single-end reads, use F or R.
- ‘F’ means a read corresponds to a transcript.
- ‘R’ means a read corresponds to the reverse complemented counterpart of a transcript.
- For paired-end reads, use either FR or RF.
- With this option being used, every read alignment will have an XS attribute tag:
- ‘+’ means a read belongs to a transcript on the ‘+’ strand of the genome.
- ‘-’ means a read belongs to a transcript on the ‘-’ strand of the genome.
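To make the XS tag concrete, here is a minimal sketch of reading it back from one SAM alignment line. The helper name `xs_strand` and the example record are made up for illustration; the only assumptions are the standard SAM layout (eleven mandatory tab-separated columns followed by optional `TAG:TYPE:VALUE` fields) and that the tag is written as `XS:A:+` or `XS:A:-`.

```python
def xs_strand(sam_line):
    """Return '+' or '-' from the XS:A tag of one SAM alignment line, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are mandatory; optional tags start at index 11.
    for tag in fields[11:]:
        if tag.startswith("XS:A:"):
            return tag[len("XS:A:"):]
    return None


# Hypothetical alignment record, for illustration only.
example = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNH:i:1\tXS:A:+"
print(xs_strand(example))
```

Downstream tools such as transcript assemblers rely on this tag to decide which genomic strand a transcript lies on, so checking a few records this way is a quick sanity test that the strandness setting was applied.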
TopHat has a similar option, --library-type, where fr-firststrand corresponds to R and RF, and fr-secondstrand corresponds to F and FR.
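The correspondence with TopHat's library types can be sketched as a small lookup that assembles a `hisat2` command line. This is an illustrative helper, not part of HISAT2: the function name `hisat2_command`, the index name, and the file names are placeholders, and `fr-unstranded` is assumed to be the TopHat setting that maps to HISAT2's unstranded default (so no strandness flag is added for it).

```python
# TopHat --library-type -> HISAT2 --rna-strandness, following the mapping above.
TOPHAT_TO_HISAT2 = {
    "fr-firststrand": {"single": "R", "paired": "RF"},
    "fr-secondstrand": {"single": "F", "paired": "FR"},
}


def hisat2_command(index, out_sam, library_type, reads1, reads2=None):
    """Build a hisat2 argument list for single-end or paired-end reads."""
    layout = "paired" if reads2 else "single"
    cmd = ["hisat2", "-x", index]
    # Unstranded is the HISAT2 default, so the flag is only added for stranded libraries.
    if library_type != "fr-unstranded":
        cmd += ["--rna-strandness", TOPHAT_TO_HISAT2[library_type][layout]]
    if reads2:
        cmd += ["-1", reads1, "-2", reads2]
    else:
        cmd += ["-U", reads1]
    cmd += ["-S", out_sam]
    return cmd


# Placeholder index and file names, for illustration only.
print(" ".join(hisat2_command("genome_index", "out.sam", "fr-firststrand",
                              "sample_1.fq", "sample_2.fq")))
```

Returning a list rather than a string keeps the command safe to pass directly to `subprocess.run` without shell quoting.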