bwa

Last published: 2023-08-06 20:04:10

Learning resources

Basic commands

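# Install BWA (distributed through the bioconda channel)
conda install bwa
# Build the index; the index files are written next to ref.fasta
bwa index ref.fasta
# Single-end alignment against the indexed reference (SAM goes to stdout)
bwa mem ref.fasta reads_1.fq
# Paired-end alignment
bwa mem ref.fasta reads_1.fq reads_2.fq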
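
The commands above can be wrapped in a small helper so that indexing only happens once. This is a minimal sketch, not part of BWA: the function name `bwa_align` and the `THREADS` variable are made up for illustration. It assumes that `bwa index` leaves a `.bwt` file next to the reference, and it uses the `-t` option of `bwa mem` to set the thread count.

```shell
# Hypothetical helper (not part of BWA): index the reference if needed,
# then align one or two FASTQ files and write the SAM output to a file.
# Usage: bwa_align REF OUT.sam READS_1 [READS_2]
bwa_align() {
  ref=$1; out=$2; shift 2            # remaining arguments are the FASTQ files
  # bwa index writes REF.bwt (among other files) next to the reference
  [ -e "$ref.bwt" ] || bwa index "$ref" || return 1
  # -t sets the number of threads; bwa mem prints SAM to stdout, so redirect it
  bwa mem -t "${THREADS:-4}" "$ref" "$@" > "$out"
}
```

For example, `bwa_align ref.fasta aln-pe.sam reads_1.fq reads_2.fq` would align a read pair, and `THREADS=8 bwa_align ref.fasta aln-se.sam reads_1.fq` would align single-end reads with eight threads.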

Key parameters

Scoring options:

  • -L: INT[,INT] penalty for 5'- and 3'-end clipping [5,5]
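
According to the BWA manual page, during extension BWA-MEM keeps track of the best score that reaches the end of the query; if that score is larger than the best local score minus the clipping penalty, the end is not clipped. The sketch below restates that rule as shell arithmetic. The function name `would_clip` is hypothetical, and it is a simplification of the documented rule, not BWA's actual code.

```shell
# Hypothetical illustration of the -L rule described in the BWA manual page.
# Returns 0 (true) when the read end would be soft-clipped.
# Usage: would_clip END_SCORE LOCAL_SCORE [PENALTY]
would_clip() {
  end_score=$1      # best score of an alignment reaching the end of the read
  local_score=$2    # best local (Smith-Waterman) score
  penalty=${3:-5}   # -L value; the default shown in the usage text is 5
  # Clipping is skipped only if end_score > local_score - penalty
  [ "$end_score" -le $((local_score - penalty)) ]
}

# On the command line the penalty is set like this (5'-end,3'-end):
#   bwa mem -L 10,10 ref.fasta reads_1.fq reads_2.fq
```

With the default penalty, `would_clip 120 126` is true (120 is not larger than 121), while `would_clip 120 126 10` is false, so a higher `-L` favours end-to-end alignments over soft clipping.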

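Setting the verbosity level to 4 with -v (the default is 3) makes bwa mem print debugging output that shows how each read is seeded, chained and extended:

bwa mem -v 4 ref.fasta mother_1.fastq mother_2.fastq | less -S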
=====> Processing read 'H0164ALXX140820:2:2204:7262:58269'/1 <=====
* fraction of repetitive seeds: 0.000
* Found CHAIN(0): n=5; weight=146       20;20;0,116046184(20:-10004837) 77;77;0,116046184(20:-10004780) 20;20;60,116046254(20:-10004767)        67;67;60,116046254>
* ---> Processing chain(0) <---
** ---> Extending from seed(4) [77;0,116046184] @ 20 <---
*** Right ref:   GCATAGTTTTAGTCACTTTTCCCGTTACTCTGTCCAGCTTTCCAGTAATACTCATAACATTGCCTTGATTTCATATGACCACGCAGTAAAGTGATTACTGCACTTAGAATTTTGTTGCTTTTGCTGTGCTAACAACTTAAAAGTT>
*** Right query: AGTCACTTTTCCCGTTACTCTGTCCAGCTTTCCAGTAATACTCATAACATCGCCTTGATTTCATATGACCCCGC
*** Right extension: prev_score=77; score=126; bandwidth=100; max_off_diagonal_dist=10
*** Added alignment region: [0,151) <=> [116046184,116046345); score=126; {left,right}_bandwidth={100,100}
** Seed(3) [67;60,116046254] is almost contained in an existing alignment [0,151) <=> [116046184,116046345)
** Seed(3) might lead to a different alignment even though it is contained. Extension will be performed.
** ---> Extending from seed(3) [67;60,116046254] @ 20 <---
*** Left ref:   ACGTTTTGATTCGTCTAACACAGAGATCTCATAAAGGGTAGAGTTCAAATCAATAAATGATTAAACCGTTGTAGACTGGATAGAAATTAACACTCTTTTATTTGTTTGTGTATTCGGTTGAGAGTCTTATACCAATATGTATCCAC>
*** Left query: TCGTCTAACACAGAGATCTCATAAAGGGTAGAGTTCAAATCAATAAATGATTAAACCGTT
*** Left extension: prev_score=-1; score=111; bandwidth=100; max_off_diagonal_dist=10
*** Right ref:   TGCCTTGATTTCATATGACCACGCAGTAAAGTGATTACTGCACTTAGAATTTTGTTGCTTTTGCTGTGCTAACAACTTAAAAGTTTAAAATAACTGATGTTCAAAACAGTGAAGATTTCCTTTTATAAACAAGTTGGAAA
*** Right query: CGCCTTGATTTCATATGACCCCGC
*** Right extension: prev_score=111; score=126; bandwidth=100; max_off_diagonal_dist=0
*** Added alignment region: [0,151) <=> [116046184,116046345); score=126; {left,right}_bandwidth={100,100}
** Seed(2) [20;60,116046254] is almost contained in an existing alignment [0,151) <=> [116046184,116046345)
** Seed(2) might lead to a different alignment even though it is contained. Extension will be performed.
** ---> Extending from seed(2) [20;60,116046254] @ 20 <---
*** Left ref:   ACGTTTTGATTCGTCTAACACAGAGATCTCATAAAGGGTAGAGTTCAAATCAATAAATGATTAAACCGTTGTAGACTGGATAGAAATTAACACTCTTTTATTTGTTTGTGTATTCGGTTGAGAGTCTTATACCAATATGTATCCAC>
*** Left query: TCGTCTAACACAGAGATCTCATAAAGGGTAGAGTTCAAATCAATAAATGATTAAACCGTT
*** Left extension: prev_score=-1; score=64; bandwidth=100; max_off_diagonal_dist=10
*** Right ref:   CACTTTTCCCGTTACTCTGTCCAGCTTTCCAGTAATACTCATAACATTGCCTTGATTTCATATGACCACGCAGTAAAGTGATTACTGCACTTAGAATTTTGTTGCTTTTGCTGTGCTAACAACTTAAAAGTTTAAAATAACTGAT>
*** Right query: CACTTTTCCCGTTACTCTGTCCAGCTTTCCAGTAATACTCATAACATCGCCTTGATTTCATATGACCCCGC
*** Right extension: prev_score=64; score=126; bandwidth=100; max_off_diagonal_dist=0
*** Added alignment region: [0,151) <=> [116046184,116046345); score=126; {left,right}_bandwidth={100,100}
** Seed(1) [20;0,116046184] is almost contained in an existing alignment [0,151) <=> [116046184,116046345)
** Seed(0) [19;128,116046322] is almost contained in an existing alignment [0,151) <=> [116046184,116046345)
* 1 chains remain after removing duplicated chains
** 126, [0,151) <=> [116046184,116046345)
=====> Processing read 'H0164ALXX140820:2:2204:7262:58269'/2 <=====
* fraction of repetitive seeds: 0.000
* 0 chains remain after removing duplicated chains
=====> Processing read 'H0164ALXX140820:2:1102:17372:59394'/1 <=====
* fraction of repetitive seeds: 0.000
* Found CHAIN(0): n=3; weight=95        46;46;13,10004739(20:+10004740) 20;20;60,10004796(20:+10004797) 29;29;108,10004844(20:+10004845)
* ---> Processing chain(0) <---
** ---> Extending from seed(2) [46;13,10004739] @ 20 <---
*** Left ref:   AGGTCATTATGAGTATTGTAACGGAACTAAAGTATACTGGTGCGTCATTTCACTAATGACGTGAATCTTAAAACAACGAAAACGACACGATTGTTGAATTTTCAAA
*** Left query: CGGTCATTATGAG
*** Left extension: prev_score=-1; score=54; bandwidth=100; max_off_diagonal_dist=0
*** Right ref:   ATGCAAAACTAAGCAGATTGTGTCTCTAGAGTATTTCCCATCTCAAGTTTAGTTATTTACTAATTTGGCAACATCTGACCTATCTTTAATTGTGAGAAAATAAACAAACACATAAGCCAACTCTCAGAATATGGTTATACATAGG>
*** Right query: GAGCAGATTGTGTCTCTAGAGGATTTCCCATCTCAGGTTTAGTTATTTTCTAATTTGGCAACATCTGACCTATCTTTACTTGTTAGTAAATG
*** Right extension: prev_score=54; score=96; bandwidth=100; max_off_diagonal_dist=10
*** Added alignment region: [0,137) <=> [10004726,10004873); score=96; {left,right}_bandwidth={100,100}
** Seed(1) [29;108,10004844] is almost contained in an existing alignment [0,137) <=> [10004726,10004873)
** Seed(0) [20;60,10004796] is almost contained in an existing alignment [0,137) <=> [10004726,10004873)
* 1 chains remain after removing duplicated chains
** 96, [0,137) <=> [10004726,10004873)
=====> Processing read 'H0164ALXX140820:2:1102:17372:59394'/2 <=====
* fraction of repetitive seeds: 0.000
* Found CHAIN(0): n=1; weight=19        19;19;32,74408771(20:-51642251)
* ---> Processing chain(0) <---
** ---> Extending from seed(0) [19;32,74408771] @ 20 <---
*** Left ref:   AAGGACAGTGTTTTACTTAGGAGAGAACTTGAGAGACGTCACGTGTGTCCGGGGGTTTC
*** Left query: TTCTGAATCGCATAACCTATTAAAGGAGATAC
*** Left extension: prev_score=-1; score=19; bandwidth=100; max_off_diagonal_dist=0
*** Right ref:   TTTATATCTATTTGTGTTTTCTTTTTAGGTAAGAAATAGTAATTTTTATCTGAAAGTAAGAAAATACAGGTTGACACTGGTGTCCTGACTTGTTCTAGATGTGAAGGTGTCATCTGCCGGGGGCAGGCACCCAGGGGTGGGGCGG>
*** Right query: AGCTTTACTAAGAAGAAGCTGTCTAAATGTTGTTATGAAATGCATCCTGAGCGAGAACCGCATTTCATCGTGCCTATTACTCCTNANNNNNGCNNNNNNN
*** Right extension: prev_score=19; score=19; bandwidth=100; max_off_diagonal_dist=0
*** Added alignment region: [32,51) <=> [74408771,74408790); score=19; {left,right}_bandwidth={100,100}
* 1 chains remain after removing duplicated chains
** 19, [32,51) <=> [74408771,74408790)
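
The debugging output is long, so it can help to reduce it to one line per retained alignment region. The following is a rough sketch, assuming the debug lines arrive on standard output as in the `| less -S` pipeline above and keep the layout shown in this log; the function name `summarize_bwa_debug` is made up for illustration. It remembers the read named in each `=====> Processing read` line and prints the score, query interval and reference interval from the final `** SCORE, [..) <=> [..)` lines.

```shell
# Hypothetical helper: condense `bwa mem -v 4` output to
# read name, score, query interval, reference interval (tab-separated).
summarize_bwa_debug() {
  awk '
    /^=====> Processing read/ {
      # The read name sits between single quotes (\047); /1 or /2 follows it
      split($0, q, "\047")
      name = q[2]; mate = q[3]
      sub(/ .*/, "", mate)           # keep only the /1 or /2 suffix
      next
    }
    /^\*\* [0-9]+, \[/ {
      # Final region line, e.g. "** 126, [0,151) <=> [116046184,116046345)"
      score = $2
      sub(/,$/, "", score)           # drop the trailing comma after the score
      print name mate "\t" score "\t" $3 "\t" $5
    }
  '
}
```

It could be used as `bwa mem -v 4 ref.fasta mother_1.fastq mother_2.fastq | summarize_bwa_debug`. For the log above it would report three regions (scores 126, 96 and 19); the second mate of the first pair has no remaining chains and therefore produces no line.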

Citing BWA

If you use the BWA-backtrack algorithm, please cite the following paper:
Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25, 1754-1760. [PMID: 19451168]

If you use the BWA-SW algorithm, please cite:
Li H. and Durbin R. (2010) Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics, 26, 589-595. [PMID: 20080505]

If you use the fastmap component of BWA, please cite:
Li H. (2012) Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly. Bioinformatics, 28, 1838-1844. [PMID: 22569178]

If you use the BWA-MEM algorithm or the fastmap command, or want to cite the whole BWA package, please cite:
Li H. (2013) Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. [arXiv:1303.3997v2]