infer_experiment.py

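For paired-end RNA-seq, there are two different ways to strand reads (such as the Illumina ScriptSeq protocol): "1++,1--,2+-,2-+" and "1+-,1-+,2++,2--". In this notation, "1++" means that read 1 mapped to the + strand and its parent gene is on the + strand, while "2+-" means that read 2 mapped to the + strand and its parent gene is on the - strand.

For single-end RNA-seq, there are also two different ways to strand reads: "++,--" (the read maps to the same strand as its parent gene) and "+-,-+" (the read maps to the opposite strand).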

Converting GTF to BED (the reference gene model passed to -r must be in BED format)

Paired-end, non-strand-specific

infer_experiment.py -r hg19.refseq.bed12 -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam
This is PairEnd Data
Fraction of reads failed to determine: 0.0172
Fraction of reads explained by "1++,1--,2+-,2-+": 0.4903
Fraction of reads explained by "1+-,1-+,2++,2--": 0.4925

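Interpretation: 1.72% of the reads mapped to genome regions where the strandness of the transcript cannot be determined (for example, regions with genes on both the + and - strands). Of the remaining 98.28% (1 - 0.0172 = 0.9828) of reads, about half are explained by "1++,1--,2+-,2-+" and about half by "1+-,1-+,2++,2--". The conclusion is that this is not a strand-specific dataset, because the strandness of the reads is independent of the strandness of the transcripts.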
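The arithmetic behind this reading can be checked with a few lines of Python. This is only a sketch: `split_determined` is a hypothetical helper name, not part of RSeQC, and the three numbers are copied from the output above.

```python
def split_determined(failed, frac_a, frac_b):
    """Return the determined fraction and each pattern's share of it.

    failed: fraction of reads whose strand could not be determined
    frac_a: fraction explained by the first pattern
    frac_b: fraction explained by the second pattern
    """
    determined = 1.0 - failed
    return determined, frac_a / determined, frac_b / determined


# Values from the paired-end, non-strand-specific example.
determined, share_a, share_b = split_determined(0.0172, 0.4903, 0.4925)

print(f"determined reads: {determined:.4f}")
print(f'share of "1++,1--,2+-,2-+": {share_a:.3f}')
print(f'share of "1+-,1-+,2++,2--": {share_b:.3f}')
```

Both shares come out very close to one half, which is what "independent of the strandness of the transcripts" means in practice.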

Paired-end, strand-specific

infer_experiment.py -r hg19.refseq.bed12 -i Pairend_StrandSpecific_51mer_Human_hg19.bam
This is PairEnd Data
Fraction of reads failed to determine: 0.0072
Fraction of reads explained by "1++,1--,2+-,2-+": 0.9441
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0487

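Interpretation: 0.72% of the reads mapped to genome regions where the strandness of the transcript cannot be determined (for example, regions with genes on both the + and - strands). Of the remaining 99.28% (1 - 0.0072 = 0.9928) of reads, the vast majority are explained by "1++,1--,2+-,2-+". Here the strandness of the reads depends on the strandness of the transcripts, so this is a strand-specific dataset.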
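The decision made in the two paired-end examples can be written as a small rule. The sketch below is illustrative only: `classify_strandness` is a hypothetical function, and the 0.8 cutoff is an assumption chosen for the example. `infer_experiment.py` itself reports only the fractions and leaves the decision to the reader.

```python
def classify_strandness(frac_first, frac_second, threshold=0.8):
    """Classify a library from the two "explained by" fractions.

    threshold is an assumed cutoff, not a value defined by RSeQC: if one
    pattern explains at least this share of the determined reads, the
    library is treated as strand-specific.
    """
    determined = frac_first + frac_second
    if determined == 0:
        return "undetermined"
    share_first = frac_first / determined
    if share_first >= threshold:
        return "strand-specific (first pattern)"
    if share_first <= 1 - threshold:
        return "strand-specific (second pattern)"
    return "non-strand-specific"


# Fractions for "1++,1--,2+-,2-+" and "1+-,1-+,2++,2--" from the two examples.
print(classify_strandness(0.4903, 0.4925))
print(classify_strandness(0.9441, 0.0487))
```

Libraries with an intermediate split (say 70/30) fall into the non-strand-specific branch under this cutoff, so such cases are better inspected by hand.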

Single-end, strand-specific

infer_experiment.py -r hg19.refseq.bed12 -i SingleEnd_StrandSpecific_36mer_Human_hg19.bam
This is SingleEnd Data
Fraction of reads failed to determine: 0.0170
Fraction of reads explained by "++,--": 0.9669
Fraction of reads explained by "+-,-+": 0.0161
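
In this single-end output, 96.69% of reads are explained by "++,--" and only 1.61% by "+-,-+". By the same reasoning as in the paired-end cases, the data look strand-specific. When many BAM files need checking, the report text can be parsed instead of read by eye. The following is a minimal sketch, assuming the report has exactly the line wording shown above (it may differ between RSeQC versions). `parse_report` is a hypothetical helper, not an RSeQC function.

```python
import re

# Line patterns copied from the sample output above; treat the exact
# wording as an assumption about the installed RSeQC version.
FAILED_RE = re.compile(r"Fraction of reads failed to determine:\s*([0-9.]+)")
EXPLAINED_RE = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def parse_report(text):
    """Parse infer_experiment.py text output into a plain dict."""
    report = {"layout": None, "failed": None, "explained": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # "This is PairEnd Data" / "This is SingleEnd Data"
        if line.startswith("This is ") and line.endswith(" Data"):
            report["layout"] = line[len("This is "):-len(" Data")]
            continue
        m = FAILED_RE.match(line)
        if m:
            report["failed"] = float(m.group(1))
            continue
        m = EXPLAINED_RE.match(line)
        if m:
            # Key is the pattern string, e.g. "++,--"
            report["explained"][m.group(1)] = float(m.group(2))
    return report


sample = """\
This is SingleEnd Data
Fraction of reads failed to determine: 0.0170
Fraction of reads explained by "++,--": 0.9669
Fraction of reads explained by "+-,-+": 0.0161
"""

report = parse_report(sample)
print(report["layout"], report["failed"], report["explained"])
```

The returned dict keeps the pattern strings as keys, so the same function handles paired-end reports, whose keys are "1++,1--,2+-,2-+" and "1+-,1-+,2++,2--".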