Last published: 2023-07-04 15:11:13
infer_experiment.py
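- This program "guesses" how an RNA-seq experiment was configured, in particular which strand the reads of strand-specific RNA-seq data come from, by comparing the strandness of reads with the strandness of transcripts.
- The strandness of reads is determined from the alignment; the strandness of transcripts is determined from the annotation.
- For non-strand-specific RNA-seq data, the strandness of reads and the strandness of transcripts are independent.
- For strand-specific RNA-seq data, the strandness of reads is largely determined by the strandness of transcripts; see the three examples below for details.
- You do not need to know the RNA-seq protocol before mapping reads to the reference genome. Map the reads as if they were non-strand-specific, and this script can "guess" how the reads were stranded.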
For paired-end RNA-seq, there are two different ways to strand reads (such as the Illumina ScriptSeq protocol):
- 1++,1--,2+-,2-+
  - read1 mapped to '+' strand indicates parental gene on '+' strand
  - read1 mapped to '-' strand indicates parental gene on '-' strand
  - read2 mapped to '+' strand indicates parental gene on '-' strand
  - read2 mapped to '-' strand indicates parental gene on '+' strand
- 1+-,1-+,2++,2--
  - read1 mapped to '+' strand indicates parental gene on '-' strand
  - read1 mapped to '-' strand indicates parental gene on '+' strand
  - read2 mapped to '+' strand indicates parental gene on '+' strand
  - read2 mapped to '-' strand indicates parental gene on '-' strand
For single-end RNA-seq, there are also two different ways to strand reads:
- ++,--
  - read mapped to '+' strand indicates parental gene on '+' strand
  - read mapped to '-' strand indicates parental gene on '-' strand
- +-,-+
  - read mapped to '+' strand indicates parental gene on '-' strand
  - read mapped to '-' strand indicates parental gene on '+' strand
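The rule names can be read as short codes: an optional read number, then the strand the read mapped to, then the strand of the parental gene. The following is a minimal sketch of how such codes could be tallied against each rule. It is not the RSeQC implementation; `PAIRED_RULES`, `SINGLE_RULES`, and `fraction_by_rule` are illustrative names, and the observations are made-up codes rather than real data.

```python
from collections import Counter

# Each rule is the set of codes it allows, exactly as spelled in its name.
# Code layout: [read number] + mapped strand + parental gene strand.
PAIRED_RULES = {
    "1++,1--,2+-,2-+": {"1++", "1--", "2+-", "2-+"},
    "1+-,1-+,2++,2--": {"1+-", "1-+", "2++", "2--"},
}
SINGLE_RULES = {
    "++,--": {"++", "--"},
    "+-,-+": {"+-", "-+"},
}


def fraction_by_rule(observations, rules):
    """Return the fraction of observed codes that each rule explains."""
    counts = Counter()
    total = 0
    for code in observations:
        total += 1
        for name, allowed in rules.items():
            if code in allowed:
                counts[name] += 1
    if total == 0:
        return {}
    return {name: counts[name] / total for name in rules}


if __name__ == "__main__":
    # Hypothetical observations: three fit the first rule, one fits the second.
    toy = ["1++", "2-+", "1--", "2++"]
    print(fraction_by_rule(toy, PAIRED_RULES))
```

A library in which nearly all codes fall under one rule behaves as strand-specific; a roughly even split between the two rules suggests a non-strand-specific library.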
Paired-end, non-strand-specific
infer_experiment.py -r hg19.refseq.bed12 -i Pairend_nonStrandSpecific_36mer_Human_hg19.bam
This is PairEnd Data
Fraction of reads failed to determine: 0.0172
Fraction of reads explained by "1++,1--,2+-,2-+": 0.4903
Fraction of reads explained by "1+-,1-+,2++,2--": 0.4925
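For 1.72% of the reads the strand could not be determined (for example, they map to genomic regions that have genes on both the plus and minus strands). Of the remaining 98.28% (1 - 0.0172 = 0.9828), about half can be explained by "1++,1--,2+-,2-+" and the other half by "1+-,1-+,2++,2--". We conclude that this is not a strand-specific dataset, because the strandness of reads is independent of the strandness of transcripts.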
Paired-end, strand-specific
infer_experiment.py -r hg19.refseq.bed12 -i Pairend_StrandSpecific_51mer_Human_hg19.bam
This is PairEnd Data
Fraction of reads failed to determine: 0.0072
Fraction of reads explained by "1++,1--,2+-,2-+": 0.9441
Fraction of reads explained by "1+-,1-+,2++,2--": 0.0487
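For 0.72% of the reads the strand could not be determined (for example, they map to genomic regions that have genes on both the plus and minus strands). Of the remaining 99.28% (1 - 0.0072 = 0.9928), the vast majority can be explained by "1++,1--,2+-,2-+". The strandness of reads is therefore largely determined by the strandness of transcripts, which indicates that this is a strand-specific dataset.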
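The reasoning applied to the two paired-end examples can be written as a small decision function. This is only a sketch: `classify_library` and its cutoffs (`stranded_cutoff=0.8`, `balanced_tolerance=0.1`) are assumptions chosen for illustration, not values defined by infer_experiment.py.

```python
def classify_library(failed, frac_first, frac_second,
                     stranded_cutoff=0.8, balanced_tolerance=0.1):
    """Classify a library from the three fractions printed in the report.

    failed:      "Fraction of reads failed to determine"
    frac_first:  fraction explained by the first rule listed
    frac_second: fraction explained by the second rule listed
    The two cutoffs are hypothetical and should be tuned to taste.
    """
    determined = 1.0 - failed
    if determined <= 0:
        return "undetermined"
    # Rescale to the share of reads whose strand could be determined.
    share_first = frac_first / determined
    share_second = frac_second / determined
    if share_first >= stranded_cutoff:
        return "strand-specific (first rule)"
    if share_second >= stranded_cutoff:
        return "strand-specific (second rule)"
    if abs(share_first - share_second) <= balanced_tolerance:
        return "non-strand-specific"
    return "ambiguous"


if __name__ == "__main__":
    # Numbers copied from the two paired-end reports above.
    print(classify_library(0.0172, 0.4903, 0.4925))
    print(classify_library(0.0072, 0.9441, 0.0487))
```

With these assumed cutoffs, the first report is labelled non-strand-specific and the second strand-specific under the first rule, matching the conclusions drawn in the text. Libraries that fall between the cutoffs are reported as ambiguous rather than forced into either class.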
Single-end, strand-specific
infer_experiment.py -r hg19.refseq.bed12 -i SingleEnd_StrandSpecific_36mer_Human_hg19.bam
This is SingleEnd Data
Fraction of reads failed to determine: 0.0170
Fraction of reads explained by "++,--": 0.9669
Fraction of reads explained by "+-,-+": 0.0161
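When many BAM files are checked, it can be handy to read these fractions programmatically. The sketch below parses a report with regular expressions, assuming the output has exactly the layout shown in the examples in this post; `parse_report` is a hypothetical helper, not part of RSeQC.

```python
import re

# Patterns assume the report layout shown in this post.
FAILED_RE = re.compile(r"Fraction of reads failed to determine:\s*([0-9.]+)")
RULE_RE = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def parse_report(text):
    """Extract the failed fraction and the per-rule fractions from a report."""
    failed = FAILED_RE.search(text)
    rules = {name: float(value) for name, value in RULE_RE.findall(text)}
    return {
        "failed": float(failed.group(1)) if failed else None,
        "rules": rules,
    }


# The single-end report from the example above.
REPORT = """This is SingleEnd Data
Fraction of reads failed to determine: 0.0170
Fraction of reads explained by "++,--": 0.9669
Fraction of reads explained by "+-,-+": 0.0161
"""

report = parse_report(REPORT)
dominant = max(report["rules"], key=report["rules"].get)
print(dominant, report["rules"][dominant])  # prints: ++,-- 0.9669
```

For this single-end report, about 96.7% of reads follow the "++,--" rule, so a read mapped to a given strand almost always comes from a gene on that same strand, consistent with a strand-specific library.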