
TopHat

A spliced read mapper for RNA-Seq. It aligns reads back to the genome.

Options

Step 1: Build the bowtie2 index files

apt-cache search bowtie2
apt-get install bowtie2
bowtie2-build Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa  genome
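
TopHat fails late if the index is incomplete, so it can be worth checking the index before moving on. The following is a minimal sketch, assuming a small (.bt2) index with the prefix genome as built above; the function name index_complete is made up for illustration.

```shell
#!/bin/sh
# Check that a bowtie2 index prefix has all six index files.
# Assumption: a small (.bt2) index; very large genomes get .bt2l files instead.
index_complete() {
    prefix=$1
    for suffix in 1 2 3 4 rev.1 rev.2; do
        if [ ! -s "$prefix.$suffix.bt2" ]; then
            echo "missing or empty: $prefix.$suffix.bt2" >&2
            return 1
        fi
    done
    return 0
}

# Hypothetical usage: only start TopHat once the index is in place.
# index_complete ../data/reference/genome && echo "index OK"
```

The check only looks at file presence and size; it does not validate the index contents.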

Link to the official bowtie site

Step 2: Align reads to the genome with TopHat

wget http://ccb.jhu.edu/software/tophat/downloads/tophat-2.1.1.Linux_x86_64.tar.gz # standard Linux software installation
tophat  -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf  -o EV_3  ../data/reference/genome ../data/RAN_seq/SRR1916152.fastq

Link to the official TopHat site

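Cufflinks

Transcript assembly, differential expression, and differential regulation for RNA-seq. It assembles transcripts, estimates expression levels, and tests for differential expression.

Options

Step 3: Assemble transcripts with Cufflinks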

cufflinks -p 8 -u -o EV_3_count EV_3/accepted_hits.bam

Link to the official Cufflinks site

Cuffmerge

Merges together several Cufflinks assemblies. It combines the transcripts from all samples into a single transcript set.

Options

Step 4: Merge transcripts with Cuffmerge

cuffmerge -g ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -s ../data/reference/Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa -o cuffmerge_out assemblies.txt
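
The notes above do not show how assemblies.txt was created. cuffmerge reads it as a plain list of GTF paths, one per line. A minimal sketch for generating it, assuming the Cufflinks output directories end in _count as in Step 3 (the function name make_assembly_list is made up for illustration):

```shell
#!/bin/sh
# Write one transcripts.gtf path per line for every Cufflinks output directory.
# Assumption: output directories end in "_count", as in the cufflinks command above.
make_assembly_list() {
    out=$1
    : > "$out"                      # start with an empty list
    for dir in ./*_count; do
        gtf="$dir/transcripts.gtf"
        [ -f "$gtf" ] && echo "$gtf" >> "$out"
    done
    [ -s "$out" ]                   # fail if no assemblies were found
}

# Usage, from the directory that holds the *_count folders:
# make_assembly_list assemblies.txt
```

Directories without a transcripts.gtf are skipped, so an interrupted Cufflinks run does not end up in the list.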

Cuffdiff

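Options

Step 5: Compute expression levels across samples with Cuffdiff

cuffdiff -o diff_out -p 8 -L C1,C2 -u <GTF file produced by cuffmerge> <condition 1: BAM file(s) produced by tophat> <condition 2: BAM file(s) produced by tophat>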
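
cuffdiff separates conditions with spaces and the replicates within one condition with commas, which is easy to get wrong by hand. A small sketch that builds the two replicate lists from the TopHat output directories used in this walkthrough and prints the resulting command; join_bams is a hypothetical helper, not part of Cufflinks.

```shell
#!/bin/sh
# Join the accepted_hits.bam files of several TopHat output directories with
# commas, the form cuffdiff expects for replicates of one condition.
join_bams() {
    list=""
    for dir in "$@"; do
        list="${list:+$list,}$dir/accepted_hits.bam"
    done
    echo "$list"
}

ev=$(join_bams EV_3 EV_4)
dnmt3b=$(join_bams DNMT3B_2 DNMT3B_3 DNMT3B_4)

# Print the command instead of running it, so it can be reviewed first.
echo "cuffdiff -o diff_out -p 8 -L EV,DNMT3B -u cuffmerge_out/merged.gtf $ev $dnmt3b"
```

The order of the labels passed to -L has to match the order of the BAM groups on the command line.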

CummeRbund

An R package for analyzing Cufflinks RNA-Seq output.

library(cummeRbund)
setwd("/root/yeast_data/analy/")
cuff_data <- readCufflinks('diff_out', rebuild=TRUE)  # the argument is spelled rebuild (all lowercase)

Density plot

Density distribution of expression levels; genes(cuff_data) selects the gene-level values.

csDensity(genes(cuff_data))

Scatter plot

Expression levels under the two conditions, plotted against each other.

csScatter(genes(cuff_data),"EV","DNMT3B")

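Volcano plot

The volcano plot shows whether gene expression differs significantly between conditions.
The x-axis is the log2 fold change in expression between the two conditions.
The y-axis is the negative log10 of the p-value from the significance test.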

csVolcano(genes(cuff_data),"EV","DNMT3B", showSignificant=T,alpha=0.05)
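
To see the numbers behind the plot, the two axes can be recomputed from diff_out/gene_exp.diff on the command line. This is a rough sketch, assuming the usual tab-separated layout of that file with status in column 7, log2(fold_change) in column 10 and p_value in column 12; volcano_coords is a made-up name, and it is worth checking the header line of your own file before relying on the column numbers.

```shell
#!/bin/sh
# Print gene, log2 fold change (x-axis) and -log10(p) (y-axis) for tested genes.
# Assumption: tab-separated gene_exp.diff with status in column 7,
# log2(fold_change) in column 10 and p_value in column 12.
volcano_coords() {
    awk -F '\t' 'NR > 1 && $7 == "OK" && $12 > 0 {
        printf "%s\t%s\t%.3f\n", $3, $10, -log($12) / log(10)
    }' "$1"
}

# Usage: volcano_coords diff_out/gene_exp.diff | head
```

Genes whose test status is not OK are skipped, since they carry no meaningful p-value.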

Bar plot

myGene <- getGene(cuff_data,'ETS1-1')
expressionBarplot(myGene)
expressionBarplot(isoforms(myGene))

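Directory layout

The first tree is the data directory (../data in the commands above); the second is the analysis directory where the commands are run.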

.
├── RAN_seq
│   ├── SRR1916152
│   ├── SRR1916152.fastq
│   ├── SRR1916152_fastqc.html
│   ├── SRR1916152_fastqc.zip
│   ├── SRR1916153
│   ├── SRR1916153.fastq
│   ├── SRR1916153_fastqc.html
│   ├── SRR1916153_fastqc.zip
│   ├── SRR1916154
│   ├── SRR1916154.fastq
│   ├── SRR1916154_fastqc.html
│   ├── SRR1916154_fastqc.zip
│   ├── SRR1916155
│   ├── SRR1916155.fastq
│   ├── SRR1916155_fastqc.html
│   ├── SRR1916155_fastqc.zip
│   ├── SRR1916156
│   ├── SRR1916156.fastq
│   ├── SRR1916156_fastqc.html
│   ├── SRR1916156_fastqc.zip
│   ├── SraAccList.txt
│   ├── SraRunInfo.csv
│   └── cmd2_sra2fastq.sh
├── raw_data
│   ├── SRR1916152.sra
│   ├── SRR1916153.sra
│   ├── SRR1916154.sra
│   ├── SRR1916155.sra
│   ├── SRR1916156.sra
│   ├── cmd1_download_sra.sh
│   └── cmd2_sra2fastq.sh
└── reference
    ├── Saccharomyces_cerevisiae.R64-1-1.48.gtf
    ├── Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa
    ├── genome.1.bt2
    ├── genome.2.bt2
    ├── genome.3.bt2
    ├── genome.4.bt2
    ├── genome.rev.1.bt2
    └── genome.rev.2.bt2
.
├── DNMT3B_2
│   ├── accepted_hits.bam
│   ├── align_summary.txt
│   ├── deletions.bed
│   ├── insertions.bed
│   ├── junctions.bed
│   ├── logs
│   ├── prep_reads.info
│   └── unmapped.bam
├── DNMT3B_2_count
│   ├── genes.fpkm_tracking
│   ├── isoforms.fpkm_tracking
│   ├── skipped.gtf
│   └── transcripts.gtf
├── DNMT3B_3
│   ├── accepted_hits.bam
│   ├── align_summary.txt
│   ├── deletions.bed
│   ├── insertions.bed
│   ├── junctions.bed
│   ├── logs
│   ├── prep_reads.info
│   └── unmapped.bam
├── DNMT3B_3_count
│   ├── genes.fpkm_tracking
│   ├── isoforms.fpkm_tracking
│   ├── skipped.gtf
│   └── transcripts.gtf
├── DNMT3B_4
│   ├── accepted_hits.bam
│   ├── align_summary.txt
│   ├── deletions.bed
│   ├── insertions.bed
│   ├── junctions.bed
│   ├── logs
│   ├── prep_reads.info
│   └── unmapped.bam
├── DNMT3B_4_count
│   ├── genes.fpkm_tracking
│   ├── isoforms.fpkm_tracking
│   ├── skipped.gtf
│   └── transcripts.gtf
├── EV_3
│   ├── accepted_hits.bam
│   ├── align_summary.txt
│   ├── deletions.bed
│   ├── insertions.bed
│   ├── junctions.bed
│   ├── logs
│   ├── prep_reads.info
│   └── unmapped.bam
├── EV_3_count
│   ├── genes.fpkm_tracking
│   ├── isoforms.fpkm_tracking
│   ├── skipped.gtf
│   └── transcripts.gtf
├── EV_4
│   ├── accepted_hits.bam
│   ├── align_summary.txt
│   ├── deletions.bed
│   ├── insertions.bed
│   ├── junctions.bed
│   ├── logs
│   ├── prep_reads.info
│   └── unmapped.bam
├── EV_4_count
│   ├── genes.fpkm_tracking
│   ├── isoforms.fpkm_tracking
│   ├── skipped.gtf
│   └── transcripts.gtf
├── assemblies.txt
├── cmd_cuffdiff.sh
├── cmd_cufflink.sh
├── cmd_cuffmerge.sh
├── cmd_tophat.sh
├── cuffdiff.log
├── cuffmerge_out
│   ├── logs
│   └── merged.gtf
└── diff_out
    ├── bias_params.info
    ├── cds.count_tracking
    ├── cds.diff
    ├── cds.fpkm_tracking
    ├── cds.read_group_tracking
    ├── cds_exp.diff
    ├── gene_exp.diff
    ├── genes.count_tracking
    ├── genes.fpkm_tracking
    ├── genes.read_group_tracking
    ├── isoform_exp.diff
    ├── isoforms.count_tracking
    ├── isoforms.fpkm_tracking
    ├── isoforms.read_group_tracking
    ├── promoters.diff
    ├── read_groups.info
    ├── run.info
    ├── splicing.diff
    ├── tss_group_exp.diff
    ├── tss_groups.count_tracking
    ├── tss_groups.fpkm_tracking
    ├── tss_groups.read_group_tracking
    └── var_model.info

Running as shell scripts

# cmd_tophat.sh
tophat -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -o EV_3 ../data/reference/genome ../data/RAN_seq/SRR1916152.fastq
tophat -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -o EV_4 ../data/reference/genome ../data/RAN_seq/SRR1916153.fastq
tophat -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -o DNMT3B_2 ../data/reference/genome ../data/RAN_seq/SRR1916154.fastq
tophat -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -o DNMT3B_3 ../data/reference/genome ../data/RAN_seq/SRR1916155.fastq
tophat -p 16 -G ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -o DNMT3B_4 ../data/reference/genome ../data/RAN_seq/SRR1916156.fastq
# cmd_cufflink.sh: the five jobs run in parallel; wait blocks until all of them have finished,
# otherwise the script returns at once and cuffmerge starts on incomplete assemblies
cufflinks -p 8 -u -o EV_3_count EV_3/accepted_hits.bam &
cufflinks -p 8 -u -o EV_4_count EV_4/accepted_hits.bam &
cufflinks -p 8 -u -o DNMT3B_2_count DNMT3B_2/accepted_hits.bam &
cufflinks -p 8 -u -o DNMT3B_3_count DNMT3B_3/accepted_hits.bam &
cufflinks -p 8 -u -o DNMT3B_4_count DNMT3B_4/accepted_hits.bam &
wait
# cmd_cuffmerge.sh
cuffmerge -g ../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf -s ../data/reference/Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa -o cuffmerge_out assemblies.txt
# cmd_cuffdiff.sh
cuffdiff -o diff_out -p 8 -L EV,DNMT3B -u cuffmerge_out/merged.gtf ./EV_3/accepted_hits.bam,./EV_4/accepted_hits.bam ./DNMT3B_2/accepted_hits.bam,./DNMT3B_3/accepted_hits.bam,./DNMT3B_4/accepted_hits.bam
# Run the four scripts in order; sh -c puts the whole chain under nohup and sends all output to one log
nohup sh -c './cmd_tophat.sh && ./cmd_cufflink.sh && ./cmd_cuffmerge.sh && ./cmd_cuffdiff.sh' > RANSeq.log 2>&1 &
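
The five tophat calls differ only in the output directory and the run accession, so they can also be generated from a small table. A sketch of that idea, with the sample-to-run pairs copied from the commands above; the function name tophat_commands and the variable names are made up for illustration, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Print one tophat command per sample from a small sample table.
# The sample/run pairs are copied from the tophat commands in this post.
GTF=../data/reference/Saccharomyces_cerevisiae.R64-1-1.48.gtf
INDEX=../data/reference/genome
FASTQ_DIR=../data/RAN_seq

tophat_commands() {
    while read -r sample run; do
        [ -n "$sample" ] || continue      # skip blank lines
        echo "tophat -p 16 -G $GTF -o $sample $INDEX $FASTQ_DIR/$run.fastq"
    done <<EOF
EV_3 SRR1916152
EV_4 SRR1916153
DNMT3B_2 SRR1916154
DNMT3B_3 SRR1916155
DNMT3B_4 SRR1916156
EOF
}

tophat_commands          # review the output, then pipe it to sh to run it
```

Keeping the table in one place makes it harder to pair a sample name with the wrong FASTQ file when more runs are added.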