ChIP-seq Data Analysis for Transcription Factors
Last published: 2023-02-07 21:02:24
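ChIP-seq uses a Poisson distribution to test whether a given peak is enriched
Null hypothesis: reads are randomly distributed, and the read count in a given genomic region follows a Poisson distribution, from which the value of lambda can be computed. A region is called enriched when its observed read count is very unlikely under this null.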
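To make the test concrete, the sketch below computes the upper-tail Poisson p-value for a candidate region. It is a minimal illustration, not MACS2's implementation: the function names `background_lambda` and `poisson_upper_tail`, the read count, the window size and the genome size are all assumptions chosen for the example, and lambda is estimated from the genome-wide background only.

```python
import math


def background_lambda(total_reads, window_bp, genome_bp):
    """Expected reads in a window if reads fall uniformly on the genome."""
    return total_reads * window_bp / genome_bp


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly over the upper tail."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k <= 0:
        return 1.0
    # First term P(X = k), computed in log space to avoid overflow.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    while True:
        total += term
        i += 1
        term *= lam / i  # P(X = i) obtained from P(X = i - 1)
        # Past the mode the terms shrink; stop once they no longer matter.
        if i > lam and term <= total * 1e-16:
            break
    return min(total, 1.0)


# Hypothetical numbers: 20 million reads, 200 bp window, ~1.87e9 bp genome.
lam = background_lambda(20_000_000, 200, 1.87e9)
p_value = poisson_upper_tail(30, lam)
print(f"lambda = {lam:.3f}, P(X >= 30) = {p_value:.3e}")
```

Summing the tail directly, rather than computing 1 minus the CDF, keeps very small p-values from being rounded to zero.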
Software and workflow
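- Quality control of raw sequencing files
  - FastQC
- Adapter trimming
  - cutadapt
- ChIP-seq read alignment
  - bowtie2, bwa mem
- Sorting and indexing BAM files
  - samtools
- Removing duplicates from BAM files
  - picard
- Peak calling
  - MACS2
- Peak annotation
  - Homer
- Motif discovery in peak regions
  - MEME, DREME, STREME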
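The steps up to the deduplicated BAM can be sketched as a list of shell commands. This is a hedged outline rather than the author's actual pipeline: it assumes single-end reads, a prebuilt bowtie2 index, the common Illumina adapter prefix `AGATCGGAAGAGC`, and a MAPQ >= 20 filter (suggested by the `MAPQ20` suffix in the BAM file names of the Snakefile further down). The function `preprocess_commands` and all file names are hypothetical.

```python
def preprocess_commands(sample, fastq, bt2_index, threads=4,
                        adapter="AGATCGGAAGAGC"):
    """Return the shell commands (as strings) that take one FASTQ file to a
    sorted, deduplicated, MAPQ-filtered BAM. Nothing is executed here."""
    trimmed = f"{sample}_trimmed.fastq.gz"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}_sort.bam"
    rmdup_bam = f"{sample}_sort_rmdup.bam"
    final_bam = f"{sample}_sort_rmdup_MAPQ20.bam"
    return [
        # 1. Quality control (the qc/ directory must already exist)
        f"fastqc -o qc {fastq}",
        # 2. Adapter trimming (adapter sequence is an assumption)
        f"cutadapt -a {adapter} -o {trimmed} {fastq}",
        # 3. Alignment of single-end reads
        f"bowtie2 -p {threads} -x {bt2_index} -U {trimmed} -S {sam}",
        # 4. Coordinate sort and index
        f"samtools sort -@ {threads} -o {sorted_bam} {sam}",
        f"samtools index {sorted_bam}",
        # 5. Duplicate removal
        f"picard MarkDuplicates I={sorted_bam} O={rmdup_bam} "
        f"M={sample}_dup_metrics.txt REMOVE_DUPLICATES=true",
        # 6. Keep alignments with MAPQ >= 20 (assumed cutoff), then index
        f"samtools view -b -q 20 -o {final_bam} {rmdup_bam}",
        f"samtools index {final_bam}",
    ]


# Placeholder sample, FASTQ and index names, for illustration only.
for cmd in preprocess_commands("CTCF-untreat_rep1",
                               "CTCF-untreat_rep1.fastq.gz",
                               "genome_index"):
    print(cmd)
```

Returning the commands as strings keeps the sketch side-effect free; they can be reviewed first and then passed to a shell or turned into workflow rules.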
Test data
SAMPLES = [
    "CTCF-untreat",
    "CTCF-auxin2days"
]
REP_INFO = [
    "rep1",
    "rep2"
]

# Target rule: one narrowPeak file per sample and replicate.
rule all:
    input:
        expand("macs2_result/mESC-ChIPSeq-{case_name}_{rep}_peaks.narrowPeak", case_name=SAMPLES, rep=REP_INFO)

# Call peaks with MACS2, using the Input BAM of the same replicate as control.
rule call_peak:
    input:
        Input = "bam.bt2/mESC-ChIPSeq-Input_{rep}_bt2_hg38_sort_rmdup_MAPQ20.bam",
        PD = "bam.bt2/mESC-ChIPSeq-{case_name}_{rep}_bt2_hg38_sort_rmdup_MAPQ20.bam"
    output:
        "macs2_result/mESC-ChIPSeq-{case_name}_{rep}_peaks.narrowPeak"
    params:
        name = "mESC-ChIPSeq-{case_name}_{rep}",
        outdir = "macs2_result"
    log:
        "macs2_result/mESC-ChIPSeq-{case_name}_{rep}_macs2_callpeak.log"
    shell:
        # -g mm: mouse effective genome size; -m 5 50: mfold range; -q 0.005: q-value cutoff
        "macs2 callpeak -c {input.Input} -t {input.PD} -f BAM -g mm --outdir {params.outdir} -n {params.name} --call-summits -m 5 50 -q 0.005 > {log} 2>&1"
ChIPseeker: an R package for ChIP peak Annotation, Comparison and Visualization
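Before passing peaks to an annotation tool, the `*_peaks.narrowPeak` files written by the `call_peak` rule can be inspected with a few lines of Python. The sketch below assumes the standard 10-column narrowPeak layout, in which column 9 holds -log10(q-value) and column 10 is the summit offset from the peak start. The `Peak` class, the `read_narrowpeak` helper, the stricter cutoff of q <= 1e-5 and the two sample lines are illustrative only, not real results.

```python
import io
import math
from typing import Iterator, NamedTuple, TextIO


class Peak(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str
    signal: float       # signalValue column (column 7)
    neg_log10_q: float  # qValue column (column 9), stored as -log10(q)
    summit: int         # absolute summit coordinate: start + column 10


def read_narrowpeak(handle: TextIO) -> Iterator[Peak]:
    """Yield one Peak per data line of a narrowPeak file."""
    for line in handle:
        if not line.strip() or line.startswith(("#", "track")):
            continue
        f = line.rstrip("\n").split("\t")
        start = int(f[1])
        yield Peak(f[0], start, int(f[2]), f[3],
                   float(f[6]), float(f[8]), start + int(f[9]))


# Made-up example lines standing in for a real MACS2 output file.
example = io.StringIO(
    "chr1\t1000\t1400\tdemo_peak_1\t250\t.\t12.5\t30.1\t27.3\t180\n"
    "chr1\t5000\t5300\tdemo_peak_2\t40\t.\t3.2\t4.0\t3.1\t150\n"
)

# Hypothetical stricter cutoff than the -q 0.005 used for peak calling.
threshold = -math.log10(1e-5)
strong = [p for p in read_narrowpeak(example) if p.neg_log10_q >= threshold]
for p in strong:
    print(p.chrom, p.summit, p.name, p.signal)
```

For a real file, replace the `io.StringIO` object with `open(path)` on one of the outputs under `macs2_result/`.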