Transcription factor ChIP-seq data analysis

Last published: 2023-02-07 21:02:24


生信小木屋

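ChIP-seq uses a Poisson distribution to test whether a given peak is enriched.
Null hypothesis: reads are randomly distributed, and the read count in a given genomic region follows a Poisson distribution, so the value of lambda (the expected read count in that region) can be calculated.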
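
As a rough illustration of this test, the sketch below estimates lambda from a uniform genome-wide background and computes the probability of seeing at least the observed number of reads in a window. The function names and all numbers are hypothetical, and the single genome-wide lambda is a simplification; peak callers such as MACS2 estimate lambda differently and work with much smaller p-values than this naive `1 - CDF` can resolve.

```python
import math


def expected_reads(total_reads: int, window_bp: int, genome_bp: int) -> float:
    """Lambda under the null: reads spread uniformly over the genome."""
    return total_reads * window_bp / genome_bp


def poisson_sf(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Accumulate P(X = 0), ..., P(X = k - 1) iteratively to avoid factorials.
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical example: 10 million reads, 200 bp window, 2 Gb genome.
lam = expected_reads(total_reads=10_000_000, window_bp=200, genome_bp=2_000_000_000)
observed = 10  # hypothetical read count in the candidate peak window
p_value = poisson_sf(observed, lam)
print(f"lambda = {lam:.2f}, P(X >= {observed}) = {p_value:.3e}")
```

A small p-value means the observed pile-up is unlikely under random read placement, so the null hypothesis is rejected and the region is treated as enriched.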

Software and workflow

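  • Quality control of raw sequencing files
    • FastQC
  • Removal of sequencing adapter sequences
    • cutadapt
  • Alignment of ChIP-seq reads
    • bowtie2, bwa mem
  • Sorting and indexing of BAM files
    • samtools
  • Duplicate removal from BAM files
    • picard
  • Peak calling
    • MACS2
  • Annotation of peak regions
    • Homer
  • Motif discovery in peak regions
    • MEME, DREME, STREME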
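
The preprocessing steps above (quality control through duplicate removal) can be sketched as a list of command lines. This is a minimal sketch under stated assumptions, not the author's pipeline: it assumes single-end reads, the file naming scheme and the `build_pipeline` helper are hypothetical, the adapter shown is the common Illumina prefix used only as a placeholder, and a `picard` wrapper script (as installed by conda) is assumed to be on `PATH`.

```python
import shlex
from typing import List


def build_pipeline(sample: str, bt2_index: str, adapter: str) -> List[List[str]]:
    """Build the commands for one single-end sample; nothing is executed."""
    raw = f"{sample}.fastq.gz"              # hypothetical input name
    trimmed = f"{sample}.trimmed.fastq.gz"
    sam = f"{sample}.sam"
    sorted_bam = f"{sample}.sort.bam"
    dedup_bam = f"{sample}.sort.rmdup.bam"
    metrics = f"{sample}.dup_metrics.txt"
    return [
        # 1. Quality control of the raw reads
        ["fastqc", raw],
        # 2. Trim the 3' adapter
        ["cutadapt", "-a", adapter, "-o", trimmed, raw],
        # 3. Align to the reference (bowtie2 index prefix)
        ["bowtie2", "-x", bt2_index, "-U", trimmed, "-S", sam],
        # 4. Coordinate-sort and index the alignments
        ["samtools", "sort", "-o", sorted_bam, sam],
        ["samtools", "index", sorted_bam],
        # 5. Remove duplicates
        ["picard", "MarkDuplicates", f"I={sorted_bam}", f"O={dedup_bam}",
         f"M={metrics}", "REMOVE_DUPLICATES=true"],
    ]


commands = build_pipeline("CTCF-untreat_rep1", "bt2_index/genome", "AGATCGGAAGAGC")
for cmd in commands:
    print(shlex.join(cmd))
```

Each inner list could be passed to `subprocess.run(cmd, check=True)` to execute the step; printing first makes it easy to review the commands before running them.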

Test data

SAMPLES = [
    "CTCF-untreat",
    "CTCF-auxin2days"
]

REP_INFO = [
    "rep1",
    "rep2"
]


rule all:
    input:
        expand("macs2_result/mESC-ChIPSeq-{case_name}_{rep}_peaks.narrowPeak", case_name=SAMPLES, rep=REP_INFO)
    

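# Call narrow peaks with MACS2 for each pull-down (PD) BAM, using the Input BAM
# of the same replicate as the control.
#   -g mm      effective genome size preset for mouse
#   -m 5 50    mfold range used to select regions for building the model
#   -q 0.005   q-value (FDR) cutoff
# Note: the BAM file names contain "hg38" while -g mm is the mouse preset;
# confirm that the genome size matches the reference actually used for alignment.
rule call_peak:
    input:
        Input = "bam.bt2/mESC-ChIPSeq-Input_{rep}_bt2_hg38_sort_rmdup_MAPQ20.bam",
        PD = "bam.bt2/mESC-ChIPSeq-{case_name}_{rep}_bt2_hg38_sort_rmdup_MAPQ20.bam"
    output:
        "macs2_result/mESC-ChIPSeq-{case_name}_{rep}_peaks.narrowPeak"
    params:
        name = "mESC-ChIPSeq-{case_name}_{rep}",
        outdir = "macs2_result"
    log:
        "macs2_result/mESC-ChIPSeq-{case_name}_{rep}_macs2_callpeak.log"
    shell:
        "macs2 callpeak -c {input.Input} -t {input.PD} -f BAM -g mm --outdir {params.outdir} -n {params.name} --call-summits -m 5 50 -q 0.005 > {log} 2>&1"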
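
To see which files `rule all` requests, the `expand(...)` call can be imitated in plain Python. This is only an illustrative stand-in: `expand_targets` is a hypothetical helper that takes the Cartesian product of the wildcard values, and it does not import Snakemake.

```python
from itertools import product

SAMPLES = ["CTCF-untreat", "CTCF-auxin2days"]
REP_INFO = ["rep1", "rep2"]
PATTERN = "macs2_result/mESC-ChIPSeq-{case_name}_{rep}_peaks.narrowPeak"


def expand_targets(pattern: str, **wildcards) -> list:
    """Fill the pattern with every combination of the wildcard values."""
    names = list(wildcards)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*wildcards.values())
    ]


targets = expand_targets(PATTERN, case_name=SAMPLES, rep=REP_INFO)
for path in targets:
    print(path)
```

Two samples times two replicates give four narrowPeak targets, so `call_peak` runs four times. Because the control path only contains `{rep}`, both samples of a given replicate share the same Input BAM.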

ChIPseeker: an R package for ChIP peak Annotation, Comparison and Visualization