gatk Mutect2 \
  -R $baseDir/ref/Homo_sapiens_assembly38.fasta \
  -I $baseDir/bams/tumor.bam \
  -I $baseDir/bams/normal.bam \
  -tumor HCC1143_tumor \
  -normal HCC1143_normal \
  -pon $baseDir/resources/chr17_m2pon.vcf.gz \
  --af-of-alleles-not-in-resource 0.0000025 \
  --germline-resource $baseDir/resources/chr17_af-only-gnomad_grch38.vcf.gz \
  -L $baseDir/resources/chr17plus.interval_list \
  -O $baseDir/sandbox/1_somatic_m2.vcf.gz \
  -bamout $baseDir/sandbox/2_tumor_normal_m2.bam
--input,-I
--output,-O
--reference,-R
--tumor-sample,-tumor
--normal-sample,-normal
--panel-of-normals,-pon
--germline-resource
--intervals,-L
--bam-output,-bamout
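Call somatic mutations using GATK4 Mutect2

Use --germline-resource to supply a population germline resource for annotating variants. The resource must contain allele-specific frequencies, that is, an AF annotation in the INFO column of the VCF. Mutect2 uses these population allele frequencies to annotate variant alleles.

When using a population germline resource, consider adjusting --af-of-alleles-not-in-resource from its default of 0.001. For example, the gnomAD resource af-only-gnomad_grch38.vcf.gz represents ~200k exomes and ~16k genomes. The tutorial data used above, /chr17_af-only-gnomad_grch38.vcf.gz, is exome data, so we set --af-of-alleles-not-in-resource to 0.0000025, which corresponds to 1/(2 × exome samples) = 1/(2 × 200,000).

The default of 0.001 is appropriate for human sample analyses without any population resource; it is based on the average human heterozygosity rate. The population allele frequencies (POP_AF) and af-of-alleles-not-in-resource both factor into the probability calculation for somatic variants.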
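The arithmetic behind the 0.0000025 value can be checked with a small sketch. The function name af_not_in_resource is hypothetical and not part of GATK; it only evaluates 1/(2 × N) for a resource of N samples, as described above.

```shell
# Hypothetical helper (not a GATK tool): allele frequency to assume for
# alleles absent from a germline resource built from N diploid samples.
# Formula from the text above: 1 / (2 * N).
af_not_in_resource() {
    awk -v n="$1" 'BEGIN { printf "%.7f\n", 1 / (2 * n) }'
}

# ~200k exomes, as in af-only-gnomad_grch38.vcf.gz
af_not_in_resource 200000   # prints 0.0000025
```

For a resource with a different sample count, pass that count instead and use the result as the --af-of-alleles-not-in-resource value.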
MuTect2 reassembly recovers the 120 base deletion haplotype
Somatic calls inferred from PairHMM likelihoods
= 5.3 in favor of the somatic variant genotype
Multiallelic calling in GATK4 Mutect2
If you don't have any normals, you can still run the pipeline, but you might get lots of false positives. If you are just trying to filter out common variants, you could use something like a panel of normals. The panel of normals is really helpful for removing sequencing artifacts.
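A run without a matched normal might look like the sketch below. It reuses the paths and sample name from the tumor-normal command at the top and simply drops the normal BAM and -normal. The output name tumor_only_m2.vcf.gz, the /path/to/project fallback, and the function name tumor_only_cmd are assumptions made for illustration. The function only prints the command so it can be reviewed first.

```shell
# Placeholder project root; the original command assumes $baseDir is already set.
baseDir="${baseDir:-/path/to/project}"

# Print a tumor-only Mutect2 command (no normal BAM, no -normal) for review.
# -tumor is kept only to mirror the tumor-normal command above.
tumor_only_cmd() {
    cat <<EOF
gatk Mutect2 \\
  -R $baseDir/ref/Homo_sapiens_assembly38.fasta \\
  -I $baseDir/bams/tumor.bam \\
  -tumor HCC1143_tumor \\
  -pon $baseDir/resources/chr17_m2pon.vcf.gz \\
  --germline-resource $baseDir/resources/chr17_af-only-gnomad_grch38.vcf.gz \\
  --af-of-alleles-not-in-resource 0.0000025 \\
  -L $baseDir/resources/chr17plus.interval_list \\
  -O $baseDir/sandbox/tumor_only_m2.vcf.gz
EOF
}

tumor_only_cmd
```

Once the printed command looks right, it can be executed with `tumor_only_cmd | sh`. Keeping -pon and --germline-resource matters more here than in the paired run, since without a normal they are the main ways to remove common germline variants and recurrent artifacts.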
https://gatk.broadinstitute.org/hc/en-us/articles/360035894731-Somatic-short-variant-discovery-SNVs-Indels-