variant discovery
Learning resources
gatk-tutorials
Different types of genomic variation
- Homozygous deletion
- Hemizygous deletion
- Heterozygous
- Gain (insertion): more copies of the region represent this DNA
Structural variation can include copy number variation, and read depth is the signal used to make that call. For example, when a piece of the reference does not exist in a particular sample, we see no read coverage over that region at all. You can also have a hemizygous deletion, where we see half the coverage we would expect, and you can have a gain, which is also a copy number alteration.
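The read-depth reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not how GATK's copy number tools work: the function name `classify_copy_number` and the ratio cutoffs (0.1, 0.75, 1.25) are assumptions chosen for the example, and real callers model coverage noise, GC bias, and segment boundaries.

```python
def classify_copy_number(observed_depth: float, expected_depth: float) -> str:
    """Label a region by comparing its read depth with the expected depth."""
    if expected_depth <= 0:
        raise ValueError("expected_depth must be positive")
    ratio = observed_depth / expected_depth
    # The cutoffs below are illustrative assumptions, not GATK defaults.
    if ratio < 0.1:
        return "homozygous deletion"  # almost no reads: region missing from both copies
    if ratio < 0.75:
        return "hemizygous deletion"  # about half the expected coverage: one copy lost
    if ratio <= 1.25:
        return "normal"               # coverage close to expectation
    return "gain"                     # more coverage than expected: extra copies


# Hypothetical depths for a sample sequenced at an expected depth of 30x.
for depth in (0, 15, 30, 46):
    print(depth, classify_copy_number(depth, expected_depth=30))
```

With an expected depth of 30x, the four hypothetical regions fall into the four categories listed in the notes: no coverage, half coverage, normal coverage, and excess coverage.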
GATK Available Programs
Base Calling
Tools that process sequencing machine data, e.g. Illumina base calls, and detect sequencing level attributes, e.g. adapters
Copy Number Variant Discovery
Tools that analyze read coverage to detect copy number variants.
Coverage Analysis
Tools that count coverage, e.g. depth per allele
Diagnostics and Quality Control
Tools that collect sequencing quality related and comparative metrics
Example Tools
Example tools that show developers how to implement new tools
Flow Based Tools
Tools designed specifically to operate on flow based data
Genotyping Arrays Manipulation
Tools that manipulate data generated by Genotyping arrays
Intervals Manipulation
Tools that process genomic intervals in various formats
Metagenomics
Tools that perform metagenomic analysis, e.g. microbial community composition and pathogen detection
Methylation-Specific Tools
Tools that perform methylation calling, processing bisulfite sequenced, methylation-aware aligned BAM
Other
Miscellaneous tools, e.g. those that aid in data streaming
Read Data Manipulation
Tools that manipulate read data in SAM, BAM or CRAM format
Reference
Tools that analyze and manipulate FASTA format references
Short Variant Discovery
Tools that perform variant calling and genotyping for short variants (SNPs, SNVs and Indels)
Structural Variant Discovery
Tools that detect structural variants
Variant Evaluation and Refinement
Tools that evaluate and refine variant calls, e.g. with annotations not offered by the engine
Variant Filtering
Tools that filter variants by annotating the FILTER column
Variant Manipulation
Tools that manipulate variant call format (VCF) data
germline / somatic mutation
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4502642/
We share most of our genome with other humans. Most of the sites that are variant in any individual sample are common variation, and they are biallelic. So the key requirement for this kind of variation is being able to compare it across many different samples.
Further reading
Calling genomic variants
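Nextflow pipeline
SNP calling and genotype calling are different. SNP calling only determines that a site in the genome is variant and says nothing about the genotype at that site. Genotype calling builds on SNP calling and goes on to determine the genotype at each variant site, including whether it is homozygous or heterozygous.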
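The distinction can be illustrated with a toy model. The sketch below assumes a simple binomial likelihood over reference and alternate read counts with a fixed per-read error rate of 0.01. The function names are hypothetical, and this is not the model GATK uses, which works from per-base qualities and local haplotypes.

```python
from math import comb


def genotype_likelihoods(ref_reads: int, alt_reads: int, error_rate: float = 0.01) -> dict:
    """Binomial likelihood of the observed read counts under each diploid genotype."""
    n = ref_reads + alt_reads
    # Assumed probability that a single read shows the alternate allele.
    p_alt = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}
    return {
        gt: comb(n, alt_reads) * p ** alt_reads * (1.0 - p) ** ref_reads
        for gt, p in p_alt.items()
    }


def call_genotype(ref_reads: int, alt_reads: int, error_rate: float = 0.01) -> str:
    """Genotype calling: pick the most likely genotype (0/0, 0/1 or 1/1)."""
    likelihoods = genotype_likelihoods(ref_reads, alt_reads, error_rate)
    return max(likelihoods, key=likelihoods.get)


def is_variant_site(ref_reads: int, alt_reads: int, error_rate: float = 0.01) -> bool:
    """SNP calling: report only whether the site is variant at all."""
    return call_genotype(ref_reads, alt_reads, error_rate) != "0/0"


# Hypothetical (ref, alt) read counts at three sites.
for ref, alt in ((20, 0), (10, 10), (0, 20)):
    print(ref, alt, is_variant_site(ref, alt), call_genotype(ref, alt))
```

`is_variant_site` answers the SNP-calling question with a yes or no, while `call_genotype` also separates heterozygous (`0/1`) from homozygous alternate (`1/1`) sites.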
Human genomic variation
About 3 billion sites in the human genome
Any two humans share about 99.5% of their DNA
Commonly variant sites are shared across people, and most of them are biallelic
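As a back-of-the-envelope check, the two figures above imply roughly how many sites differ between two people. The helper below is a hypothetical sketch that treats every position as an independent site and takes the rounded numbers from these notes at face value.

```python
from fractions import Fraction


def differing_sites(genome_sites: int, shared_percent: str) -> int:
    """Number of sites expected to differ, given the percentage of DNA shared."""
    # Fraction keeps the arithmetic exact (no floating-point rounding).
    differing_fraction = (100 - Fraction(shared_percent)) / 100
    return int(genome_sites * differing_fraction)


print(differing_sites(3_000_000_000, "99.5"))  # 15000000
```

So 0.5% of 3 billion sites is about 15 million sites that differ between any two individuals.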