Coursera Genomics Courses

Last published: 2023-08-01 21:14:41

Genomic Data Science Specialization
bilibili

Introduction to Genomic Technologies

This course introduces you to the basic biology of modern genomics and the experimental tools that we use to measure it. We'll introduce the Central Dogma of Molecular Biology and cover how next-generation sequencing can be used to measure DNA, RNA, and epigenetic patterns. You'll also get an introduction to the key concepts in computing and data science that you'll need to understand how data from next-generation sequencing experiments are generated and analyzed.
This is the first course in the Genomic Data Science Specialization.

Overview

In this Module, you can expect to study topics of "Just enough molecular biology", "The genome", "Writing a DNA sequence", "Central dogma", "Transcription", "Translation", and "DNA structure and modifications".

  • Why Genomics? (12 min)
  • What Is Genomics? (7 min)
  • What Is Genomic Data Science? (8 min)
  • Just Enough Cell Biology (8 min)
  • Important Molecules in Molecular Biology (7 min)
  • The Human Genome Project (17 min)
  • Molecular Biology Structures (9 min)
  • From Genes to Phenotypes (9 min)

Measurement Technology

In this module, you'll learn about polymerase chain reaction, next generation sequencing, and applications of sequencing.

  • Polymerase Chain Reaction (8 min)
  • Next Generation Sequencing (7 min)
  • Applications of Sequencing (8 min)

Computing Technology

The lectures for this module cover a few basic topics in computing technology. We'll go over the foundations of computer science, algorithms, memory and data structures, efficiency, software engineering, and computational biology software.

  • What Is Computer Science? (5 min)
  • Algorithms (4 min)
  • Memory and Data Structures (6 min)
  • Efficiency (3 min)
  • Software Engineering (8 min)
  • What Is Computational Biology Software? (11 min)

Data Science Technology

In this module on Data Science Technology, we'll be covering quite a lot of information about how to handle the data produced during the sequencing process. We'll cover reproducibility, analysis, statistics, question types, the central dogma of inference, analysis code, testing, prediction, variation, experimental design, confounding, power, sample size, correlation, causation, and degrees of freedom.

  • Why Care About Statistics? (4 min)
  • What Went Wrong? (4 min)
  • The Central Dogma of Statistics (3 min)
  • Data Sharing Plans (4 min)
  • Getting Help with Statistics (2 min)
  • Plotting Your Data (5 min)
  • Sample Size and Variability (7 min)
  • Statistical Significance (6 min)
  • Multiple Testing (7 min)
  • Study Design, Batch Effects, and Confounding (7 min)

Python for Genomic Data Science

This class provides an introduction to the Python programming language and the iPython notebook. This is the third course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Week One

This week we will have an overview of Python and take the first steps towards programming.

  • Lecture 1: Overview of Python (12 min)
  • Lecture 2.1: First Steps Toward Programming Part 1 (10 min)
  • Lecture 2.2: First Steps Toward Programming Part 2 (15 min)
  • Lecture 2.3: First Steps Toward Programming Part 3 (8:57)
  • Lecture 2.4: First Steps Toward Programming Part 4 (9:58)

Week Two

In this module, we'll be taking a look at Data Structures and Ifs and Loops.

  • Lecture 3.1: Data Structures Part 1 (11:58)
  • Lecture 3.2: Data Structures Part 2 (10:41)
  • Lecture 4.1: Ifs and Loops Part 1 (11:26)
  • Lecture 4.2: Ifs and Loops Part 2 (15:28)

Week Three

In this module, we have a long three-part lecture on Functions as well as a 10-minute look at Modules and Packages.

  • Lecture 5.1: Functions Part 1 (5:54)
  • Lecture 5.2: Functions Part 2 (8:20)
  • Lecture 5.3: Functions Part 3 (13:24)
  • Lecture 6: Modules and Packages (10:32)

Week Four

In this module, we have another long three-part lecture, this time about Communicating with the Outside, as well as a final lecture about Biopython.

  • Lecture 7.1: Communicating with the Outside Part 1 (6:41)
  • Lecture 7.2: Communicating with the Outside Part 2 (7:38)
  • Lecture 7.3: Communicating with the Outside Part 3 (17:42)
  • Lecture 8: Biopython (13:32)

Algorithms for DNA Sequencing

We will learn computational methods -- algorithms and data structures -- for analyzing DNA sequencing data. We will learn a little about DNA, genomics, and how DNA sequencing is used. We will use Python to implement key algorithms and data structures and to analyze real genomes and DNA sequencing datasets.

DNA sequencing, strings and matching

In this module we begin our exploration of algorithms for analyzing DNA sequencing data. We'll discuss DNA sequencing technology, its past and present, and how it works.

  • Module 1 Introduction (1 min)
  • Lecture: Why study this? (4 min)
  • Lecture: DNA sequencing past and present (3 min)
  • Lecture: Genomes as strings, reads as substrings (5 min)
  • Lecture: String definitions and Python examples (3 min)
  • Practical: String basics (7 min)
  • Practical: Manipulating DNA strings (7 min)
  • Practical: Downloading and parsing a genome (6 min)
  • Lecture: How DNA gets copied (3 min)
  • Optional lecture: How second-generation sequencers work (7 min)
  • Optional lecture: Sequencing errors and base qualities (6 min)
  • Lecture: Sequencing reads in FASTQ format (4 min)
  • Practical: Working with sequencing reads (11 min)
  • Practical: Analyzing reads by position (6 min)
  • Lecture: Sequencers give pieces to genomic puzzles (5 min)
  • Lecture: Read alignment and why it's hard (3 min)
  • Lecture: Naive exact matching (10 min)
  • Practical: Matching artificial reads (6 min)
  • Practical: Matching real reads (7 min)
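
As an illustration of the "naive exact matching" topic listed above, here is a minimal Python sketch. It is not taken from the course materials; the function name `naive_exact_match` and the toy strings are assumptions chosen for the example. The idea is simply to try every offset in the text and compare the pattern against the substring starting there.

```python
def naive_exact_match(pattern, text):
    """Return every offset at which pattern occurs exactly in text."""
    occurrences = []
    # Try each alignment of the pattern against the text, left to right.
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i + len(pattern)] == pattern:
            occurrences.append(i)
    return occurrences


# Hypothetical toy data: a short "genome" and a short "read".
genome = "ACGTACGTTACG"
read = "ACG"
print(naive_exact_match(read, genome))  # [0, 4, 9]
```

The worst-case cost grows with the product of the pattern and text lengths, which is one reason faster approaches such as Boyer-Moore and indexing are worth studying.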

Preprocessing, indexing and approximate matching

In this module, we learn useful and flexible new algorithms for solving the exact and approximate matching problems. We'll start by learning Boyer-Moore, a fast and very widely used algorithm for exact matching.

  • Week 2 Introduction (1 min)
  • Lecture: Boyer-Moore basics (8 min)
  • Lecture: Boyer-Moore: putting it all together (6 min)
  • Lecture: Diversion: Repetitive elements (5 min)
  • Practical: Implementing Boyer-Moore (10 min)
  • Lecture: Preprocessing (7 min)
  • Lecture: Indexing and the k-mer index (10 min)
  • Lecture: Ordered structures for indexing (8 min)
  • Lecture: Hash tables for indexing (7 min)
  • Practical: Implementing a k-mer index (7 min)
  • Lecture: Variations on k-mer indexes (9 min)
  • Lecture: Genome indexes used in research (9 min)
  • Lecture: Approximate matching, Hamming and edit distance (6 min)
  • Lecture: Pigeonhole principle (6 min)
  • Practical: Implementing the pigeonhole principle (9 min)
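
The k-mer index mentioned in this module can be sketched with a plain Python dictionary used as a hash table. This is only an illustrative sketch, not the course's implementation; `build_kmer_index`, `query_index`, and the toy sequence are hypothetical names and data. The index maps each length-k substring of the text to its offsets, and a query looks up the first k bases of the pattern and then verifies each candidate hit.

```python
from collections import defaultdict


def build_kmer_index(text, k):
    """Map every k-mer in text to the list of offsets where it occurs."""
    index = defaultdict(list)
    for i in range(len(text) - k + 1):
        index[text[i:i + k]].append(i)
    return index


def query_index(pattern, text, index, k):
    """Find exact occurrences of pattern using the index built with the same k."""
    hits = []
    # Index lookup gives candidate offsets; each one still needs verification.
    for i in index.get(pattern[:k], []):
        if text[i:i + len(pattern)] == pattern:
            hits.append(i)
    return hits


text = "GATTACAGATTACC"
index = build_kmer_index(text, 3)
print(query_index("GATTACC", text, index, 3))  # [7]
```

Here the index narrows the search to the two offsets where "GAT" occurs, and verification rejects the first one.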

Edit distance, assembly, overlaps

This week we finish our discussion of read alignment by learning about algorithms that solve both the edit distance problem and related biosequence analysis problems, like global and local alignment.

  • Module 3 Introduction (1 min)
  • Lecture: Solving the edit distance problem (12 min)
  • Lecture: Using dynamic programming for edit distance (12 min)
  • Practical: Implementing dynamic programming for edit distance (6 min)
  • Lecture: A new solution to approximate matching (9 min)
  • Lecture: Meet the family: global and local alignment (10 min)
  • Practical: Implementing global alignment (8 min)
  • Lecture: Read alignment in the field (4 min)
  • Lecture: Assembly: working from scratch (2 min)
  • Lecture: First and second laws of assembly (8 min)
  • Lecture: Overlap graphs (8 min)
  • Practical: Overlaps between pairs of reads (4 min)
  • Practical: Finding and representing all overlaps (3 min)
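
The dynamic programming approach to edit distance covered in this module can be sketched as follows. This is a generic textbook formulation written for illustration, with unit costs for substitution, insertion, and deletion assumed; the function name `edit_distance` is hypothetical and the code is not taken from the course.

```python
def edit_distance(x, y):
    """Return the minimum number of substitutions, insertions, and
    deletions needed to turn string x into string y."""
    rows, cols = len(x) + 1, len(y) + 1
    # D[i][j] holds the edit distance between x[:i] and y[:j].
    D = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        D[i][0] = i  # delete all i characters of x[:i]
    for j in range(cols):
        D[0][j] = j  # insert all j characters of y[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            mismatch = 0 if x[i - 1] == y[j - 1] else 1
            D[i][j] = min(
                D[i - 1][j - 1] + mismatch,  # match or substitution
                D[i - 1][j] + 1,             # deletion from x
                D[i][j - 1] + 1,             # insertion into x
            )
    return D[-1][-1]


print(edit_distance("kitten", "sitting"))  # 3
```

Filling the table takes time proportional to the product of the two string lengths, in contrast to the exponential cost of a naive recursive solution.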

Algorithms for assembly

In the last module we began our discussion of the assembly problem and saw a couple of basic principles behind it. In this module, we'll learn a few ways to solve the assembly problem.

  • Lecture: The shortest common superstring problem (8 min)
  • Practical: Implementing shortest common superstring (4 min)
  • Lecture: Greedy shortest common superstring (7 min)
  • Practical: Implementing greedy shortest common superstring (7 min)
  • Lecture: Third law of assembly: repeats are bad (5 min)
  • Lecture: De Bruijn graphs and Eulerian walks (8 min)
  • Practical: Building a De Bruijn graph (4 min)
  • Lecture: When Eulerian walks go wrong (9 min)
  • Lecture: Assemblers in practice (8 min)
  • Lecture: The future is long? (9 min)
  • Lecture: Computer science and life science (5 min)
  • Lecture: Thank yous (43 sec)
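
To make the De Bruijn graph topic from this module concrete, here is a minimal Python sketch. It is an illustration under simple assumptions (a single error-free input string, every k-mer taken once per occurrence), and `de_bruijn_graph` is a hypothetical name rather than the course's code. Each k-mer contributes one directed edge from its left (k-1)-mer to its right (k-1)-mer.

```python
def de_bruijn_graph(text, k):
    """Return (nodes, edges) of the De Bruijn graph built from the k-mers of text.

    Nodes are (k-1)-mers; each k-mer adds one directed edge
    from its prefix of length k-1 to its suffix of length k-1.
    """
    nodes = set()
    edges = []
    for i in range(len(text) - k + 1):
        left = text[i:i + k - 1]
        right = text[i + 1:i + k]
        nodes.update((left, right))
        edges.append((left, right))
    return nodes, edges


nodes, edges = de_bruijn_graph("ACGCGTCG", 3)
print(sorted(nodes))  # ['AC', 'CG', 'GC', 'GT', 'TC']
print(edges)
```

An Eulerian walk, which uses every edge exactly once, spells out a candidate reconstruction of the original string; repeats in the sequence are what make more than one such walk possible.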

Command Line Tools for Genomic Data Science

Introduces the commands that you need to manage and analyze directories, files, and large sets of genomic data. This is the fourth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Basic Unix Commands

In this module, you will be introduced to command line tools for genomic data science.

  • Basic Unix Commands 1: Content Representation (3 min)
  • Basic Unix Commands 2: Files, Directories, Paths (7 min)
  • Basic Unix Commands 3: File Naming (4 min)
  • Basic Unix Commands 4: Content Creation (9 min)
  • Basic Unix Commands 5: Accessing Content I (6 min)
  • Basic Unix Commands 6: Accessing Content II (4 min)
  • Basic Unix Commands 7: Redirecting Content (6 min)
  • Basic Unix Commands 8: Querying Content (15 min)
  • Basic Unix Commands 9: Comparing Content (11 min)
  • Basic Unix Commands 10: Archiving Content (13 min)
  • Basic Unix Commands 11: Practical Exercises I (13 min)
  • Basic Unix Commands 12: Practical Exercises II (9 min)

Week Two

In this module, we'll be taking a look at Sequences and Genomic Features in a sequence of 10 presentations.

  • Sequences and Genomic Features 1: Molecular Bio Primer (6 min)
  • Sequences and Genomic Features 2: Sequence Representation and Generation (11 min)
  • Sequences and Genomic Features 3: Annotation (14 min)
  • Sequences and Genomic Features 4.1: Alignment I (13 min)
  • Sequences and Genomic Features 4.2: Alignment II (9 min)
  • Sequences and Genomic Features 5: Recreating Sequences & Features (12 min)
  • Sequences and Genomic Features 6: Genomic Feature Retrieval (5 min)
  • Sequences and Genomic Features 7: SAMtools I (11 min)
  • Sequences and Genomic Features 8: SAMtools II (9 min)
  • Sequences and Genomic Features 9: BEDtools I (15 min)
  • Sequences and Genomic Features 10: BEDtools II (15 min)

Week Three

In this module, we'll be going over Alignment and Sequence Variation in another sequence of 8 presentations.

  • Alignment & Sequence Variation 1: Overview (4 min)
  • Alignment & Sequence Variation 2: Alignment & Variant Detection Tools (5 min)
  • Alignment & Sequence Variation 3: VCF (11 min)
  • Alignment & Sequence Variation 4: Bowtie (9 min)
  • Alignment & Sequence Variation 5: BWA (4 min)
  • Alignment & Sequence Variation 6: SAMtools (mpileup) (6 min)
  • Alignment & Sequence Variation 7: BCFtools (8 min)
  • Alignment & Sequence Variation 8: Variant Calling (5 min)

Week Four

In this module, we'll be going over Tools for Transcriptomics in a sequence of 6 presentations.

  • Tools for Transcriptomics 1: Overview (6 min)
  • Tools for Transcriptomics 2: RNA-seq (7 min)
  • Tools for Transcriptomics 3.1: Tophat I (9 min)
  • Tools for Transcriptomics 3.2: Tophat II (8 min)
  • Tools for Transcriptomics 4: Cufflinks (10 min)
  • Tools for Transcriptomics 5: Cuffdiff (16 min)
  • Tools for Transcriptomics 6.1: Integrated Genomics Viewer I (8 min)
  • Tools for Transcriptomics 6.2: Integrated Genomics Viewer II (6 min)

Bioconductor for Genomic Data Science

Learn to use tools from the Bioconductor project to perform analysis of genomic data. This is the fifth course in the Genomic Big Data Specialization from Johns Hopkins University.

Week One

The class will cover how to install and use Bioconductor software. We will discuss common data structures, including ExpressionSets, SummarizedExperiment and GRanges used across several types of analyses.

  • Installing R on Windows (3 min)
  • Installing R on a Mac (2 min)
  • Installing RStudio on a Mac (1 min)
  • What is Bioconductor (7 min)
  • Installing Bioconductor (3 min)
  • The Bioconductor Website (9 min)
  • Useful Online Resources (5 min)
  • R Base Types (18 min)
  • GRanges - Overview (4 min)
  • IRanges - Basic Usage (12 min)
  • GenomicRanges - GRanges (8 min)
  • GenomicRanges - Basic GRanges Usage (8 min)
  • GenomicRanges - seqinfo (6 min)
  • AnnotationHub (8 min)
  • Usecase: AnnotationHub and GRanges, part 1 (12 min)
  • Usecase: AnnotationHub and GRanges, part 2 (13 min)

Week Two

In this week we will learn how to represent and compute on biological sequences, both at the whole-genome level and at the level of millions of short reads.

  • Biostrings (7 min)
  • BSgenome (6 min)
  • Biostrings - Matching (6 min)
  • BSgenome - Views (9 min)
  • GenomicRanges - Rle (12 min)
  • GenomicRanges - Lists (6 min)
  • GenomicFeatures (18 min)
  • rtracklayer - Data Import (14 min)

Week Three

In this week we will cover Basic Data Types, ExpressionSet, biomaRt, and R S4.

  • Basic Data Types (4 min)
  • Annotation Overview (4 min)
  • ExpressionSet Overview (4 min)
  • ExpressionSet (9 min)
  • SummarizedExperiment (7 min)
  • GEOquery (6 min)
  • biomaRt (13 min)
  • R S4 Classes (16 min)
  • R S4 Methods (10 min)

Week Four

In this week, we will cover getting data into Bioconductor, Rsamtools, oligo, limma, and minfi.

  • Getting data into Bioconductor (6 min)
  • ShortRead (4 min)
  • Rsamtools (12 min)
  • oligo (14 min)
  • limma (16 min)
  • minfi (11 min)
  • Count-based RNA-seq analysis (15 min)

Statistics for Genomic Data Science

An introduction to the statistics behind the most popular genomic data science projects. This is the sixth course in the Genomic Big Data Science Specialization from Johns Hopkins University.

Module 1

This course is structured to hit the key conceptual ideas of normalization, exploratory analysis, linear modeling, testing, and multiple testing that arise over and over in genomic studies.

  • Welcome to Statistics for Genomic Data Science (2 min)
  • What is Statistics? (2 min)
  • Finding Statistics You Can Trust (4:44)
  • Getting Help (3:44)
  • What is Data? (4:28)
  • Representing Data (5:23)
  • Module 1 Overview (1:07)
  • Reproducible Research (3:42)
  • Achieving Reproducible Research (5:02)
  • R Markdown (6:26)
  • The Three Tables in Genomics (2:10)
  • The Three Tables in Genomics (in R) (3:46)
  • Experimental Design: Variability, Replication, and Power (14:17)
  • Experimental Design: Confounding and Randomization (9:26)
  • Exploratory Analysis (9:21)
  • Exploratory Analysis in R Part I (7:22)
  • Exploratory Analysis in R Part II (10:07)
  • Exploratory Analysis in R Part III (7:26)
  • Data Transforms (7:31)
  • Clustering (8:43)
  • Clustering in R (9:09)

Module 2

This week we will cover preprocessing, linear modeling, and batch effects.

  • Module 2 Overview (1:12)
  • Dimension Reduction (12:13)
  • Dimension Reduction (in R) (8:48)
  • Pre-processing and Normalization (11:26)
  • Quantile Normalization (in R) (4:49)
  • The Linear Model (6:50)
  • Linear Models with Categorical Covariates (4:08)
  • Adjusting for Covariates (4:16)
  • Linear Regression in R (13:03)
  • Many Regressions at Once (3:50)
  • Many Regressions in R (7:21)
  • Batch Effects and Confounders (7:11)
  • Batch Effects in R: Part A (8:18)
  • Batch Effects in R: Part B (3:50)

Module 3

This week we will cover modeling non-continuous outcomes (like binary or count data), hypothesis testing, and multiple hypothesis testing.

  • Module 3 Overview (1:07)
  • Logistic Regression (7:03)
  • Regression for Counts (5:02)
  • GLMs in R (9:28)
  • Inference (4:18)
  • Null and Alternative Hypotheses (4:45)
  • Calculating Statistics (5:11)
  • Comparing Models (7:08)
  • Calculating Statistics in R (9 min)
  • Permutation (3:26)
  • Permutation in R (3:33)
  • P-values (6:04)
  • Multiple Testing (8:25)
  • P-values and Multiple Testing in R: Part A (5:58)
  • P-values and Multiple Testing in R: Part B (4:23)
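
To give a flavor of multiple testing correction, the sketch below computes Benjamini-Hochberg adjusted p-values, one widely used procedure for controlling the false discovery rate. The course itself works in R, and this listing does not say which correction methods its lectures use, so treat the choice of method, the Python implementation, the function name `benjamini_hochberg`, and the example p-values as assumptions made for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A test is then called significant at false discovery rate q when its adjusted p-value is at most q. In R, the built-in `p.adjust` function with `method = "BH"` performs the same adjustment.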

Module 4

In this week we will cover a lot of the general pipelines people use to analyze specific data types like RNA-seq, GWAS, ChIP-Seq, and DNA Methylation studies.

  • Module 4 Overview (1:21)
  • Gene Set Enrichment (4:19)
  • More Enrichment (3:59)
  • Gene Set Analysis in R (7:43)
  • The Process for RNA-seq (3:59)
  • The Process for ChIP-Seq (5:25)
  • The Process for DNA Methylation (5:03)
  • The Process for GWAS/WGS (6:12)
  • Combining Data Types (eQTL) (6:04)
  • eQTL in R (10:36)
  • Researcher Degrees of Freedom (5:49)
  • Inference vs. Prediction (8:52)
  • Knowing When to Get Help (2:31)
  • Statistics for Genomic Data Science Wrap-Up (1:53)