SNPs and Human Diseases XV, Robert Kraaij
NGS technologies • DepthOfCoverage • SNPs and Human Diseases XV • Robert Kraaij, Department of Internal Medicine • r.kraaij@erasmusmc.nl
What will NGS bring us? • RFLP • TaqMan • Array • Array and Imputation • Regional Sequencing • Full Genome Sequencing
• First Generation: a bit of history • Next (Second) Generation • Third Generation
1977: Maxam & Gilbert Sequencing (image: Walter Gilbert, from wikipedia.org)
Maxam & Gilbert Sequencing • gel lanes: G, G+A, C+T, C
1977: Sanger Sequencing (image: Frederick Sanger, from wikipedia.org)
Sanger Sequencing • gel lanes: G, A, T, C
Sanger sequencing landmarks • 1977: bacteriophage φX174, 5.4 kb • 1984: Epstein-Barr virus, 170 kb • 1995: Haemophilus influenzae, 1.8 Mb • 2001: Human, 3 Gb (from wikipedia.org)
The Human Genome Project • Bill Clinton, Tony Blair, Craig Venter, Francis Collins • June 26th, 2000: working draft, 95% sequenced • April 14th, 2003: finished, 99% sequenced • Costs: $2.7 billion (instead of $3 billion) • Timing: 1990 - 2003 (instead of 2005)
Next Generation: Illumina
Sequencing Workflow • DNA isolation • Library preparation • Sequencing • Data analysis
Illumina sequencing • fragment DNA • clonal amplification on flow cell by bridge PCR • sequencing-by-synthesis
Bridge amplification
Sequencing by synthesis
Per Cycle Imaging • one image per base: G, A, T, C
Per Cycle Base Calling • G (good quality) • G (poor quality)
Quality Scoring
Phred Score | Incorrect base | Accuracy
10 | 1 in 10 | 90 %
20 | 1 in 100 | 99 %
30 | 1 in 1000 | 99.9 %
40 | 1 in 10000 | 99.99 %
50 | 1 in 100000 | 99.999 %
Phred scores 0 to 93 are stored as ASCII 33 to 126 = a single character per base
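The rows of the quality table follow from the standard Phred definition Q = -10 x log10(P), where P is the probability that the base call is wrong. A minimal sketch of that relation and of the ASCII 33 offset mentioned on the slide; the function names are made up for this illustration.

```python
import math

ASCII_OFFSET = 33  # score 0 is '!' (ASCII 33), score 93 is '~' (ASCII 126)


def phred_to_error_prob(q):
    """Probability that a base with Phred score q is wrong: P = 10^(-q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Inverse relation: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def phred_to_char(q):
    """Encode one Phred score as the single character used in a FASTQ file."""
    if not 0 <= q <= 93:
        raise ValueError("Phred score must be between 0 and 93")
    return chr(q + ASCII_OFFSET)


def char_to_phred(c):
    """Decode one FASTQ quality character back to its Phred score."""
    return ord(c) - ASCII_OFFSET


if __name__ == "__main__":
    for q in (10, 20, 30, 40, 50):
        p = phred_to_error_prob(q)
        print(f"Q{q}: 1 in {round(1 / p)} incorrect, "
              f"accuracy {100 * (1 - p):.3f} %, character {phred_to_char(q)!r}")
```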
FASTQ File
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTC
+SEQ_ID
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>
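A FASTQ record has four lines: an `@` header, the bases, a `+` separator, and one quality character per base. The sketch below parses the record from the slide and decodes its quality string with the ASCII 33 offset; `parse_fastq_record` is a hypothetical helper written for this illustration, not part of any library.

```python
def parse_fastq_record(lines):
    """Parse one four-line FASTQ record into (read_id, sequence, scores)."""
    header, sequence, separator, quality = (line.strip() for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # One Phred score per base, assuming the ASCII 33 offset.
    scores = [ord(char) - 33 for char in quality]
    return header[1:], sequence, scores


record = [
    "@SEQ_ID",
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTC",
    "+SEQ_ID",
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>",
]

read_id, sequence, scores = parse_fastq_record(record)

if __name__ == "__main__":
    print(read_id, len(sequence), "bases")
    print("lowest score:", min(scores), "highest score:", max(scores))
```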
Alignment or Mapping of Reads • reference genome (HG19): GATTACGGTACTTGCATAGCT • read: TACGGTACTTGCATA • result per read: chromosome + position + strand • sample.bam
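Mapping assigns each read a chromosome, a position, and a strand. A toy sketch of that idea using exact string matching on the sequences from the slide; real aligners use indexed search and tolerate mismatches and gaps, and the chromosome name `chrToy` is invented for this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def map_read(reference, read, chromosome="chrToy"):
    """Exact-match mapping: (chromosome, 1-based position, strand) or None."""
    pos = reference.find(read)
    if pos != -1:
        return chromosome, pos + 1, "+"
    # Not found on the forward strand, so try the reverse strand.
    pos = reference.find(reverse_complement(read))
    if pos != -1:
        return chromosome, pos + 1, "-"
    return None


reference = "GATTACGGTACTTGCATAGCT"
forward_read = "TACGGTACTTGCATA"
reverse_read = reverse_complement(forward_read)

if __name__ == "__main__":
    print(map_read(reference, forward_read))
    print(map_read(reference, reverse_read))
    print(map_read(reference, "CCCCCCCC"))
```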
Run QC and filtering (sample.bam)
Sorted BAM file (sample.bam) • both reads • quality scores • chromosome • position • quality flag • duplicate flag • off-target flag
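In the SAM/BAM format, flags such as "duplicate" or "failed quality checks" are stored as bits of one integer FLAG field per read. A small sketch that decodes a few of those bits; the bit values are the ones defined in the SAM specification, while `decode_flag` is a hypothetical helper. The off-target flag on the slide has no standard bit and is not modeled here.

```python
# A subset of the FLAG bits defined in the SAM specification.
FLAG_BITS = {
    0x1: "paired",
    0x4: "unmapped",
    0x10: "reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x200: "failed quality checks",
    0x400: "duplicate",
}


def decode_flag(flag):
    """List the names of all known bits that are set in a FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


if __name__ == "__main__":
    # 1105 = 1024 + 64 + 16 + 1
    print(decode_flag(1105))
```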
Coverage • reads: TTACGGTACTTGCAT, GGTACTTGCATAGCT, GATTACGGTACTTGCA, CGGTACTTGCATAG, TACGGTACTTGCATA • reference: GATTACGGTACTTGCATAGCT • 5x coverage
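Coverage is the number of reads that overlap a given reference position. The sketch below counts it for the five reads and the reference on the slide, placing each read by exact matching (an assumption that only works for error-free toy reads like these).

```python
def depth_per_position(reference, reads):
    """Count how many reads cover each reference position (exact matches only)."""
    depth = [0] * len(reference)
    for read in reads:
        start = reference.find(read)
        if start == -1:
            continue  # read does not match the reference exactly; skip it
        for i in range(start, start + len(read)):
            depth[i] += 1
    return depth


reference = "GATTACGGTACTTGCATAGCT"
reads = [
    "TTACGGTACTTGCAT",
    "GGTACTTGCATAGCT",
    "GATTACGGTACTTGCA",
    "CGGTACTTGCATAG",
    "TACGGTACTTGCATA",
]

depth = depth_per_position(reference, reads)

if __name__ == "__main__":
    print(depth)
    print("maximum depth:", max(depth))
    print("mean depth:", round(sum(depth) / len(depth), 2))
```

The middle of the reference reaches a depth of 5, which matches the "5x coverage" label, while the ends are covered by fewer reads.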
Variant Calling • reads: ATTACGGTGCTTGCA, CGGTGCTTGCATAGC, GATTACGGTGCT-GCATAGCT, TTACGGTGCTTGCAT, GGTGCTTGCATAGCT, GATTACGGTGCTTGCA, CGGTGCTTGCATAG, TACGGTGCTTGCATA • reference: GATTACGGTACTTGCATAGCT • G = homozygous alternative
Variant Calling • reads: ATTACGGTGCTTGCA, CGGTGCTTGCATAGC, GATTACGGTACT-GCATAGCT, TTACGGTACTTGCAT, GGTGCTTGCATAGCT, GATTACGGTACTTGCA, CGGTGCTTGCATAG, TACGGTGCTTGCATA • reference: GATTACGGTACTTGCATAGCT • A/G = heterozygous
Variant Calling • reads: GATTACGGTACTTGCA, CGGTGCTTGCATAG, TACGGTGCTTGCATA • reference: GATTACGGTACTTGCATAGCT • A/G = heterozygous?
Variant Calling • sequencing quality: poor / good • reads: GATTACGGTACTTGCA, CGGTGCTTGCATAG, TACGGTGCTTGCATA • reference: GATTACGGTACTTGCATAGCT • G
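The variant calling slides boil down to counting which bases the reads show at one reference position. A rough sketch of that counting, using the ungapped reads from the heterozygous example with start offsets worked out by hand. The thresholds `min_depth`, `het_low`, and `het_high` are illustrative assumptions, not values from the slides; real variant callers also weigh base and mapping quality.

```python
from collections import Counter


def pileup(aligned_reads, position):
    """Count the bases that (start, read) pairs show at a 0-based position."""
    counts = Counter()
    for start, read in aligned_reads:
        offset = position - start
        if 0 <= offset < len(read):
            counts[read[offset]] += 1
    return counts


def call_genotype(counts, ref_base, min_depth=5, het_low=0.2, het_high=0.8):
    """Call a genotype from allele counts using simple fraction thresholds."""
    depth = sum(counts.values())
    if depth < min_depth:
        return f"uncertain (depth {depth})"
    alt_counts = {base: n for base, n in counts.items() if base != ref_base}
    if not alt_counts:
        return f"homozygous reference {ref_base}"
    alt_base = max(alt_counts, key=alt_counts.get)
    alt_fraction = alt_counts[alt_base] / depth
    if alt_fraction < het_low:
        return f"homozygous reference {ref_base}"
    if alt_fraction > het_high:
        return f"homozygous alternative {alt_base}"
    return f"heterozygous {ref_base}/{alt_base}"


reference = "GATTACGGTACTTGCATAGCT"
variant_position = 9  # 0-based; the reference base here is A

deep_reads = [
    (1, "ATTACGGTGCTTGCA"), (5, "CGGTGCTTGCATAGC"), (2, "TTACGGTACTTGCAT"),
    (6, "GGTGCTTGCATAGCT"), (0, "GATTACGGTACTTGCA"), (5, "CGGTGCTTGCATAG"),
    (3, "TACGGTGCTTGCATA"),
]
shallow_reads = [
    (0, "GATTACGGTACTTGCA"), (5, "CGGTGCTTGCATAG"), (3, "TACGGTGCTTGCATA"),
]

deep_call = call_genotype(pileup(deep_reads, variant_position),
                          reference[variant_position])
shallow_call = call_genotype(pileup(shallow_reads, variant_position),
                             reference[variant_position])

if __name__ == "__main__":
    print(deep_call)
    print(shallow_call)
```

With seven reads the A/G mix is called heterozygous; with only three reads the same rule declines to call, which mirrors the question mark on the low-coverage slide.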
Illumina: Normal flow cell technology
Instrument | MiniSeq | MiSeq | NextSeq500 | HiSeq2500
Read length | 2 x 150 b | 2 x 300 b | 2 x 150 b | 2 x 125 b
Output | 6.6 Gb | 13 Gb | 100 Gb | 450/900 Gb
Clusters | 22M | 22M | 0.4B | 2B/4B
Run time | 1 day | 3 days | 1 day | 6 days
Instrument prices (in slide order): 100k €, 250k €, 50k $, 700k $
Cost per whole genome (WG): 4250 $/WG, 3500 $/WG
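The output figures on this slide are roughly what you get from clusters x 2 reads x read length. A back-of-the-envelope sketch, assuming every cluster yields a full-length read pair (real runs lose some clusters to filtering) and a 3 Gb human genome as in the Sanger landmarks; both function names are made up for this illustration.

```python
def paired_end_yield_gb(clusters, read_length):
    """Approximate run output in Gb: clusters x 2 reads x bases per read."""
    return clusters * 2 * read_length / 1e9


def mean_coverage(yield_gb, genome_size_gb=3.0):
    """Average depth when the whole output is spread evenly over one genome."""
    return yield_gb / genome_size_gb


if __name__ == "__main__":
    print("MiniSeq, 22M clusters, 2 x 150 b:", paired_end_yield_gb(22e6, 150), "Gb")
    print("MiSeq, 22M clusters, 2 x 300 b:", paired_end_yield_gb(22e6, 300), "Gb")
    print("Coverage of one human genome from 100 Gb:",
          round(mean_coverage(100), 1), "x")
```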
Illumina: Patterned flow cell technology
Instrument | HiSeq4000 | HiSeqX Five | HiSeqX Ten | NovaSeq6000
Read length | 2 x 150 b | 2 x 150 b | 2 x 150 b | 2 x 150 b
Output | 0.65/1.3 Tb | 0.8/1.6 Tb | 0.8/1.6 Tb | 0.85/1.7 Tb
Clusters | 2/4 B | 2.5/5 B | 2.5/5 B | 2.8/5.6 B
Run time | 4 days | 3 days | 3 days | 2 days
Cost per whole genome (WG) | 2500 $/WG | 1500 $/WG | 1000 $/WG | 1200 $/WG
Instrument prices (in slide order): 10 x 1M €, 1M €, 900k $, 5 x 1.2M $
Illumina: Patterned flow cell technology • Patterned flow cell • Billions of nanowells • Extremely high density • No overlapping clusters • Special polymerase? • ExAmp clustering • primer swaps
Next Generation: Roche 454
Roche 454 • fragment DNA • clonal amplification on bead by emPCR • load beads in PicoTiterPlate • sequencing-by-synthesis
Ion Torrent
Ion Torrent • fragment DNA • clonal amplification on bead by emPCR • load beads on chip • sequencing-by-synthesis
Third generation sequencing = single molecule sequencing
Third Generation: PacBio (last week's update: bought by Illumina) • instruments: RS, Sequel
SMRT technology • Library prep • Circular DNA • SMRT cell
PacBio • no DNA amplification • real-time imaging of DNA polymerase • sequencing-by-synthesis
SMRT technology • >10 kb reads • 1 Gb output • Better chemistry • De novo assembly • Haplotyping • Variant calling (posted February 10, 2014, The Genomics Resource Center, University of Maryland, http://www.igs.umaryland.edu)
Oxford Nanopore
Oxford Nanopore • 6 bases in pore • 6x base calling • Caller development • Community • example sequence: ACCCGTCCG
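One way to read "6 bases in pore, 6x base calling" is that the signal at any moment reflects a window of six bases, so each base away from the ends is seen in six overlapping windows. That reading is an assumption of this sketch, which simply lists the 6-base windows of the example sequence; the helper names are made up.

```python
def kmers(sequence, k=6):
    """All overlapping windows of k bases, in order along the sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmers_covering(sequence, index, k=6):
    """The windows that contain the base at a 0-based index."""
    return [kmer for start, kmer in enumerate(kmers(sequence, k))
            if start <= index < start + k]


if __name__ == "__main__":
    print(kmers("ACCCGTCCG"))
    print("possible 6-base windows:", 4 ** 6)
    # A base far from both ends of a longer sequence sits in six windows.
    print(len(kmers_covering("ACGT" * 5, 10)))
```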
Oxford Nanopore High error rate, but major improvement in 2017…