Global patterns of copy number variation in humans from a population-based analysis.
ICHG Kyoto
Jean Monlong, April 5, 2016
Bourque Lab, McGill University, Human Genetics Dept.
Disclosure Information: I have no financial relationships
Copy-Number Variation
Baker 2012, Nature Methods.
2006).
Genomics 2008).
Genetics 2016).
Nature 2007).
PopSV approach
[Figure: number of reads mapped per genomic window; samples labeled reference and tested]
CNV patterns in normal genomes
[Figure: cumulative proportion of regions by distance to centromere/telomere/gap (bp); regions: CNV, control]
Conclusion
[Figure, four panels: proportion of regions with concordant samples for PopSV, by coverage class (low, expected, high), GC content, segmental duplication proportion, and simple repeat proportion (the last three binned as [0,0.2], (0.2,0.4], (0.4,0.6], (0.6,0.8], (0.8,1]); sets: call, null]
[Figure: PopSV results per sample in the twin families (mother, father, twin 1 and twin 2 of each family); samples marked by family: 1207, 1286, 1301, 1323, 1389, 1443, 1480, 1490, 1652; scale from 0.2 to 0.8]
[Figure: proportion of 500 bp-bin calls overlapping 5 kbp-bin calls, by size of the 500 bp-bin call (2,500 to 20,000)]
[Figure: proportion of regions overlapping each feature (feature labels partly garbled in extraction; they include segdup); regions: CNV, control]
QC: SD, low-coverage and CTG distance control
[Figure: cumulative proportion of regions by distance to centromere/telomere/gap (bp); regions: CNV, control]
QC: SD, low-coverage and CTG distance control
[Figure: a random region of size S, spanning S/2 on each side of a random base (shown in green)]
[Figure: depletion or enrichment significance (-log10 P-value, scale 4 to 12) by cohort (Twins, CK Normal, GoNL, CK Somatic) for TE classes (DNA, LINE, LTR, SVA, SINE) and top TE families (AluY, HERVH-int, L1HS, L1PA2, L1PA3, L1PA4, L1PA5, LTR38-int, LTR4, MER65A, SVA_D, SVA_E, SVA_F)]
[Figure: depletion or enrichment significance (-log10 P-value, scale 4 to 12) by cohort (Twins, CK Normal, GoNL, CK Somatic) for satellites and STRs]
Satellite: (CATTC)n, (GAATG)n, ACRO1, ALR/Alpha, BSR/Beta, CER, D20S16, GSAT, GSATII, GSATX, HSAT4, HSAT5, HSAT6, HSATI, HSATII, LSAU, MSR1, REP522, SAR, SATR1, SATR2, SST1, SUBTEL_sa, TAR1
STR: AAAG, AAGA, AAGG, AGAA, AGAT, AGGA, AT, ATAG, ATCT, CAG, CCTT, CGC, CTG, CTTC, CTTT, GAAA, GAAG, GATA, GCG, GGAA, TA, TAGA, TATC, TCCT, TCTA, TCTT, TTCC, TTCT, TTTC
[Figure: cumulative proportion of regions by distance to centromere/telomere/gap (bp); regions: CNV, control]