

SLIDE 1

CENTER FOR DATA SCIENCE AND BIG DATA ANALYTICS

Strength in Numbers

Earth, genomes, and time: a big data approach to integrative evolutionary histories

Fabia U. Battistuzzi
Biological Sciences
battistu@oakland.edu

Dec. 1, 2016
SLIDE 2

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky

Evolution: Past, Present, Future

  • Past history of life is a predictor of current and future changes

  • Medical field
  • Climate science
  • Astrobiology
  • Conservation biology
  • Sustainable energy
SLIDE 3

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky
  • Genomes are repositories of billions of data points (DNA bases)
  • Human genome: 3 billion DNA base pairs
  • 7 billion individuals on Earth
  • 2.1e+19 base pairs in total
  • Species estimates: 10 million to 1 trillion
  • Many genomes are much smaller than ours (< 1 million base pairs)
  • Many are much larger than ours (up to ~150 billion base pairs)
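
The 2.1e+19 figure appears to be the product of the two numbers listed before it. A minimal sketch of that arithmetic, using only the rounded values from the slide (the variable names are illustrative):

```python
# Back-of-the-envelope check of the slide's figures (rounded values from the slide).
HUMAN_GENOME_BP = 3_000_000_000    # ~3 billion DNA base pairs per human genome
WORLD_POPULATION = 7_000_000_000   # ~7 billion individuals on Earth

# Total base pairs if every individual's genome were counted once.
total_bp = HUMAN_GENOME_BP * WORLD_POPULATION

print(f"{total_bp:.1e} base pairs")  # prints "2.1e+19 base pairs"
```

This counts one genome copy per person; it ignores that each body carries many cells and that the genomes of different individuals are largely identical.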

SLIDE 4

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky
SLIDE 5

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky
  • Comparative genomics
  • How and where did life originate?
  • And where should we look for other life (Astrobiology)?

SLIDE 6

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky
  • Comparative genomics
  • How did life survive on Earth through major environmental changes?
  • Microbes are the longest-living lineages on Earth (~4 billion years)
  • They survived and thrived during planetary-scale climate changes (conservation and sustainability)

SLIDE 7

Nothing in biology makes sense except in the light of evolution

  • T. Dobzhansky
  • Comparative genomics
  • How do pathogens escape our immune system and drugs?
  • What changes at the genomic level allow them to adapt?

  • How do pathogens arise and spread?
SLIDE 8

Evolution in the Blab

  • Early life evolution
  • How to accurately reconstruct the evolutionary histories of microbes on Earth
  • Conditions for life to thrive
  • Adaptations that sustained microbial life through climate changes

  • Rate of species diversification
SLIDE 9

Evolution in the Blab

  • Evolution of malaria
  • How does it adapt to humans and other hosts?
  • Are genomes evolving differently depending on the host?
  • Are genes involved in antimalarial drug resistance evolving faster?

[Figure labels: AT-rich, AT-balance]
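
The figure labels suggest that genomes are being compared by base composition. As a rough illustration only, the sketch below computes the fraction of A and T bases in a sequence. The function name `at_content` and both toy sequences are hypothetical, and the slide does not say which measure the lab actually uses.

```python
def at_content(seq: str) -> float:
    """Return the fraction of A/T bases among the unambiguous A, C, G, T bases."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return at / total


# Hypothetical toy sequences, not real genomic data.
at_rich_seq = "ATATTTAAATGCATATAAAT"
balanced_seq = "ATGCATGCGCTAATGCCGTA"

print(at_content(at_rich_seq))   # 18 of 20 bases are A or T
print(at_content(balanced_seq))  # 10 of 20 bases are A or T
```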

SLIDE 10

Evolution in the Blab

  • Student involvement
  • Undergrads & grads (current size: 11+1 students)
  • Skills
  • Scripting/programming
  • Phylogenetics
  • Comparative genomics
  • Data mining
  • Statistics
SLIDE 11

New opportunities with CDaS

  • Explore new statistical applications for Big Data
  • Systematic bias
  • False discovery rates

Kumar et al 2012
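
The slide does not name a specific method for false discovery rates. As one widely used example, the sketch below applies the Benjamini-Hochberg procedure to a list of p-values. The function name, the `alpha` default, and the p-values are illustrative assumptions, not taken from the talk or from Kumar et al. 2012.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where a hypothesis is rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    # Reject the hypotheses with the max_k smallest p-values.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. one test per gene.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p))  # only the first two tests pass at alpha = 0.05
```

With thousands of genes tested at once, a per-test threshold of 0.05 would let many false positives through, which is why a correction of this kind matters at genome scale.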

SLIDE 12

New opportunities with CDaS

  • Connect comparative and functional genomics
  • Text mining of functional databases
  • Integration of multiple databases
SLIDE 13

New opportunities with CDaS

  • Explore new strategies to gain computational support
  • On-site high-performance computational cluster (working on it…)
  • Cloud-based and off-site clusters (exploring options…)
  • Programming, database architecture
SLIDE 14

Contact info

battistu@oakland.edu
340 Dodge Hall
Biological Sciences
Oakland University