LSST: Petascale opportunities and challenges
Tony Tyson, University of California, Davis
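Relative data volume from survey telescopes & cameras
[Figure: relative data volume of survey telescopes and cameras versus etendue (m² deg²), log axis from 1 to 1000; plot labels "Max" and "Survey"]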
The new sky
Probing Dark Matter and Dark Energy
Mapping the Milky Way
Finding Near Earth Asteroids
Data volumes & rates are unprecedented in astronomy
Estimated Nightly Data Volume
[Figure: raw and catalog data volume per night in GB (axis from 5000 to 20000) for LSST, Pan-STARRS 4, and SDSS]
LSST will make tens of [...] photometric observations of tens of billions of [...]
NSF Review, December 15-17, 2009, Tucson, AZ
DM System is widely distributed
Base Site: Base Center, co-located Data Access Center (DAC)
Archive Site: Archive Center, co-located Data Access Center (DAC)
Headquarters Site: Systems Operations Center (SOC), Education and Public Outreach Center (EPOC)
Site: location/space that hosts DM centers
dedicated, protected fiber
Center: capability hosted at a Site
DM System relies on large-scale computational parallelism
pipeline processing is “embarrassingly parallel”
– 3024 parallel image readouts
– O(10^8) sky tiles
– O(10^9) objects
are well matched to the available parallelism
– 5000 cores at Base
– 12000 (yr1) to 33000 (yr10) cores at Archive
flexible pipeline/production model
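The bullets above describe work that splits into independent units (image readouts, sky tiles, objects). A minimal Python sketch of that pattern follows. It is illustrative only and is not LSST pipeline code: the 189 x 16 split behind the 3024 readouts, the synthetic pixel values, the median bias subtraction, and the names `make_readout`, `subtract_bias`, and `process_exposure` are all assumptions. A thread pool keeps the example self-contained; a real system would spread the same map over processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median

# Assumption (not stated on the slide): 189 sensors x 16 amplifiers = 3024 readouts.
N_SENSORS = 189
AMPS_PER_SENSOR = 16


def make_readout(sensor, amp, n_pixels=64):
    """Build a small synthetic readout: a bias level plus a deterministic ripple."""
    bias = 1000 + 10 * amp + sensor
    return [bias + ((i * 7 + amp) % 5) for i in range(n_pixels)]


def subtract_bias(task):
    """Process one readout with no reference to any other readout."""
    sensor, amp = task
    pixels = make_readout(sensor, amp)
    level = median(pixels)
    return (sensor, amp), [p - level for p in pixels]


def process_exposure(workers=8):
    """Map the same function over every readout; order and placement do not matter."""
    tasks = [(s, a) for s in range(N_SENSORS) for a in range(AMPS_PER_SENSOR)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(subtract_bias, tasks))


results = process_exposure()
print(len(results), "readouts processed independently")
```

Because no task reads another task's output, the worker count can change without touching `subtract_bias`, which is the property that makes such workloads scale with the number of cores.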
DATA PRODUCTS
IMAGE SIMULATIONS
Base Catalog (All Sky Database): Solar System, Cosmology, Defects, Milky Way, Extended Sources, Transients
Instance Catalog Generation: generate the seed catalog as required for simulation. Includes:
– Metadata
– Size
– Position
– Type
– Variability
– Color
– Brightness
– Proper motion
Source Image Generation: introduce shear parameter from cosmology metadata; generate per FOV
Photon Propagation: Atmosphere, Telescope, Camera, Defects, Formatting; generate per Sensor
Other boxes in the diagram: Operation Simulation, DM Pipelines, DM database load simulation, Calibration Simulation
LSST Sample Images and Catalogs
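As a rough illustration of the instance-catalog step, the sketch below builds a small seed catalog carrying the attributes listed on the slide and applies a shear to galaxy shapes. Everything specific here is an assumption for illustration: the field names, the value ranges, the field center and width, the shear value, and the functions `apply_shear` and `seed_catalog`. The shear step uses the standard reduced-shear relation for complex ellipticities, which the slide does not spell out.

```python
import random
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One seed-catalog row; field names are hypothetical."""
    object_id: int
    object_type: str             # Type
    ra_deg: float                # Position
    dec_deg: float
    size_arcsec: float           # Size
    magnitude: float             # Brightness
    color_index: float           # Color
    proper_motion_mas_yr: float  # Proper motion
    is_variable: bool            # Variability
    ellipticity: complex         # sheared shape (0 for point sources)


def apply_shear(intrinsic: complex, g: complex) -> complex:
    """Reduced-shear transformation of a complex ellipticity, valid for |g| < 1."""
    return (intrinsic + g) / (1 + g.conjugate() * intrinsic)


def seed_catalog(n, center=(150.0, 2.0), half_width_deg=1.75,
                 shear=0.02 + 0.01j, seed=42):
    """Generate n entries in a square patch (flat-sky approximation)."""
    rng = random.Random(seed)  # fixed seed so the catalog is reproducible
    ra0, dec0 = center
    entries = []
    for i in range(n):
        kind = rng.choice(["star", "galaxy"])
        if kind == "galaxy":
            intrinsic = complex(rng.uniform(-0.3, 0.3), rng.uniform(-0.3, 0.3))
            shape, size, pm = apply_shear(intrinsic, shear), rng.uniform(0.5, 3.0), 0.0
        else:
            shape, size, pm = 0j, 0.0, rng.uniform(0.0, 20.0)
        entries.append(CatalogEntry(
            object_id=i,
            object_type=kind,
            ra_deg=ra0 + rng.uniform(-half_width_deg, half_width_deg),
            dec_deg=dec0 + rng.uniform(-half_width_deg, half_width_deg),
            size_arcsec=size,
            magnitude=rng.uniform(16.0, 27.0),
            color_index=rng.uniform(-0.5, 2.0),
            proper_motion_mas_yr=pm,
            is_variable=rng.random() < 0.05,
            ellipticity=shape,
        ))
    return entries


catalog = seed_catalog(1000)
print(sum(e.object_type == "galaxy" for e in catalog), "galaxies with sheared shapes")
```

In a fuller simulation the shear would vary with position and redshift rather than being a single constant; a constant is used here only to keep the sketch short.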
Full end-to-end simulations
The Data Challenge
that must be mined in real time.
monitored for important variations in real time.
developed for knowledge extraction in real time.
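One simple way to picture monitoring for variations in real time is a streaming check that keeps running statistics per source and flags a measurement that departs strongly from them. The sketch below uses Welford's online mean and variance update; the class name `FluxMonitor`, the 5-sigma threshold, the minimum sample count, and the flux values are assumptions chosen for illustration, not anything specified in the talk.

```python
import math


class FluxMonitor:
    """Streaming outlier check for one source, using Welford's online statistics."""

    def __init__(self, threshold_sigma=5.0, min_samples=5):
        self.threshold_sigma = threshold_sigma
        self.min_samples = min_samples
        self.n = 0        # number of accepted measurements
        self.mean = 0.0   # running mean of accepted measurements
        self.m2 = 0.0     # running sum of squared deviations

    def update(self, flux):
        """Return True if flux deviates strongly from the history so far."""
        flagged = False
        if self.n >= self.min_samples:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(flux - self.mean) > self.threshold_sigma * std:
                flagged = True
        if not flagged:
            # Only non-flagged points update the baseline, so one flare
            # does not inflate the variance used for later checks.
            self.n += 1
            delta = flux - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (flux - self.mean)
        return flagged


# Hypothetical flux sequence for a single source, with one sudden brightening.
fluxes = [100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 150.0, 99.7]
monitor = FluxMonitor()
flagged = [i for i, f in enumerate(fluxes) if monitor.update(f)]
print("flagged measurement indices:", flagged)
```

Each update touches only a few numbers per source, so memory and time per measurement stay constant as the stream grows.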
LSST
LSST
Analytics
Science at the Limit
Much of the breakthrough science using surveys (imaging or spectroscopy) has
Sample incompleteness
Systematic errors
(clustering)
(classification)
(outlier detection)
Benefits of very large data sets:
Tom Vestrand
Finding correlations and “fundamental planes”
Dimensionality !
– Are there combinations (linear or non-linear functions) of observational parameters that correlate strongly with one another?
– Are there eigenvectors or condensed representations (e.g., basis sets) that represent the full set of properties?
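The two questions above can be sketched in a few lines: compute pairwise correlations between parameters, then look for a dominant eigenvector of the correlation matrix. The example below is a toy under stated assumptions: the three synthetic parameters (the third is built as a noisy linear combination of the first two, mimicking a planar relation), and the helper names `pearson`, `correlation_matrix`, and `leading_eigenpair` are illustrative, and power iteration stands in for a full eigen-decomposition.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation between every pair of parameter columns."""
    return [[pearson(a, b) for b in columns] for a in columns]


def leading_eigenpair(matrix, iterations=200):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))  # Rayleigh quotient
    return eigenvalue, v


# Synthetic "observational parameters": c is tied to a and b by a noisy plane.
rng = random.Random(0)
a = [rng.gauss(0.0, 1.0) for _ in range(500)]
b = [rng.gauss(0.0, 1.0) for _ in range(500)]
c = [0.6 * x + 0.8 * y + rng.gauss(0.0, 0.05) for x, y in zip(a, b)]

corr = correlation_matrix([a, b, c])
eigenvalue, vector = leading_eigenpair(corr)
print("fraction of variance in the leading component:", eigenvalue / len(corr))
```

When one component carries most of the variance, or another carries almost none, the parameters lie close to a lower-dimensional surface, which is the kind of condensed representation the second question asks about.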
This is also required for automated Data Quality Assessment
How To Learn More / Get Involved?
LSST database trac at http://dev.lsstcorp.org/trac/wiki/LSSTDatabase
SLAC)
http://www-conf.slac.stanford.edu/xldb
Share your use cases, join the community
Open conference starting this year 1st public release