HPC/Blue Waters Role in the Dark Energy Survey Data Management

SLIDE 1

HPC/Blue Waters’ Role in the Dark Energy Survey Data Management

Don Petravick Senior Project Manager National Center for Supercomputing Applications

SLIDE 2

BW summary

  • Incorporated BW into an overall Data Management System.
  • Completed a crucial Weak Lensing calculation in 2 weeks on BW, where the alternative for us was 6 months.
  • Uses BW at a lesser level for other purposes in the system.
  • Includes making BW ready for the crucial calculation.

6/5/19 DLP DES and Blue Waters 2

SLIDE 3

What is the Dark Energy Survey

  • Goal: Constrain the characterization of Dark Energy using 4 probes:
    • Galaxy Clustering
    • Weak Lensing
    • Large Scale Structure
    • Supernovae
  • Plan: Two surveys:
    • Wide Field Survey in grizY, 5000 deg²
    • SNE survey in griz, 30 deg²
    • Over 5.5 years.
  • Instrumentation:
    • 4 m Blanco Telescope, CTIO.
    • DECam: 512 megapixels, 3 deg²

512 MP DECam during its fabrication at Fermilab

SLIDE 4

Who is the Dark Energy Survey

More than 400 scientists from the U.S. Department of Energy, the United Kingdom, Spain, Brazil, Germany, and Switzerland.

[Diagram labels: NCSA; Observation; Data Production; Knowledge]

  • DESDM Group: research scientists, operations staff.
  • Technical services from overall NCSA staff.
  • Pipeline contributions from many in the collaboration.
  • DES: rotating DES observing teams.
  • FNAL: DECam support.
  • CTIO site: telescope and instrument support.

SLIDE 5

DESDM High-Level Pipeline Architecture

[Figure: high-level overview of the DESDM pipelines; visible labels include "Prompt" and "Batch". Credit: Eric Morganson]

SLIDE 6

Technical Services Architecture

[Architecture diagram. Components: NCSA Storage Condo, Oracle RAC, Blue Waters, Illinois Campus Cluster, Fermigrid, Collaboration Access Services]

Offline Processing – Campaign; Goal: Throughput (ongoing)

  • All the rest

Nightly Processing – Goal: Availability (now done)

  • SNE processing
  • First Cut

SLIDE 7

DESDM Job Management - Common pattern

[Flow diagram. Labels: Campaign Nanny; Submit Glide-in Job for Many Nodes; A Free Node?; Node runs a pipeline instance; Pipeline Nanny for One Pipeline Segment; Files and DB Tables; Batch Job Ends?]
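
The labels on this slide suggest a glide-in (pilot job) pattern: one batch job holds many nodes, and each free node picks up a pipeline instance until the batch job ends. The sketch below is a toy simulation of that loop under stated assumptions. It does not use the HTCondor API, and the names `GlideIn` and `run_campaign`, the time-step model, and the example durations are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class GlideIn:
    """Hypothetical pilot job: holds `nodes` nodes for `lifetime` time steps."""
    nodes: int
    lifetime: int


def run_campaign(pipeline_instances, glide_in):
    """Hand queued pipeline instances to free nodes until the batch job ends.

    Each instance is a (name, duration_in_steps) pair with duration >= 1.
    Returns (finished, unfinished) lists of instance names.
    """
    queue = deque(pipeline_instances)
    busy = {}        # node id -> (instance name, remaining steps)
    finished = []
    for _ in range(glide_in.lifetime):          # loop until the batch job ends
        for node in range(glide_in.nodes):
            if node not in busy and queue:      # a free node takes new work
                busy[node] = queue.popleft()
        for node in list(busy):                 # advance every busy node one step
            name, remaining = busy[node]
            if remaining <= 1:
                finished.append(name)
                del busy[node]
            else:
                busy[node] = (name, remaining - 1)
    # Work still running or still queued when the batch job ends.
    unfinished = [name for name, _ in busy.values()] + [name for name, _ in queue]
    return finished, unfinished


if __name__ == "__main__":
    work = [("exp-1", 2), ("exp-2", 1), ("exp-3", 2), ("exp-4", 3)]
    done, left = run_campaign(work, GlideIn(nodes=2, lifetime=4))
    print("finished:", done)
    print("unfinished:", left)
```

In this toy run, three instances finish and one is still running when the batch job ends. A real framework would need to resubmit or resume that instance in a later glide-in; how DESDM handles that case is not stated on the slide.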

SLIDE 8

BW integration topics.

Goal – Satisfy needs at a scale beyond Illinois Campus Cluster and Fermigrid with minimal framework differences.

The primary challenges:

  • The large number of outbound connections DESDM jobs make, due to:
    • Condor framework
    • DB integration (upload detected objects, general status).
  • Many small jobs – trivially parallel at scales of 1000-2000.
  • File system load – community code integrations – “hostile” to framework.
    • Pipeline modules use file system for inputs and outputs.
    • Many supplemental files.
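
To make the "many small jobs" and "file system for inputs and outputs" points concrete, here is a minimal sketch of trivially parallel jobs that each read one input file and write one output file. It is an illustration only: `run_module` and `run_all` are hypothetical names, the text-uppercasing step stands in for real image processing, and a thread pool on one machine stands in for the 1000-2000 independent batch jobs the slide describes.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_module(input_path: Path, output_dir: Path) -> Path:
    """Stand-in for one pipeline module: read an input file, write an output file."""
    text = input_path.read_text()
    output_path = output_dir / f"{input_path.stem}.out"
    output_path.write_text(text.upper())    # placeholder for real processing
    return output_path


def run_all(input_paths, output_dir: Path, max_workers: int = 8):
    """Run independent jobs concurrently; no job depends on another's output."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda p: run_module(p, output_dir), input_paths))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        inputs = []
        for i in range(20):
            path = work / f"exposure_{i:04d}.txt"
            path.write_text(f"pixels for exposure {i}")
            inputs.append(path)
        outputs = run_all(inputs, work)
        print(len(outputs), "output files written")
```

Because every job touches the shared file system at least twice (one read, one write), file and metadata operations grow in proportion to the number of concurrent jobs, which is one plausible reading of the "file system load" challenge above.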

SLIDE 9

Single Epoch to Science-Ready Images

False-color images depicting a raw image (defects exaggerated) and a processed image. Modified from the original by Felipe Menanteau.

SLIDE 10

Difficulties of Weak Lensing

Nature of the weak lensing signal from one galaxy. Credit: Felipe Menanteau

Not shown are instrumental effects, such as variation of the PSF over the focal plane. These need to be characterized and accounted for in the weak lensing codes.

[Figure: an example of strong lensing]

  • The process of co-addition degrades the weak lensing signal present in the data.
  • Weak lensing codes consider all the individual images simultaneously, guided by co-added detection images.
  • DES weak lensing codes are the state of the art.

SLIDE 11

BW and DESDM

BW capacity is crucial for DES weak lensing processing: it provides the large amount of computing resources needed because of the intrinsic difficulty of the method and the state of the art of these codes.

  • Achievement:
    • Production run was 2 weeks
    • 6 months estimated on other infrastructure available to DESDM.
    • Usage: ~3 million core hours
    • Codes: Multi-object fitting
    • Observations included: Science Verification through Year 3.
Other uses:

  • Usage: 1 million core hours for other DESDM data products
  • Codes: single epoch and co-addition
  • Observations: Varied, general resource complementing:
    • Illinois Campus Cluster
    • Fermigrid

SLIDE 12

Other uses of BW by DESDM

Recall that BW is integrated into an overall system that can use many bulk computing resources. BW is also used when:

  • DESDM has many campaigns
  • Other compute resources are unavailable (maintenance, upgrades)
  • Summary:
    • Usage: 1 million BW core hours for other DESDM data products
    • Codes: single epoch and co-addition
    • Observations: Up to and including Year 5.5.

SLIDE 13

HTC, HPC, and Cloud Native Style Elements in DESDM

[Architecture diagram. Components: NCSA Storage Condo, Oracle RAC, Blue Waters, Illinois Campus Cluster, Fermigrid, Collaboration Access Services]

Offline Processing – Campaign; Goal: Throughput (ongoing)

  • All the rest

Nightly Processing – Goal: Availability (now done)

  • SNE processing
  • First Cut

SLIDE 14

The storage system is the technical basis for co-existence of the HPC, HTC, and cloud-native cultures.

  • In the opening talk, the speaker mentioned that:
    • HPC people publish and talk to each other
    • AI/Cloud Native infrastructure people meet and talk to each other
    • But the two groups hardly interact.
  • In DESDM the data is in a neutral storage system primarily accessed by services:
    • GPFS POSIX file system (3.5 PB)
    • A VM infrastructure with excellent access to the storage resources, an integral part of the storage condo.
    • A large relational database (500 TB usable table space)
  • The neutral storage system is the technical basis for co-existence of the HPC, HTC, and cloud-native cultures.

SLIDE 15

DES Labs: Collection of containerized tools for DES access

  • Used by the DES Collaboration and general public.
  • Over 1000 users
  • Running at NCSA using Kubernetes and NCSA cloud
  • Data access, exploration and visualization
  • AI models for anomaly detection and similarity search

SLIDE 16

NCSA DESaccess: Deployment

SLIDE 17

Summary

  • BW was a crucial resource for DESDM’s most cycle-intensive processing needs.
  • BW also plays a role for ordinary processing in DES.
    • DES has ~8,000,000 CCD-level images.
    • BW was used to process over 1,000,000 DECam images for non-DES processing at NCSA, in other BW allocations.
  • BW was able to integrate into a processing framework more like those High Energy Physics experiments use:
    • Based on HTCondor
    • Extensive transfers of data into and out of BW.
  • BW support staff have been excellent.
