The IceCube data pipeline: from the South Pole to publication
Jakob van Santen jakob.van.santen@desy.de PyData Berlin, 2016-05-21
Deutsches Elektronen-Synchrotron (DESY) Zeuthen
Jakob van Santen - The IceCube data pipeline - jakob.van.santen@desy.de
Helmholtz research institute with ~200 scientists, postdocs, and students studying high-energy astrophysics with gamma rays and neutrinos Kosmos
What’s a neutrino? Why look for them at the South Pole? What are we trying to learn? How does IceCube find neutrinos?
Charged (electromagnetic interactions) Neutral (weak interactions)
2.5e6 times less massive
Image: Wikipedia
Radioactive decay Nuclear reactors
Image: chemistryviews.com
The Sun
Image: N. Svoboda
Man-made particle accelerators
Image: CERN
Cosmic accelerators
~10^6 eV ~10^9 eV ~10^15 eV Higher energy
What’s a neutrino? Why look for them at the South Pole? What are we trying to learn? How does IceCube find neutrinos?
[Figure: cosmic-ray energy spectrum, E^2.6 F(E) [GeV^1.6 m^-2 s^-1 sr^-1] (1 to 10^4) versus E [eV] (10^13 to 10^20). Data: Grigorov, JACEE, MGU, Tien-Shan, Tibet07, Akeno, CASA-MIA, HEGRA, Fly’s Eye, Kascade, Kascade Grande, IceTop-73, HiRes 1, HiRes 2, Telescope Array, Auger. Labeled features: Knee, 2nd Knee, Ankle.]
PRD 86: 010001 (2013)
Something accelerates nuclei to macroscopic energies…
[Figure: panels labeled IceCube-59 and Tibet-III; energy labels 5 TeV and 20 TeV]
Abbasi et al., ApJ, 746, 33, 2012; Amenomori et al., ICRC 2011
…but we don’t know what, or where!
1 Joule
Neutrinos can point back to the cosmic accelerators!
What’s a neutrino? Why look for them at the South Pole? What are we trying to learn? How does IceCube find neutrinos?
Image: NASA Image: USAF
~2800 m of pure, clear ice
Photo: Haley Buffman/NSF
Main station IceCube Lab
IceCube Lab (data center) Digital Optical Module (single-pixel camera)
What’s a neutrino? Why look for them at the South Pole? What are we trying to learn? How does IceCube find neutrinos?
Data acquisition
Feature calculation & event selection
Simulation
Analysis
South Pole (real time)
Science!
Challenges:
configure & extend data pipeline for many distinct science topics
Color ⇔ time Size ⇔ light intensity Neutrino Muon Interaction
million penetrating muons
neutrino events per year
select them!
10 milliseconds of raw data
Images: boost.org
In [1]: from icecube import icetray, dataio, dataclasses
In [2]: f = dataio.I3File('hese.i3.bz2')
In [3]: print(f.pop_frame(icetray.I3Frame.DAQ))
[ I3Frame (DAQ):
  'CalibrationErrata' [DAQ] ==> I3Vector<OMKey> (137)
  'FilterMask' [DAQ] ==> I3Map<string, I3FilterResult> (749)
  'I3Geometry' [Geometry] ==> I3Geometry (401222)
  'I3TriggerHierarchy' [DAQ] ==> I3Tree<I3Trigger> (616)
  'OfflinePulses' [DAQ] ==> I3Map<OMKey, vector<I3RecoPulse> > (52917)
  'PoleCascadeLinefit' [DAQ] ==> I3Particle (150)
  'PoleMuonLlhFit' [DAQ] ==> I3Particle (150)
  'PoleMuonLlhFitFitParams' [DAQ] ==> I3LogLikelihoodFitParams (68)
  'PoleToIParams' [DAQ] ==> I3TensorOfInertiaFitParams (78)
]
“I3” file is a stream of serialized I3Frames
boost::serialization provides load/save, object versioning, etc.
I3Frame: dictionary of [immutable] C++ objects related to a single event
Flexible! Schema can change from event to event
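The idea of a file as a stream of serialized, read-only dictionaries can be sketched in plain Python. This is only an analogy, not the IceTray implementation: `pickle` stands in for boost::serialization, `MappingProxyType` stands in for the immutable frame, and the names `write_frames`, `read_frames` and the example keys are made up for illustration.

```python
import io
import pickle
from types import MappingProxyType


def write_frames(stream, frames):
    """Append each frame (a dict of picklable objects) to a binary stream."""
    for frame in frames:
        pickle.dump(dict(frame), stream)


def read_frames(stream):
    """Yield frames one at a time as read-only mappings until the stream ends."""
    while True:
        try:
            frame = pickle.load(stream)
        except EOFError:
            return
        yield MappingProxyType(frame)


# Two events with different "schemas": the second carries an extra fit result.
events = [
    {"Pulses": [1.2, 3.4]},
    {"Pulses": [0.7], "LineFit": {"zenith": 1.1}},
]

buffer = io.BytesIO()
write_frames(buffer, events)
buffer.seek(0)
frames = list(read_frames(buffer))
print([sorted(frame) for frame in frames])
```

Because every frame is serialized on its own, nothing forces two consecutive frames to hold the same keys, which is the flexibility the slide points at.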
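[Diagram: Frames passed along a chain of I3Modules]
# imports as used with IceTray releases of the time;
# the project providing HomogenizedQTot must be loaded as well
from I3Tray import I3Tray
from icecube import icetray, dataio

tray = I3Tray()
tray.Add("I3Reader", filenamelist="foo.i3")
tray.Add('HomogenizedQTot', Output='HomogenizedQTot', Pulses='OfflinePulsesHLC')
tray.Add("I3Writer", filename="bar.i3")
tray.Execute()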
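The module chain can be mimicked with plain Python generators to show how frames flow from one stage to the next. This is a sketch under stated assumptions, not IceTray code: `reader`, `total_charge`, `writer` and `execute` are hypothetical names, frames are ordinary dicts, and `total_charge` is a simple sum rather than the HomogenizedQTot algorithm.

```python
from functools import partial


def reader(events):
    """Source stage: emit one frame (a dict) per event."""
    for event in events:
        yield dict(event)


def total_charge(frames, pulses="Pulses", output="QTot"):
    """Add the summed pulse charge to every frame that has the input key."""
    for frame in frames:
        if pulses in frame:
            frame[output] = sum(frame[pulses])
        yield frame


def writer(frames, sink):
    """Sink stage: record each frame, then pass it on unchanged."""
    for frame in frames:
        sink.append(frame)
        yield frame


def execute(source, *modules):
    """Connect the stages in order and drain the resulting stream."""
    stream = source
    for module in modules:
        stream = module(stream)
    for _ in stream:
        pass


sink = []
execute(
    reader([{"Pulses": [1.0, 2.5]}, {"Other": 1}]),
    partial(total_charge, pulses="Pulses", output="QTot"),
    partial(writer, sink=sink),
)
print(sink)
```

Each stage sees one frame at a time, so memory use stays flat no matter how long the input stream is.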
class Counter(icetray.I3ConditionalModule):
    def __init__(self, context):
        super(Counter, self).__init__(context)
        # declare a configurable parameter with a description and a default
        self.AddParameter("Key", "Name of counter to put in the frame", "Count")

    def Configure(self):
        # read the configured value once, before any frames arrive
        self.key = self.GetParameter("Key")
        self.counter = 0

    def Physics(self, frame):
        # store the running count in the frame, then hand the frame downstream
        frame[self.key] = icetray.I3Int(self.counter)
        self.counter += 1
        self.PushFrame(frame)

tray.Add(Counter, Name="CountCount")
Prototype rapidly in Python, rewrite in C++ as needed
IceCube Lab: 3000 events/s, 1 TB/day
Satellite relay: 300 events/s, 100 GB/day
IceCube Data Warehouse (Madison, WI): 4 PB and counting
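The rates quoted above imply an average event size, which a few lines of arithmetic make explicit. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes) and a constant rate over the day; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_event(bytes_per_day, events_per_second):
    """Average event size implied by a daily volume and a steady event rate."""
    return bytes_per_day / (events_per_second * SECONDS_PER_DAY)


raw = bytes_per_event(1e12, 3000)       # 3000 events/s, 1 TB/day
filtered = bytes_per_event(100e9, 300)  # 300 events/s, 100 GB/day

print(f"raw: {raw:.0f} bytes/event, filtered: {filtered:.0f} bytes/event")
print(f"rate reduction: {3000 / 300:.0f}x")
```

Under these assumptions both streams come out at roughly 3.9 kB per event, so the tenfold drop in volume matches the tenfold drop in event rate.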
millions of CPU and GPU hours
academic grids in US and Europe with HTCondor glide-ins, custom Python middleware
hat variant)
CVMFS (HTTP-based read-
I3Frame (event data)
I3ParticleConverter, I3FilterMaskConverter (specific coercion for each object)
I3TableRow, I3TableRow (abstract table row)
HDF5, ROOT (format-specific backend)
partial reads
reading the same data over and
irregular event data into table rows
pytables, pandas, h5py, etc.
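The converter layer described above (frame object, per-type coercion, abstract row, backend) can be sketched in plain Python. Everything here is hypothetical and only mirrors the structure of the diagram: `Particle`, `particle_row`, `filter_mask_row`, `CONVERTERS` and `book` are illustrative names, and a dict of row lists stands in for the HDF5 or ROOT backend.

```python
from collections import defaultdict, namedtuple

# Stand-in for a reconstructed particle stored in a frame.
Particle = namedtuple("Particle", ["zenith", "azimuth", "energy"])


def particle_row(particle):
    """Coerce a Particle into a flat row of named columns."""
    return {
        "zenith": particle.zenith,
        "azimuth": particle.azimuth,
        "energy": particle.energy,
    }


def filter_mask_row(mask):
    """Coerce a {filter name: passed} mapping into one boolean column per filter."""
    return {name: bool(passed) for name, passed in sorted(mask.items())}


# One converter per object type, looked up when a frame is booked.
CONVERTERS = {Particle: particle_row, dict: filter_mask_row}


def book(frames, keys):
    """Build one table (a list of rows) per requested key, skipping absent keys."""
    tables = defaultdict(list)
    for event_id, frame in enumerate(frames):
        for key in keys:
            if key not in frame:
                continue
            convert = CONVERTERS[type(frame[key])]
            row = {"event": event_id}
            row.update(convert(frame[key]))
            tables[key].append(row)
    return dict(tables)


frames = [
    {"LineFit": Particle(0.5, 1.0, 100.0), "FilterMask": {"MuonFilter": True}},
    {"FilterMask": {"MuonFilter": False}},
]
tables = book(frames, ["LineFit", "FilterMask"])
print(tables)
```

Each table is a list of uniform rows keyed by an event index, a shape that tools such as pandas can load directly (for example via `pandas.DataFrame(rows)`), even though the frames themselves were irregular.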
[Figure: events per year (10^0 to 10^9) versus number of collected photons (10^1 to 10^4); curves: Penetrating muons, Pre-selection]
Blindness
Most IceCube analyses use binned data
Pro
straightforward to calculate with Monte Carlo
Con
numpy.histogramdd()-backed histogram
slice, project, etc.
https://github.com/emiddell/dashi
import dashi
import tables
from numpy import linspace

# create & fill 3d histogram
h = dashi.histogram.histogram(3, (linspace(0, 1, 101),)*3)
h.fill(get_3d_data())
# project out dimension 1
h.project([0,2])
# plot a 1-d slice
h[1,1,:].line(differential=True)
# store for later
with tables.open_file('foo.hdf5', 'a') as hdf:
    dashi.histsave(h, hdf, '/', 'my_histogram')
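For readers without dashi installed, the same fill, project and slice operations can be approximated with numpy alone. This is a sketch of the underlying idea, not dashi's API; the random input data and the coarser 10-bin edges are assumptions made to keep the example small.

```python
import numpy as np

# Illustrative input: 1000 random points in the unit cube.
rng = np.random.default_rng(seed=0)
data = rng.random((1000, 3))

# Fill a 3-d histogram with 10 equal-width bins per dimension.
edges = (np.linspace(0, 1, 11),) * 3
counts, _ = np.histogramdd(data, bins=edges)

# Project out dimension 1 by summing over that axis.
projected = counts.sum(axis=1)

# Take a 1-d slice along the last dimension at bin (1, 1).
slice_1d = counts[1, 1, :]

print(counts.shape, projected.shape, slice_1d.shape)
```

A histogram class such as dashi's wraps these arrays together with their bin edges, so that projections and slices keep track of which axes remain.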
[Diagram labels: μ, νμ, Veto]
Simple event selection based
detector volume
28 events survived in 2 years of data
[Figure: showers and tracks plotted by Declination (degrees) and Deposited EM-Equivalent Energy in Detector (TeV, 10^2 to 10^3); marked "IceCube Preliminary"]
Bin data in observable space, compare counts to predicted mean in each bin
Observables: Energy, Zenith angle
Q: What is the chance that the data is a fluctuation of the background?
A: < 5e-7 (discovery!) → doi:10.1126/science.1242856
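Comparing binned counts to predicted means usually comes down to Poisson statistics. The sketch below shows the two basic ingredients, a per-bin Poisson log-likelihood and a single-bin tail probability. It is a generic illustration and not the analysis behind the published result; the counts and background means are invented numbers, and the function names are hypothetical.

```python
import math


def poisson_log_likelihood(counts, means):
    """Sum over bins of the log Poisson probability of each observed count."""
    return sum(
        n * math.log(mu) - mu - math.lgamma(n + 1)
        for n, mu in zip(counts, means)
    )


def poisson_tail(n_observed, mean):
    """P(N >= n_observed) for a Poisson-distributed count with the given mean."""
    below = sum(
        math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))
        for k in range(n_observed)
    )
    return 1.0 - below


# Invented example: observed counts and background-only predictions in 3 bins.
observed = [3, 5, 2]
background = [2.5, 1.0, 0.5]

print(poisson_log_likelihood(observed, background))
print(poisson_tail(observed[1], background[1]))
```

A full analysis would combine all bins into one test statistic and calibrate it with simulated background-only data; for very small probabilities the `1.0 - below` form also loses floating-point precision and would need a more careful evaluation.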
the South Pole.
and analysis tools.
Come to Berlin’s science open house in Adlershof on July 11! http://www.langenachtderwissenschaften.de