Data on OSG
Frank Würthwein OSG Executive Director Professor of Physics UCSD/SDSC
The purpose of this presentation is to give the Council a summary of where we are at with supporting data on OSG
October 3rd, 2017
High Level Messages
Independent of jobs: the services provided by OSG in support of data vary, depending on the size of the needs of the communities we deal with.
Benchmarking HTCondor File Transfer

Initiated by GlueX and Jefferson Lab, who wanted to know whether a single submit host at JLab could support GlueX operational needs. The concern was primarily the IO in and out of the system. OSG ran the test on our own system first, then provided instructions for deployment at JLab, then repeated the test on their system and helped debug until the expected performance was achieved.
GlueX Requirements
Parameter    | GlueX Spec | OSG Test
Running jobs | 20,000     | 4,000
Output Size  | 10-100 MB  | 250 MB
Input Size   | 1-10 MB    | 1-10 MB
Job Runtime  | 8h - 9h    | 0.5 h
O(n, l, s) = nJobs × size / length = (20,000 × 90 MB) / (9 × 3600 s) ≈ 55.5 MB/s
The GlueX specs translate into 55.5 MB/s and a ~1 Hz transaction rate. We tested with 10× larger IO and 3× more transactions per second.
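The arithmetic above can be checked with a short script. Job count, output size, and runtime are taken from the spec table; the transaction-rate estimate additionally assumes one input and one output transfer per job, which is not stated in the slide:

```python
# Aggregate IO load implied by the GlueX spec table above.
n_jobs = 20_000          # concurrently running jobs
output_mb = 90           # output size per job, in MB (within the 10-100 MB spec)
runtime_s = 9 * 3600     # job runtime in seconds (upper end of the 8h-9h spec)

throughput = n_jobs * output_mb / runtime_s   # MB/s leaving the submit host
rate = 2 * n_jobs / runtime_s                 # transfers/s, assuming 1 input + 1 output per job

print(f"{throughput:.1f} MB/s, {rate:.2f} Hz")  # -> 55.6 MB/s, 1.23 Hz
```

The ~1.2 transfers per second is the origin of the "~1 Hz transaction rate" quoted above.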
Benchmarking Result
The test was run with 10 Gbps network bandwidth on the submit host, including communications with far-away worker nodes.
Put and Get at 100Gbps
OSG offers installation instructions for deploying a cluster of data servers, connected and seen by the clients as a single service, using Linux Virtual Server. This is the OSG strategy pursued for replacing SRM. It is also what LIGO used for its first gravitational-wave detection work on OSG.
Aside on Reducing Complexity
We are reducing the complexity of data support for the LHC in order to sustain it with less effort in the future. This includes eliminating X509 from OSG.
Caching via StashCache
Different communities have privacy from each other. Use requires that you have access to the StashCache deployed infrastructure.
[Diagram: two Xrootd origins (Origin A and Origin B), each backed by multiple XRootd data servers, and two Xrootd regional caches, each likewise backed by multiple XRootd data servers.]
One data origin per community; multiple caches across the US. Applications connect to a regional cache transparently. The regional cache asks the redirector for the location of a file; the redirector redirects it to the relevant origin, and the file gets cached in the regional cache.
This is a technology transfer from LHC with some OSG value added.
Caches at: BNL, FZU, UNL, Syracuse, UChicago, UCSD/SDSC, UIUC.
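The lookup path described above can be sketched in a few lines. This is an illustrative model only, not the actual XRootd implementation; all class and method names are invented for the sketch:

```python
# Illustrative model of the StashCache read path: a regional cache serves a
# file locally if present; otherwise it asks the redirector which origin
# holds the file, fetches it once, and caches it for later reads.

class Origin:
    """A community's single data origin."""
    def __init__(self, files):
        self.files = files                         # path -> bytes

    def fetch(self, path):
        return self.files[path]

class Redirector:
    """Maps a file path to the origin of the community that owns it."""
    def __init__(self, origins):
        self.origins = origins                     # community name -> Origin

    def locate(self, path):
        community = path.split("/")[1]             # e.g. "/ligo/frames/f1" -> "ligo"
        return self.origins[community]             # one data origin per community

class RegionalCache:
    def __init__(self, redirector):
        self.redirector = redirector
        self.store = {}                            # locally cached path -> bytes

    def read(self, path):
        if path not in self.store:                 # cache miss
            origin = self.redirector.locate(path)  # ask the redirector
            self.store[path] = origin.fetch(path)  # pull once from the origin
        return self.store[path]                    # later reads stay local

# Hypothetical usage: first read goes to the origin, second is served locally.
cache = RegionalCache(Redirector({"ligo": Origin({"/ligo/frames/f1": b"data"})}))
cache.read("/ligo/frames/f1")
```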
Communities using StashCache
Deployed at Comet@SDSC, and potentially elsewhere.
Big Data beyond Big Science
OSG caching infrastructure used at up to ~10TB/hour for meta- or exo-genetics
StashCP Dashboard Info Last 3 Months
Dashboards Hosted at Kibana Instance at MWT2
StashCache Instances View 10/1 0:00 to 10/2 19:00
Details on data in/out, connections, errors, timeouts, retries, … for each cache are monitored.
Rucio and its use in Xenon1T
Rucio manages transfers between the experiment DAQ in Italy and various disk locations in the EU, Israel, and the US.
[Diagram: Xenon1T data flow. Raw data from the Xenon1T DAQ at LNGS (xe1t-datamanager, /data/xenon/raw/, 5 TB) passes through a DAQ upload buffer (5 TB) and is distributed by simultaneous Rucio transfers and tape upload to NIKHEF (Amsterdam, 2 TB), IN2P3 (Lyon, 2 TB), Stash/Login (Chicago, 3 TB), Midway/RCC (Chicago, 92 TB), Tape Backup (Stockholm, 5.6 PB), and Weizmann (Israel, 8 TB; not yet used). A Rucio server in Chicago coordinates the transfers; storage elements span disk space and tape. Raw-data uploads and downloads are handled by Ruciax and Cax, and processed data is analyzed by Hax.]
- Xenon1T adopted Rucio after joint evaluation with OSG.
- Other communities have expressed an interest in a similar evaluation.
- We are defining metrics for an evaluation project with LSST.
- The goal is to understand the technical concept underlying Rucio.
A Future without X509
We have already eliminated X509 for user job submission in OSG. The two remaining use cases are:
- Pilots being authenticated at CEs
- Users staging out data to Storage Endpoints from jobs
Problem Statement
- Jobs need to stage out data to a Storage Endpoint from the worker node.
- Authorization should be based on capability rather than personhood.
- A job should only be able to write to its OSG-VO's storage endpoint(s).
- SciTokens is our approach to accomplishing this.
Initial “Demo”
- The submit host generates a SciToken.
- The job stages out data to a directory at the Stash Endpoint using the HTTPS protocol.
- The Stash Endpoint is an Xrootd server => data staged out can be used for subsequent processing via StashCache.
- Users authenticate via Facebook/Google/DropBox login.
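The stage-out step amounts to an HTTPS PUT carrying the SciToken as a bearer token in the Authorization header. A minimal sketch using only the standard library; the endpoint URL and token value are placeholders, not real OSG endpoints:

```python
import urllib.request

def stageout_request(url, token, payload):
    """Build an HTTPS PUT that writes `payload` to a storage endpoint,
    presenting the SciToken as an HTTP bearer token."""
    req = urllib.request.Request(url, data=payload, method="PUT")
    req.add_header("Authorization", f"Bearer {token}")
    return req  # send with urllib.request.urlopen(req) against a live endpoint

# Hypothetical endpoint and token, for illustration only.
req = stageout_request("https://stash.example.org/user/out.dat",
                       "PLACEHOLDER_TOKEN", b"results")
```

No network connection is made until the request object is actually sent, so the token handling can be inspected locally.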
Status of SciToken
- The demo gives us an understanding of the viability of the basic concept.
- A community for evaluation and broader discussion exists.
- Results of the demo are available for sharing.
Summary & Conclusion
- Basic data services are available on OSG broadly, for anybody.
- More advanced services are provided for communities we support and/or NSF-funded projects.
- We continue to reduce the complexity of the software stack required to use data.
- The needed functionality now exists; the geek gap between Big Science and the rest of scientific endeavors has shrunk, and continues shrinking.