Grid Activities at High Energy Accelerator Research Organization (KEK)
May 04, 2006
Setsuya Kawabata, KEK Computing Research Center
ISGC2006 at ASGC
Outline
- 1. LCG Testbed
  - Collaboration for Grid deployment with the ICEPP testbed
- 2. Grid CA at KEK
- 3. Belle experiment
  - Status of the experiment; data analysis; Grid activity; the new B Factory Computer System
- 4. Deployment plan
  - New Central Information System
- 5. Lattice QCD Grid
  - New supercomputer system at KEK
- 6. Collaboration with NAREGI
- 7. Summary
- 1. LCG Testbed

Since Nov. 2001, KEK and ICEPP have collaborated to study and gain experience with the functions of a regional center facility.

We gained experience with NorduGrid before LCG became available, including HPSS performance tests in the NorduGrid environment.

ATLAS Tier-2 Center at ICEPP, U. of Tokyo

The major facility will be installed in FY2006 (see Prof. Sakamoto's talk).

The KEK LCG testbed has been developed since Sep. 2005 in collaboration with ICEPP:
- Learning and updating technical skills on the LCG middleware (LCG 2.6 and later)
- Implementation of ATLAS software for the muon trigger simulation
- Organized an LCG installation training course, 17-19 Nov. 2005 at KEK
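For context, work on an LCG testbed of this era is described in JDL and submitted from the UI. The sketch below is illustrative only, not the actual KEK workflow; it assumes a configured LCG-2 UI with `edg-job-submit` on the PATH and a valid Grid proxy, and it reuses the testbed's atlas_j VO name.

```python
# Illustrative sketch: write a minimal JDL job description and submit it
# from an LCG-2 UI. Assumes a configured UI, edg-job-submit on PATH, and
# a valid proxy; the VO name "atlas_j" is taken from the testbed above.
import pathlib
import subprocess

jdl = '''Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};
'''
pathlib.Path("hello.jdl").write_text(jdl)

# Submit and print the job identifier returned by the resource broker.
result = subprocess.run(
    ["edg-job-submit", "--vo", "atlas_j", "hello.jdl"],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```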
LCG-2.6 Testbed
- Functional tests of the middleware
- Performance measurement of data sharing
- ATLAS simulation software was installed to demonstrate regional resource sharing
- Parallel processing of the Geant4 simulator with MPI (a minimal sketch follows this list)
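The MPI item refers to event-level parallelism: Geant4 events are statistically independent, so MPI ranks can simulate disjoint slices of the event sample. The sketch below shows only that pattern, using mpi4py with a dummy simulate() stand-in; it is not the actual Geant4 (C++) code used on the testbed.

```python
# Event-parallel pattern: each MPI rank simulates an independent slice of
# the event sample; per-rank results are reduced to rank 0 at the end.
from mpi4py import MPI

def simulate(event_id: int) -> int:
    """Dummy stand-in for one Geant4 event; returns a fake hit count."""
    return event_id % 7

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

N_EVENTS = 100_000
my_events = range(rank, N_EVENTS, size)   # round-robin split across ranks

local_hits = sum(simulate(e) for e in my_events)
total_hits = comm.reduce(local_hits, op=MPI.SUM, root=0)
if rank == 0:
    print("total hits:", total_hits)
```

Run, for example, with `mpirun -np 8 python sim.py`.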
[Testbed diagram: LCG service nodes (UI, CE_torque, CE_lsf, SE_dpm, LFC_mysql, BDII, RB, MON, PX, VOMS) mapped onto worker nodes WN1-WN6 at KEK and ICEPP/U. Tokyo, connected by SuperSINET (1 Gbps). VOs: atlas_j, rcrd, g4med. Resources include a PC farm of ~50 CPUs (dual-CPU PCs), the JST cluster, and 20 dual-Opteron nodes on a private network.]
- 2. KEK Grid CA
KEK submitted a Grid CA application to APGridPMA in Nov. 2005. The KEK Grid CA was approved by APGridPMA and has been in production since January 2006.

It is the third production Grid CA in Japan. The NAREGI CA software was modified for use at KEK. https://gridca.kek.jp/
KEK employees and their collaborators are eligible for this service.
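A certificate issued by the KEK Grid CA is an ordinary X.509 user certificate, conventionally stored as `~/.globus/usercert.pem`. The following sketch shows how a holder might inspect the subject DN and expiry; the file path and the use of the Python `cryptography` package are illustrative assumptions, not part of the CA service.

```python
# Sketch: inspect a Grid user certificate (standard Globus location assumed).
from pathlib import Path
from cryptography import x509

pem = Path("~/.globus/usercert.pem").expanduser().read_bytes()
cert = x509.load_pem_x509_certificate(pem)
print("Subject:", cert.subject.rfc4514_string())  # the DN the CA certified
print("Issuer: ", cert.issuer.rfc4514_string())   # should name the KEK Grid CA
print("Expires:", cert.not_valid_after)
```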
- 3. Belle Experiment
A B-meson factory using the world's highest-luminosity accelerator, the KEKB e+e- collider.

Accelerator:
- Beam currents: e+ (3.5 GeV) ~2.0 A; e- (8 GeV) ~1.36 A
- Continuous injection from the linac; improvements against bunch-bunch interference and the electron cloud effect

Luminosity:
- Peak: 1.627×10^34 cm^-2 s^-1
- Luminosity will be improved much further by the crab cavity in 2006.
Belle Detector
- Si vertex detector: 3 layers of DSSD
- Tracking + dE/dx: small-cell drift chamber with He/C2H6 gas
- Aerogel Cherenkov counter: n = 1.015-1.030
- TOF counter
- CsI(Tl) calorimeter: 16 X0
- μ / KL detection: 14/15 layers of RPC + Fe
- Superconducting solenoid: 1.5 T
- Beams: 8 GeV e- × 3.5 GeV e+
Integrated Luminosity Trend
[Plot: integrated luminosity ∫L dt (fb^-1, axis up to 1,500) vs. year, 2002-2011. The present KEKB, at L = 1.5×10^34 cm^-2 s^-1, reaches ∫L dt = 1,000 fb^-1. The planned "Super-KEKB" upgrade targets L = 10^36 via larger beam current, smaller βy*, a long-bunch option, and crab crossing, under the constraints of 8 GeV × 3.5 GeV beams, wall-plug power < 100 MW, and crossing angle < 30 mrad.]
Experimental Data and its Sharing in the Belle Collaboration via SRB

Data accumulated so far:
- 1.5 PB including simulation data; recent data acquisition rate ~1.0 TB/day

SRB servers for the real-data storage system were implemented in Aug. 2005. Data is currently actively shared among KEK, U. Melbourne (Australia), and Nagoya Univ.
- Target storage space: 120 TB
- Files registered to MCAT: ~423 files as of Sep. 6
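SRB access goes through the MCAT catalogue, and the usual client interface is the Scommands. The sketch below drives them from Python; the file name is hypothetical, and it assumes an installed SRB client with a configured ~/.srb environment.

```python
# Sketch of SRB usage via the Scommands CLI (file name is hypothetical).
import subprocess

def srb(*args: str) -> None:
    subprocess.run(list(args), check=True)

srb("Sinit")                               # open a session against the MCAT
srb("Sput", "run001.mdst", "run001.mdst")  # upload a file; it is registered in MCAT
srb("Sls")                                 # list the current SRB collection
srb("Sexit")                               # close the session
```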
The Belle SRB system
[Diagram: the Belle SRB system. At KEK, an SRB server with its MCAT sits behind the Belle and KEK firewalls, in front of an HSM server with HSM disk (NFS), RAID, and a tape library. Via MCAT federation over SuperSINET and the Internet, it connects to SRB/MES servers with MCAT and DB servers at Nagoya Univ. and Tohoku Univ., an SRB server at U. Melbourne, and an MES server at ANU (Australia).]
Pre-production LCG site for Belle
(JP-KEK-CRC-01)
The pre-production site was built with LCG 2.7 in March 2006. Certification by APROC for registration in the GOCDB was completed at the end of March. A new VO, Belle, has been registered with LCG/EGEE as a global VO. Initial collaboration sites expected:

Melbourne, ASGC, Krakow, Jozef Stefan Institute (Slovenia), IHEP Vienna, and Nagoya U.
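With Belle registered as a global VO, a collaborator at any of these sites authenticates against the VO before submitting work. A minimal sketch of that step, assuming a configured VOMS client and that the VO is published under the name "belle":

```python
# Sketch: obtain and inspect a VOMS proxy for the Belle VO.
import subprocess

subprocess.run(["voms-proxy-init", "-voms", "belle"], check=True)  # proxy with Belle attributes
subprocess.run(["voms-proxy-info", "-all"], check=True)            # show DN, FQANs, time left
```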
[Diagram: LCG 2.7 nodes of site JP-KEK-CRC-01 (VOs: Belle, dteam), as of 8 Mar. 2006. Service nodes: RB, CE, SE, MON, VOMS, UI, PX/BDII, LFC, a registry, a second CE (not yet ready), and a 1.4 TB RAID disk server. Hosts on the private network 192.168.1.0 behind DNS/NAT: ce01.keklcg.jp (192.168.1.1), se01.keklcg.jp (192.168.1.2), mon.keklcg.jp (192.168.1.3), ce02.keklcg.jp (192.168.1.4); worker-node farms wn001-wn014 in the .keklcg.jp domain, reaching the Internet through a router/switch. Service web site: http://hepdg.cc.kek.jp/service/]
New B Factory Computer System
The new B Factory Computer System has been in operation since March 23, 2006. History of the B Factory Computer System:

| Performance \ Year | 1997- (4 years) | 2001- (5 years) | 2006- (6 years) |
|---|---|---|---|
| Computing Server (SPECint2000 rate) | ~100 (WS) | ~1,250 (WS+PC) | ~42,500 (PC) |
| Disk capacity (TB) | ~4 | ~9 | 1,000 (1 PB) |
| Tape library capacity (TB) | 160 | 620 | 3,500 (3.5 PB) |
| User workstations (# of hosts) | 25 WS + 68 X | 23 WS + 100 PC | 128 PC |
| Work group servers (# of hosts) | 3+(9) | 11 | 80 + 16 FS |

For reference, Moore's law (doubling every 1.5 years) gives ~6.3× over 4 years and ~10× over 5 years.
[Diagram: network of the new B Factory computer system. Behind the KEK firewall and an L3 switch sit the KEK-DMZ (130.87.104.0/22, login and web servers), KEK-FB (130.87.224.0/21, RADIUS, web, NTP/DNS/SMTP), and a 130.87.192.0/24 segment. A core switch connects 80 work group servers (WG) with software update servers (SUS), 128 user workstations (UWS) with SQL/APP/CVS/SRB servers, 10 Grid/ITA/Windows/LDAP servers, 16 work file servers (WF), 5 backup servers (BK, 172.22.28.0/24), and 1,140 computing servers (SC, 172.22.32-43.0/24 and 10.34.32-43.0/24, scheduled by LSF) together with the huge storage system, the high-speed data transfer system, and the DAQ servers at the experimental hall. Grid servers (130.87.194.0/24, KEK-BC) reach Nagoya, Tohoku, Tokyo Inst., Tokyo Univ., and AIST over SuperSINET; the Belle farm uses 172.17.x.x. Legend: SC = computing server, WG = work group server, BK = backup server, WF = work file server, UWS = user workstation, SUS = software update server, ITA = administration server.]
New B Factory Computer System
Computing Server (CS)
- CS + WG servers (80) = 1,208 nodes = 2,416 CPUs = 45,662 SPECint2000 rate ≈ 8.7 THz
- DELL PowerEdge 1855: Xeon 3.6 GHz ×2, 1 GB memory; Linux (CentOS on CS, RHEL on WGS)
- 1 enclosure = 10 nodes in 7U of space; 1 rack = 50 nodes; 25 racks in 4 arrays
New B Factory Computer System
Storage System (SS): disk
- 1,000 TB with 42 file servers; Nexsan + ADTeX + SystemWorks
- ~2,000 SATA II 500 GB drives (~1.8 failures/day?)
- HSM = 370 TB; non-HSM (no backup) = 630 TB
- Hardware: Nexsan SATABeast (42 drives / 4U / 21 TB) and ADTeX Array Master (15 drives / 3U / 8 TB)
Storage System (SS): tape
- HSM: 3.5 PB, 60 drives, 13 servers; SAIT, 500 GB/volume, 30 MB/s per drive; PetaServ (SONY)
- WFS backup: 90 TB, 12 drives, 3 servers; LTO3, 400 GB/volume; NetVault
New B Factory Computer System
System usage
- User workstations (PCs) are used as network terminals; users log in to a work group server (WGS; 4-5 persons per host).
- The 1,208 computing servers are divided into 3 LSF (batch system) clusters.
- The WGS (80 servers) share the WFS (16 servers + 80 TB) as NFS user home directories.
- Infrequently modified applications and libraries are held on 50 NFS servers, shared by the 1,140 CS.
- Experimental data (many large files) is transferred between the CS (1,140) and the storage servers (42) using a simple TCP/socket application written by Belle (an illustrative sketch follows this list). The data is managed in cooperation with the DB system.
- Storage system ⇔ computing server transfer performance specification:
  - CS/WGS 1/3 (540) ⇔ SS: 10 GB/s
  - CS 2/3 (1,080) ⇔ SS/HSM: 0.5 GB/s
  - SS/HSM ⇔ SS/non-HSM: 0.5 GB/s
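The Belle transfer tool itself is not public; the sketch below merely illustrates the "simple TCP/socket application" idea with a single-stream sender and receiver. Host, port, and chunk size are arbitrary illustrative choices.

```python
# Illustrative single-stream TCP file transfer in the spirit of the
# "simple TCP/socket application" mentioned above (not the Belle tool).
import socket

CHUNK = 1 << 20  # 1 MiB read/write unit (arbitrary choice)

def send_file(path: str, host: str, port: int) -> None:
    with socket.create_connection((host, port)) as s, open(path, "rb") as f:
        while chunk := f.read(CHUNK):
            s.sendall(chunk)

def receive_file(path: str, port: int) -> None:
    with socket.create_server(("", port)) as srv:
        conn, _ = srv.accept()
        with conn, open(path, "wb") as f:
            while data := conn.recv(CHUNK):
                f.write(data)
```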
- 4. LCG Deployment plan at KEK
New Computer Systems
- Central Information System: in operation since Feb. 20, 2006
- B Factory Computer System: in operation since Mar. 23, 2006
1st Phase
- LCG and SRB for production use are available on the Grid system of the new Central Information System.
- Not for public use, but for supporting projects; under system maintenance contracted to IBM Japan.
- WN: 36 nodes × 2 = 72 CPUs; storage: disk (2 TB) + HPSS (~200 TB)
- Supported VOs: Belle, APDG, Atlas_J
- Service starts around May 2006.
2nd Phase
Full support in the Belle production system
Central Computing System : KEKcc
Work Server, Computing Server
Program development, job submission, etc. AMD Opteron 252 2.6 GHz dual; Red Hat Enterprise Linux; Platform LSF HPC
Software
CERNLIB, Geant4, CLHEP, and ROOT; Monte Carlo simulation codes
Storage System
Disk storage 45 TB, HPSS 200 TB
Mail System : KEKmail
PostKEK (for the research divisions), MailKEK (for the administration divisions); mailing lists, anti-spam, anti-virus
Web Systems
KEK Official system, Researchers’ system, Conference system
Grid System
The LHC Computing Grid (LCG) and the Storage Resource Broker (SRB)
Central Information System
In operation since February 20, 2006
KEK Central Information System
[Diagram: the KEK Central Information System, segmented into DMZ, LAN, Data-LAN, Batch-LAN, Admin-LAN, Mail-LAN, and Grid-LAN behind a firewall connected to the Internet/SINET. Components of the Central Computing System (KEKCC) and services:]

- Access Server: IBM xSeries 336 ×16 (Intel Xeon 3 GHz, 1 GB memory)
- Computing Server: IBM eServer 326 ×64 (AMD Opteron 252 2.6 GHz ×2, 4 GB memory)
- Work Server: IBM eServer 326 ×12 (AMD Opteron 252 2.6 GHz ×2, 4 GB memory)
- Linux Configuration Server: IBM xSeries 345 ×1 (Intel Xeon 3 GHz, 1 GB memory)
- Web System: IBM pSeries 520 ×2 (2-way 1.5 GHz POWER5, 8 GB memory)
- Web Staging Server: IBM pSeries 520 ×2 (2-way 1.5 GHz POWER5, 8 GB memory)
- Agenda Web: IBM xSeries 336 ×2 (Intel Xeon 3 GHz, 1 GB memory)
- Mail System (POST/MAIL): IBM pSeries 570 ×2, pSeries 510 ×2, xSeries 336 ×16, DS4300/DS4100
- GRID System: LCG system (IBM xSeries 336 ×10, eServer 326 ×36, pSeries 520 ×1), SRB system (xSeries 255 ×1), AFS system (xSeries 336 ×3)
- GRID CA Access Server: IBM xSeries 226 ×1
- GRID CA Server: IBM xSeries 226 ×1
- Library System: IBM pSeries 520 ×2
- Software Server: IBM pSeries 520 ×1
- Print Server and Printers: IBM eServer 226 ×2; Fuji Xerox DocuPrint C2426 ×26, DocuPrint C3140 ×9, FXPS Phaser 8550DP ×11, HP Designjet 5500psUV ×1
- NTP Server: IBM eServer 336 ×3; SEIKO TS2530 (GPS) ×2
- PC Configuration System: IBM xSeries 336 ×2 / 346 ×5
- X Terminal: IBM IntelliStation M Pro Model 3J9 ×10
- Shared PC: IBM ThinkPad R52 ×15, Apple PowerBook G4 ×5
- Development System: IBM pSeries 520/510, xSeries 336/346/236, eServer 326, DS4100, 3590 tape drive
- Disk Storage: IBM pSeries 520 ×7 (2-way 1.65 GHz POWER5, 4 GB/1 GB memory); IBM DS4300 (45 TB); IBM 3584 tape drive
- HPSS: IBM pSeries 520 ×4 / 550 ×3 (2-way 1.65 GHz POWER5, 2 GB/4 GB memory); IBM xSeries 346 ×1 (Intel Xeon 3 GHz ×2, 2 GB memory); IBM DS4300 (10 TB); IBM 3494 tape library (320 TB)
KEK CRC Production GRID Overview
- LCG System - (JP-KEK-CRC-02)
[Diagram: LCG computing elements rlc01-rlc12 and rlc13-rlc36 (SLC3) serving as worker nodes behind CE1 and CE2; Grid LCG service servers rls01-rls10 (Scientific Linux CERN) hosting the SE (classic), RB/BDII, PX/BDII, VOMS, LFC, UI, and MON/GridICE (rls09, rls10 for tests); a file server + disk (AIX 5L 5.3) in front of HPSS.]
IBM xSeries 336 ( 10 Unit )
- EM64T Xeon 3.0GHz x 1
- 1024MB RAM
- 36.4GB HDD x 2 (RAID-1)
- Gigabit Ethernet x 2 + 1
OS : Scientific Linux CERN 3.0.6
IBM eServer 326 ( 36 Unit )
- AMD Opteron 252 2.6GHz x 2
- 4096MB RAM
- 36.4GB HDD x 2 (RAID-1)
- Gigabit Ethernet x 2
OS : Scientific Linux CERN 3.0.6
Storage: 2 TB disk; 200 TB HPSS
KEK CRC Production GRID Overview
- SRB System -
[Diagram: the SRB server rsr01 runs the SRB master with PostgreSQL for the MCAT (with an srbcert RP); the UI host rac01 runs the SRB client and an SRB server with NFS home directories (/home, /work). The Grid LCG file server + disk (AIX 5L 5.3) fronts the 200 TB HPSS, accessed via NFS/VFS and the HSI API.]
- 5. Lattice QCD Data Grid
Lattice QCD data
- In lattice QCD, the gauge field configurations are the essential data, generated by Monte Carlo algorithms.
- Full QCD requires large computational resources.
- Once generated, various correlation functions can be measured: hadron spectra, decay constants, matrix elements, etc., as well as exotic hadrons and interactions between hadrons.

Data size
- Degrees of freedom: SU(3) × sites (x, y, z, t) × 4 directions, with statistics of O(1000) configurations (a worked size estimate follows this list).
- Various actions, lattice sizes, and parameters.
- Extrapolations to the continuum, small quark mass, and large volume limits; comparisons for consistency checks.

Sharing gauge configurations is now a worldwide movement.
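To make the volume concrete, here is a worked estimate from the counting above; the 24^3×64 lattice size and double precision are assumptions for illustration.

```python
# Worked size estimate for one gauge configuration (illustrative lattice).
L, T = 24, 64                 # spatial/temporal extent -- assumed example values
sites = L**3 * T              # lattice sites (x, y, z, t)
links = 4 * sites             # one SU(3) link matrix per direction per site
reals = 18 * links            # SU(3): 3x3 complex entries = 18 real numbers
size = 8 * reals              # double precision, 8 bytes per real
print(f"{size / 1e9:.2f} GB per configuration")            # ~0.51 GB
print(f"{1000 * size / 1e12:.2f} TB for O(1000) configs")  # ~0.51 TB
```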
HEPnet-J/sc
- Domestic network for HEP theory, on SuperSINET (NII)
- Connects major lattice QCD sites: KEK, Tsukuba, Osaka, Kyoto, Kanazawa, Hiroshima
- File mirroring
New Supercomputer System
In operation since March 1, 2006
Large Scale Simulation for particle and nuclear physics research and accelerator-related scientific studies
- Hitachi SR11000 K1 system, 16 nodes: 2.15 TFLOPS theoretical peak; large memory capacity of 32 GB/64 GB per node
- IBM Blue Gene Solution, 10 racks: 57.3 TFLOPS theoretical peak; massively parallel system for lattice QCD simulation
- About 50 times faster than the former supercomputer system (the 100-node Hitachi SR8000)
ILDG and JLDG
ILDG: International Lattice DataGrid
International organization for data sharing; developing a mark-up language and middleware. Officially starts in June 2006. Several sites are already providing data:
LQA (Lattice QCD Archive) at Tsukuba Univ.; Gauge Connection (NERSC, USA)
JLDG: Japan Lattice DataGrid
National community to share lattice data over HEPnet-J/sc; provides data to ILDG. Developing a file system and middleware (interface to ILDG).
- 6. Collaboration with NAREGI

[Figure: network topology map of SINET/SuperSINET (Feb. 2006)]

Line speeds: SuperSINET (32 nodes) at 10 Gbps; SINET (44 nodes) at 100 Mbps-1 Gbps. International lines: Japan-U.S.A. 10 Gbps (to N.Y.) and 2.4 Gbps (to L.A.); Japan-Singapore 622 Mbps; Japan-Hong Kong 622 Mbps.

Number of SINET participating organizations (as of Feb. 1, 2006):

| Category | Number |
|---|---|
| National | 81 |
| Public | 51 |
| Private | 273 |
| Junior colleges | 68 |
| Specialized training colleges | 41 |
| Inter-Univ. Res. Inst. Corp. | 14 |
| Others | 182 |
| Total | 710 |
National Research Grid Initiative (NAREGI) (see Prof. Matsuoka's talk)
- Apr. 2003: MEXT funded NAREGI as a 5-year project
- Led by Prof. Ken Miura (NII)
- Development of Grid infrastructure and applications for the promotion of the national economy; the target application is nanoscience and technology for new material design
- Players:
  - Computing & networking: NII, AIST, TITECH
  - Material scientists: IMS, U. Tokyo, Tohoku U., Kyushu U., KEK, ...
  - Companies: Fujitsu, Hitachi, NEC
- Distributed facility: computing Grid of up to 100 TFLOPS in total
- Extended to 2010 as part of the National Peta-scale Computing Project
[Diagram, as of 2004: an application testbed of 10 TFLOPS (1,618 CPUs) at IMS (Nagoya area) and a software testbed of 5 TFLOPS (896 CPUs) at NII (Tokyo).]
[Diagram: NII's Cyber-Science Infrastructure (CSI) concept, restructuring university IT research resources. Components: deployment of NAREGI middleware; UPKI, the national research PKI infrastructure; SuperSINET and beyond, a lambda-based academic networking backbone linking Hokkaido-U, Tohoku-U, Tokyo-U, NII, Nagoya-U, Kyoto-U, Osaka-U, and Kyushu-U (plus Titech, Waseda-U, KEK, etc.); virtual labs and live collaborations; extensive on-line publication of results; a management body with education and training; industry/societal feedback and international infrastructural collaboration. NAREGI outputs feed GeNii (Global Environment for Networked Intellectual Information) and NII-REO (Repository of Electronic Journals and Online Publications).]
- Prof. Ken Miura’s Summary at CHEP06
- In the NAREGI project, seamless federation of heterogeneous resources is the primary objective.
- Computation in nanoscience/technology applications over the Grid is to be promoted, including participation from industry.
- Data Grid features have been added to NAREGI since FY2005.
- NAREGI Grid middleware is to be adopted as one of the important components of the new Japanese Cyber Science Infrastructure framework.
- The NAREGI project will provide the VO capabilities for the National Peta-scale Computing Project.
- International cooperation is essential.
KEK’s Commitment to CSI
NAREGI middleware
- The current Data Grid features of NAREGI are not sufficient for high energy physics applications; it is necessary to implement something like the LFC.
- KEK can play an important role for this purpose.
A proposal to the JST-CNRS Collaboration Program was approved:
- For 3 years from 2006; Japanese side (NAREGI): NII, TITECH, Tsukuba U., Osaka U., AIST, KEK
- Interoperability with EGEE is a main objective.
KEK, in collaboration with the Lyon Computing Center, will establish an ILC Data Grid infrastructure to test the interoperability between the NAREGI and EGEE middleware.
Developing University PKI Scheme
KEK is working as a member of the development team.
- 7. Summary