SLIDE 1

Grid Activities at High Energy Accelerator Research Organization (KEK)

Setsuya Kawabata, KEK Computing Research Center
May 4, 2006, ISGC2006 at ASGC

SLIDE 2

Outline

  • 1. LCG Testbed
      Collaboration with ICEPP on Grid deployment; testbed
  • 2. Grid CA at KEK
  • 3. Belle experiment
      Status of the experiment; data analysis; Grid activity; new B Factory computer system
  • 4. Deployment Plan
      New Central Information System
  • 5. Lattice QCD Grid
      New supercomputer system at KEK
  • 6. Collaboration with NAREGI
  • 7. Summary
SLIDE 3

  • 1. LCG Testbed

Since Nov. 2001, KEK and ICEPP have collaborated to study and gain experience with the functions of a regional center facility.

Gained experience with NorduGrid before LCG became available; HPSS performance tests in the NorduGrid environment, etc.

ATLAS Tier-2 Center at ICEPP, U. of Tokyo

The major facility will be installed in FY 2006. ⇒ Prof. Sakamoto's talk

KEK LCG testbed, developed since Sep. 2005

In collaboration with ICEPP: learning and updating technical skills on the LCG middleware (LCG 2.6 onward); implementation of ATLAS software for muon trigger simulation; organized an LCG installation training course, 17-19 Nov. 2005 at KEK.

SLIDE 4

LCG-2.6 Testbed

  • Functional test of middleware
  • Performance measurement of data sharing
  • ATLAS simulation software was installed to demonstrate regional resource sharing
  • Parallel processing of the Geant4 simulator with MPI

[Testbed diagram: ICEPP/U. Tokyo site (VO: Atlas_j, rcrd; PC farm of ~50 CPUs, dual-CPU PCs) and KEK site (VO: Atlas_j, g4med; JST cluster, dual Opteron × 20 nodes on a private network), linked via SuperSINET (1 Gbps). LCG services (UI, CE_torque, CE_lsf, SE_dpm, LFC_mysql, BDII, RB, MON, PX, VOMS) are mapped onto worker nodes WN1-WN6 at each site.]

SLIDE 5

  • 2. KEK Grid CA

KEK submitted a Grid CA application to the APGrid PMA in Nov. 2005. The KEK Grid CA was approved by the APGrid PMA and has been in production since January 2006.

It is the third production Grid CA in Japan. The NAREGI CA software was modified for use at KEK. https://gridca.kek.jp/

KEK employees and their collaborators are eligible for this service.
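The slides do not show the enrollment procedure. As a hedged illustration of the user side of any X.509-based Grid CA, a request starts from a key pair and a certificate signing request; a minimal sketch in Python (using the widely available `cryptography` library) follows. The subject DN fields are hypothetical, and the real KEK Grid CA, built on the NAREGI CA software, has its own enrollment interface.

```python
# Sketch: generate a key pair and a CSR of the kind a Grid CA could sign.
# The subject DN below is a hypothetical example, not KEK's real DN policy.
from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

# User key pair (grid certificates of that era were typically RSA).
key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

# Certificate signing request with an X.509 subject DN.
csr = (
    x509.CertificateSigningRequestBuilder()
    .subject_name(x509.Name([
        x509.NameAttribute(NameOID.COUNTRY_NAME, "JP"),
        x509.NameAttribute(NameOID.ORGANIZATION_NAME, "KEK"),
        x509.NameAttribute(NameOID.ORGANIZATIONAL_UNIT_NAME, "CRC"),
        x509.NameAttribute(NameOID.COMMON_NAME, "Taro Yamada"),  # hypothetical user
    ]))
    .sign(key, hashes.SHA256())
)

# Write the private key and the CSR in PEM form for submission to the CA.
with open("userkey.pem", "wb") as f:
    f.write(key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.TraditionalOpenSSL,
        serialization.NoEncryption(),  # a real key should be passphrase-protected
    ))
with open("userreq.pem", "wb") as f:
    f.write(csr.public_bytes(serialization.Encoding.PEM))
```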

SLIDE 6

  • 3. Belle Experiment

Belle Exp.

A B-meson factory using the world's highest-luminosity accelerator, the KEKB e+e− collider.

Accelerator: e+ (3.5 GeV) ~2.0 A; e− (8 GeV) ~1.36 A, with continuous injection from the linac

Luminosity: peak 1.627×10^34 cm^−2 s^−1

Improved bunch-bunch interference and electron-cloud effects; luminosity will be improved much further by the crab cavity in 2006.

SLIDE 7

Belle Detector

[Detector diagram: μ/KL detection: 14/15 layers of RPC + Fe; tracking + dE/dx: small-cell drift chamber with He/C2H6 gas; CsI(Tl) calorimeter, 16 X0; aerogel Cherenkov counter, n = 1.015-1.030; Si vertex detector, 3 layers of DSSD; TOF counter; superconducting solenoid, 1.5 T; 8 GeV e− and 3.5 GeV e+ beams.]

SLIDE 8

Integrated Luminosity Trend

[Plot: integrated luminosity ∫L dt (fb−1, scale up to ~1,500) vs. year, 2002-2011, with the ∫L dt = 1,000 fb−1 milestone marked. Present KEKB: L = 1.5×10^34; planned "Super-KEKB": L = 10^36, via larger beam current, smaller βy*, a long-bunch option, and crab crossing. Constraints: 8 GeV × 3.5 GeV beams, wall-plug power < 100 MW, crossing angle < 30 mrad.]

SLIDE 9

Experimental Data and its Sharing within the Belle Collaboration via SRB

Data accumulated so far: 1.5 PB including simulation data; recent data acquisition rate ~1.0 TB/day.

SRB servers for the real-data storage system were implemented in Aug. 2005. Data is currently actively shared among KEK, U. Melbourne (Australia), and Nagoya Univ.; target storage space 120 TB.

Files registered to MCAT: ~423 files as of Sep. 6.
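Client-side usage is not shown in the slides. As a hedged sketch, sharing a file through SRB would look roughly like the following, assuming the standard SRB "Scommands" client tools (Sinit, Sput, Sls, Sget, Sexit) are installed and a ~/.srb configuration pointing at the Belle MCAT already exists; the collection path is hypothetical.

```python
# Sketch: registering and retrieving a Belle data file through SRB,
# driven via the standard SRB "Scommands" command-line client.
# The collection path /KEK/home/belle.data is a hypothetical example.
import subprocess

def srb(*args: str) -> str:
    """Run one Scommand and return its stdout, raising on failure."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

srb("Sinit")                                              # start a session from ~/.srb
srb("Sput", "run001234.mdst", "/KEK/home/belle.data/")    # register a local file
print(srb("Sls", "/KEK/home/belle.data/"))                # list the collection (queries MCAT)
srb("Sget", "/KEK/home/belle.data/run001234.mdst", ".")   # copy it back locally
srb("Sexit")                                              # end the session
```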

SLIDE 10

The Belle SRB system

[Diagram: the Belle SRB system. At KEK, behind the Belle and KEK firewalls: an SRB server with MCAT, an HSM server with tape library and HSM disk, NFS, and RAID on the KEK network. MCAT federation with the remote sites: an MES server at ANU and an SRB server at Melbourne Univ. (Australia), reached over the Internet; SRB, MES, and DB servers with their own MCAT at Nagoya Univ., and an SRB server at Tohoku Univ., reached over Super SINET.]

SLIDE 11

Pre-production LCG site for Belle

(JP-KEK-CRC-01)

The pre-production site was built with LCG 2.7 in March 2006. Certification by APROC for registration in the GOCDB was completed at the end of March. The new VO "Belle" has been registered with LCG/EGEE as a global VO. Initial collaboration sites expected:

Melbourne, ASGC, Krakow, Jozef Stefan Institute (Slovenia), IHEP Vienna, and Nagoya U.
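As a hedged illustration of what the new VO enables: on an LCG 2.x user interface, a Belle user would typically describe a job in JDL and submit it through the resource broker. The commands (edg-job-submit with a --vo option) are the standard LCG 2.x tools, but the JDL contents and file names below are hypothetical examples, not taken from the talk.

```python
# Sketch: submitting a Belle job on an LCG 2.x UI via the resource broker.
# edg-job-submit is the standard LCG 2.x submission command; the JDL below
# (executable, sandboxes, VO) is a hypothetical example.
import subprocess

jdl = """\
Executable    = "/bin/sh";
Arguments     = "belle_analysis.sh";
InputSandbox  = {"belle_analysis.sh"};
StdOutput     = "stdout.log";
StdError      = "stderr.log";
OutputSandbox = {"stdout.log", "stderr.log"};
VirtualOrganisation = "belle";
"""

with open("belle_job.jdl", "w") as f:
    f.write(jdl)

# A valid grid/VOMS proxy is assumed to exist already on the UI.
out = subprocess.run(["edg-job-submit", "--vo", "belle", "belle_job.jdl"],
                     capture_output=True, text=True, check=True).stdout
print(out)  # the broker prints the job identifier (an https://... URL)
```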

SLIDE 12

LCG2.7 Node Layout

[Diagram: site JP-KEK-CRC-01; VOs: Belle, dteam; as of 8 Mar. 2006. Grid nodes: RB, CE, SE, MON, VOMS, UI, PX/BDII, LFC, plus a second CE and a disk server (1.4 TB RAID), some marked "not ready". Hosts in the .keklcg.jp domain sit behind DNS/NAT on the 192.168.1.0 private network: ce01 (192.168.1.1), se01 (192.168.1.2), mon (192.168.1.3), ce02 (192.168.1.4, not ready); worker nodes wn001-wn014 in two WN farms; a router/switch connects to the Internet. Service web site: http://hepdg.cc.kek.jp/service/]

SLIDE 13

New B Factory Computer System

History of the B Factory Computer System (the new system has been in service since March 23, 2006):

Performance \ Year                      1997- (4 yrs)    2001- (5 yrs)     2006- (6 yrs)
Computing Server (SPECint2000 rate)     ~100 (WS)        ~1,250 (WS+PC)    ~42,500 (PC)
Disk Capacity (TB)                      ~4               ~9                1,000 (1 PB)
Tape Library Capacity (TB)              160              620               3,500 (3.5 PB)
User Workstations (# of hosts)          25 WS + 68 X     23 WS + 100 PC    128 PC
Work Group Servers (# of hosts)         3 + (9)          11                80 + 16 FS

For scale, Moore's law (doubling every 1.5 years) gives 2^(4/1.5) ≈ 6.3 over 4 years and 2^(5/1.5) ≈ 10 over 5 years.

SLIDE 14

New B computer system

[Network diagram: KEK-DMZ, 130.87.104.0/22 (*.kek.jp): login and web servers. KEK-FB, 130.87.224.0/21 (*.kek.jp): RADIUS, web, and NTP/DNS/SMTP servers, behind the KEK firewall and an L3 switch. 130.87.192.0/24 (*.kek.jp): Grid, ITA, Windows, and LDAP servers (10 hosts). Backup network 172.22.28.0/24 (5 BK). A core switch connects 1140 computing servers (SC) on 172.22.32-43.0/24 and 10.34.32-43.0/24 to the huge storage system, the high-speed data-transfer system, and DAQ servers at the experimental hall. Belle farm on 172.17.x.x. Grid servers on 130.87.194.0/24 (KEK-BC), with Super SINET links to Nagoya, Tohoku, Tokyo Inst., Tokyo Univ., and AIST, and Internet access through the KEK firewall. Host counts: 80 WG, 128 UWS (plus SQL, APP, CVS, SRB servers), 16 WF. Batch scheduling by LSF; separate administration, work-group-server, and data-transfer networks behind an outer-net switch.

Legend: SC = computing server, WG = work group server, BK = backup server, WF = work file server, UWS = user workstation, SUS = software update server, ITA = administration server.]

SLIDE 15

New B Factory Computer System

Computing Server (CS)

CS + WG servers (80) = 1208 nodes = 2416 CPUs = 45,662 SPECint2000 rate = 8.7 THz. DELL PowerEdge 1855, Xeon 3.6 GHz × 2, 1 GB memory, Linux (CentOS on CS, RHEL on WGS). 1 enclosure = 10 nodes in 7U of rack space; 1 rack = 50 nodes; 25 racks in 4 arrays.

SLIDE 16

New B Factory Computer System

Storage System (SS): Disk

1,000 TB on 42 file servers (Nexsan + ADTeX + SystemWorks), SATA II 500 GB drives × ~2,000 (roughly 1.8 drive failures/day expected?).

HSM = 370 TB; non-HSM (no backup) = 630 TB.

Hardware: Nexsan SATABeast (42 drives / 4U / 21 TB); ADTeX Array Master (15 drives / 3U / 8 TB).

SLIDE 17

New B Factory Computer System

Storage System (SS): Tape

HSM: 3.5 PB, 60 drives, 13 servers; SAIT, 500 GB/volume, 30 MB/s per drive; PetaServ (SONY).

WFS backup: 90 TB, 12 drives, 3 servers; LTO-3, 400 GB/volume; NetVault.

SLIDE 18

New B Factory Computer System

System usage

User workstations (PCs) serve as network terminals; users log in to the work group servers (WGS; 4-5 persons/host). The 1208 computing servers are divided into 3 LSF (batch system) clusters. The WGS (80 servers) share the WFS (16 servers + 80 TB) over NFS for user home directories. Infrequently modified applications and libraries are held on 50 NFS servers, which the 1140 CS share. Experimental data (many files of large size) is transferred between the CS (1140) and the storage servers (42) using a simple TCP/socket application written by Belle (see the sketch below); the data is managed in cooperation with the DB system.

Storage System ⇔ Computing Server transfer performance spec:

CS/WGS 1/3 (540) ⇔ SS = 10 GB/s
CS 2/3 (1080) ⇔ SS/HSM = 0.5 GB/s
SS/HSM ⇔ SS/non-HSM = 0.5 GB/s
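The Belle transfer tool itself is not shown in the slides. A minimal sketch of the "simple TCP/socket" idea, with a hypothetical framing (an 8-byte length header) and no checksumming or DB bookkeeping, might look like this:

```python
# Minimal sketch of a "simple TCP/socket" file transfer in the spirit of the
# Belle-written tool. The framing is hypothetical; the real tool's protocol
# and its DB bookkeeping are not described in the slides.
import socket
import struct

CHUNK = 1 << 20  # 1 MiB read/write units

def send_file(path: str, host: str, port: int) -> None:
    """Stream one file: an 8-byte big-endian length header, then raw bytes."""
    with open(path, "rb") as f, socket.create_connection((host, port)) as sock:
        size = f.seek(0, 2)  # seek to end to learn the file size
        f.seek(0)
        sock.sendall(struct.pack("!Q", size))
        while chunk := f.read(CHUNK):
            sock.sendall(chunk)

def recv_file(path: str, port: int) -> None:
    """Accept one connection and write the announced number of bytes to path."""
    with socket.create_server(("", port)) as srv:
        conn, _addr = srv.accept()
        with conn, conn.makefile("rb") as stream, open(path, "wb") as f:
            (remaining,) = struct.unpack("!Q", stream.read(8))
            while remaining:
                chunk = stream.read(min(CHUNK, remaining))
                if not chunk:
                    raise ConnectionError("peer closed the connection early")
                f.write(chunk)
                remaining -= len(chunk)
```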

SLIDE 19

  • 4. LCG Deployment plan at KEK

New Computer Systems

Central Information System since Feb. 20, 2006; B Factory Computer System since Mar. 23, 2006.

1st Phase

LCG and SRB for production use are available on the Grid system in the new Central Information System. Not for public use, but for supporting projects. Under system maintenance contracted with IBM Japan. WN: 36 nodes × 2 = 72 CPUs. Storage: disk (2 TB) + HPSS (~200 TB). Supported VOs: Belle, APDG, Atlas_J. Service starts in ~May 2006.

2nd Phase

Full support in the Belle production system.

SLIDE 20

Central Computing System : KEKcc

Work Server, Computing Server

Program development, job submission, etc. AMD Opteron 252 2.6 GHz dual; Red Hat Enterprise Linux; Platform LSF HPC.

Software

CERNLIB, Geant4, CLHEP, and ROOT; Monte Carlo simulation codes.

Storage System

Disk storage 45 TBytes, HPSS 200TBytes

Mail System : KEKmail

PostKEK (for the research divisions), MailKEK (for the administration divisions); mailing lists, anti-spam, anti-virus.

Web Systems

KEK Official system, Researchers’ system, Conference system

Grid System

LHC data grid (LCG), Storage Resource Broker (SRB).

Central Information System (in service from February 20, 2006)

SLIDE 21

KEK Central Information System

[System diagram: subnets DMZ, LAN, Data-LAN, Batch-LAN, Admin-LAN, Mail-LAN, and Grid-LAN, connected to the Internet/SINET through a firewall. Hardware by subsystem:]

Access Server: IBM xSeries 336 × 16; Intel Xeon 3 GHz, 1 GB memory
Computing Server: IBM eServer 326 × 64; AMD Opteron 252 2.6 GHz × 2, 4 GB memory
Work Server: IBM eServer 326 × 12; AMD Opteron 252 2.6 GHz × 2, 4 GB memory
Linux Configuration Server: IBM xSeries 345 × 1; Intel Xeon 3 GHz, 1 GB memory
Web System: IBM pSeries 520 × 2; 2-way 1.5 GHz POWER5, 8 GB memory
Web Staging Server: IBM pSeries 520 × 2; 2-way 1.5 GHz POWER5, 8 GB memory
Agenda Web: IBM xSeries 336 × 2; Intel Xeon 3 GHz, 1 GB memory
Mail System (POST/MAIL): IBM pSeries 570 × 2, pSeries 510 × 2, xSeries 336 × 16; IBM DS4300/DS4100
Grid System: LCG system: IBM xSeries 336 × 10, eServer 326 × 36, pSeries 520 × 1; SRB system: IBM xSeries 255 × 1; AFS system: IBM xSeries 336 × 3
Grid CA Access Server: IBM xSeries 226 × 1
Grid CA Server: IBM xSeries 226 × 1
Library System: IBM pSeries 520 × 2
Software Server: IBM pSeries 520 × 1
Print Server and Printers: IBM eServer 226 × 2; Fuji Xerox DocuPrint C2426 × 26, DocuPrint C3140 × 9; FXPS Phaser 8550DP × 11; HP Designjet 5500psUV × 1
NTP Server: IBM eServer 336 × 3; SEIKO TS2530 (GPS) × 2
PC Configuration System: IBM xSeries 336 × 2, 346 × 5
X Terminals: IBM IntelliStation M Pro Model 3J9 × 10
Shared PCs: IBM ThinkPad R52 × 15; Apple PowerBook G4 × 5
Development System: IBM pSeries 520/510, xSeries 336/346/236, eServer 326; IBM DS4100; 3590 tape drive

Central Computing System (KEKCC):
Disk Storage: IBM pSeries 520 × 7; 2-way 1.65 GHz POWER5, 4 GB/1 GB memory; IBM DS4300 (45 TB); IBM 3584 tape drive
HPSS: IBM pSeries 520 × 4 / 550 × 3; 2-way 1.65 GHz POWER5, 2 GB/4 GB memory; IBM xSeries 346 × 1 (Intel Xeon 3 GHz × 2, 2 GB memory); IBM DS4300 (10 TB); IBM 3494 tape drive (320 TB)

SLIDE 22

KEK CRC Production GRID Overview

  • LCG System (JP-KEK-CRC-02)

[Diagram: LCG computing elements (SLC3) rlc01-rlc12 and rlc13-rlc36 behind CE1/CE2 with worker nodes; a classic SE for tests; Grid LCG servers rls01-rls10 (Scientific Linux CERN) hosting RB/BDII, PX/BDII, VOMS, LFC, UI, and MON/GridICE (rls09, rls10); a file server + disk (AIX 5L 5.3) in front of HPSS.]

IBM xSeries 336 (10 units)

  • EM64T Xeon 3.0 GHz × 1
  • 1024 MB RAM
  • 36.4 GB HDD × 2 (RAID-1)
  • Gigabit Ethernet × 2 + 1

OS: Scientific Linux CERN 3.0.6

IBM eServer 326 (36 units)

  • AMD Opteron 252 2.6 GHz × 2
  • 4096 MB RAM
  • 36.4 GB HDD × 2 (RAID-1)
  • Gigabit Ethernet × 2

OS: Scientific Linux CERN 3.0.6

Storage: disk 2 TB; HPSS 200 TB.

SLIDE 23

KEK CRC Production GRID Overview

  • SRB System

[Diagram: SRB server rsr01 (srbcert, RP) runs the SRB Master with PostgreSQL for the MCAT; UI host rac01 runs an SRB client and SRB server with NFS home directories (/home, /work); the Grid LCG file server + disk (AIX 5L 5.3) exports NFS/VFS and talks to HPSS (200 TB) through the HSI API.]

SLIDE 24
  • 5. Lattice QCD Data Grid

Lattice QCD data: in lattice QCD, the gauge field configurations are the essential data, generated by Monte Carlo algorithms.

Full QCD requires large computational resources.

Once generated, various correlation functions can be measured:

Hadron spectra, decay constants, matrix elements, etc.; exotic hadrons, interactions between hadrons.

Data size

Degrees of freedom: SU(3) link matrix × sites (x, y, z, t) × 4 directions; statistics of O(1000) configurations.

Various actions, sizes, parameters.

Extrapolations to the continuum, small quark mass, and large-volume limits; comparisons for consistency checks.

Sharing gauge configurations is now a worldwide movement.
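To make the data volumes concrete, here is a back-of-envelope estimate in the spirit of the slide; the 24^3 × 48 lattice size and double-precision storage are illustrative assumptions, not numbers from the talk:

```python
# Back-of-envelope size of one gauge configuration and one ensemble.
# The lattice size (24^3 x 48) and double precision are illustrative
# assumptions; the degrees of freedom follow the slide:
# SU(3) link matrix x sites(x,y,z,t) x 4 directions, O(1000) configurations.

sites = 24**3 * 48              # lattice sites (x, y, z, t)
links = sites * 4               # one SU(3) matrix per direction per site
bytes_per_link = 3 * 3 * 2 * 8  # 3x3 complex matrix in double precision

config_bytes = links * bytes_per_link
ensemble_bytes = config_bytes * 1000  # O(1000) statistics

print(f"one configuration : {config_bytes / 1e6:.0f} MB")    # ~382 MB
print(f"one ensemble      : {ensemble_bytes / 1e12:.2f} TB")  # ~0.38 TB
```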

SLIDE 25

HEPnet-J/sc

Domestic network for HEP theory over SuperSINET (NII), connecting the major lattice QCD sites: KEK, Tsukuba, Osaka, Kyoto, Kanazawa, Hiroshima. File mirroring among the sites.

SLIDE 26

New Supercomputer System

In service from March 1, 2006

Large-scale simulation for particle and nuclear physics research and accelerator-related scientific studies.

Hitachi SR11000 K1 system: 16 nodes, 2.15 Tflops (theoretical peak), with large memory capacity (32 GB/64 GB per node). IBM Blue Gene Solution: 10 racks, 57.3 Tflops (theoretical peak), a massively parallel system for lattice QCD simulation.

About 50 times faster than the former supercomputer system (Hitachi SR8000, 100-node system).

SLIDE 27

ILDG and JLDG

ILDG: International Lattice DataGrid

An international organization for data sharing, developing a mark-up language and middleware; it officially starts in June 2006. Several sites are already providing data: LQA (Lattice QCD Archive) at Tsukuba Univ.; Gauge Connection (NERSC, USA).

JLDG: Japan Lattice DataGrid

A national community to share lattice data over HEPnet-J/sc; provides data to ILDG; developing a file system and middleware (the interface to ILDG).

SLIDE 28

  • 6. Collaboration with NAREGI

Network Topology Map of SINET/SuperSINET (Feb. 2006)

Line speeds: international lines: Japan-U.S.A. 10 Gbps (to N.Y.) and 2.4 Gbps (to L.A.), Japan-Singapore 622 Mbps, Japan-Hong Kong 622 Mbps. Domestic: Super SINET 10 Gbps (32 nodes); SINET 100 Mbps-1 Gbps (44 nodes).

Number of SINET participating organizations (as of Feb. 1, 2006): National 81, Public 51, Private 273, Junior Colleges 68, Specialized Training Colleges 41, Inter-Univ. Res. Inst. Corp. 14, Others 182; Total 710.

SLIDE 29

National Research Grid Initiative (NAREGI) ⇒ Prof. Matsuoka's talk

  • Apr. 2003: MEXT funded NAREGI, a 5-year project

Led by Prof. Ken Miura (NII). Development of Grid infrastructure and applications for the promotion of the national economy; the target application is nanoscience and technology for new material design.

Players: computing & networking: NII, AIST, Titech; material scientists: IMS, U. Tokyo, Tohoku U., Kyushu U., KEK, ...; companies: Fujitsu, Hitachi, NEC.

Distributed facility: a computing Grid of up to 100 TFLOPS in total. Extended to 2010 as part of the National Peta-scale Computing Project.

[Map, as of 2004: application testbed at IMS, 10 TFLOPS (1,618 CPUs); software testbed at NII, 5 TFLOPS (896 CPUs); connected over SuperSINET via Tokyo and Nagoya.]

SLIDE 30

Cyber-Science Infrastructure (CSI)

[Diagram (National Institute of Informatics): CSI links Hokkaido-U, Tohoku-U, Tokyo-U, NII, Nagoya-U, Kyoto-U, Osaka-U, and Kyushu-U (plus Titech, Waseda-U, KEK, etc.) over SuperSINET and beyond, a lambda-based academic networking backbone. Elements: restructuring of university IT research resources; deployment of NAREGI middleware; virtual labs and live collaborations; UPKI, the national research PKI infrastructure; NAREGI outputs; GeNii (Global Environment for Networked Intellectual Information); NII-REO (Repository of Electronic Journals and Online Publications); extensive on-line publication of results; a management body with education & training; industry/societal feedback; international infrastructural collaboration.]

SLIDE 31

  • Prof. Ken Miura’s Summary at CHEP06

In the NAREGI project, seamless federation of heterogeneous resources is the primary objective. Computations in nanoscience/technology applications over the Grid are to be promoted, including participation from industry. Data Grid features have been added to NAREGI since FY '05. The NAREGI Grid middleware is to be adopted as one of the important components of the new Japanese Cyber-Science Infrastructure framework. The NAREGI project will provide the VO capabilities for the National Peta-scale Computing Project. International co-operation is essential.

SLIDE 32

KEK’s Commitment to CSI

NAREGI middleware

The current Data Grid features of NAREGI are not enough for high-energy physics applications; it is necessary to implement something like the LFC. KEK can play an important role for this purpose.

The proposal to the JST-CNRS Collaboration Program was approved.

For 3 years from 2006. Japanese side: NII, Titech, Tsukuba U., Osaka U., AIST, KEK (NAREGI). Interoperability with EGEE is a main objective.

KEK, in collaboration with the Lyon Computing Center, will establish an ILC Data Grid infrastructure to test the interoperability between the NAREGI and EGEE middlewares.

Developing a University PKI Scheme

KEK is working as a member of the development team.

SLIDE 33

  • 7. Summary

The KEK Grid CA was approved by the APGrid PMA and has been in production since January 2006.

Three new computer systems are successfully in operation:

Central Information System
B Factory Computer System
Supercomputer System

LCG and SRB for production use are available on the KEK Central Information System. The Japan Lattice Data Grid (JLDG) project has started its middleware development, which will utilize the KEK Supercomputer System. KEK has started a collaboration with the NAREGI group to make the NAREGI middleware interoperable with gLite.