NGS Resource Broker

SLIDE 1

http://www.ngs.ac.uk http://www.nesc.ac.uk/training

NGS Resource Broker

Presented by Mike Mineter
Slides from: Matthew Viljoen, STFC RAL
Grid Deployment Group, RAL
http://www.grid-support.ac.uk/

SLIDE 2

Talk Outline

  • Introduction & Background
  • What is a Resource Broker (RB)?
  • gLite WMS-LB
  • Future work

SLIDE 3

Introduction

Grid Deployment Group, RAL

  • NGS Helpdesk
  • Services: CA, MyProxy, GSI-SSHTerm

– NGS: BDII, RB, Monitoring etc.
– EGEE: GOCDB, UKI ROC etc.

http://www.ngs.ac.uk support@grid-support.ac.uk

SLIDE 4

RB Background

– NGS started in 2004
– GT2.4-based middleware
– Meanwhile EGEE deployed new Resource Broker
– “Workload Management System”
– RB always high on NGS wish-list
– gLite RB+UI: pre-production from 2007/03
– Core sites RB-compliant
– RB: now available for users

SLIDE 5

RB – What is it (and isn't)…

  • [...] component to allow users to submit jobs and performs all tasks required to submit them, without exposing the user to the complexities of the Grid¹

  • An interface to Grid resources
  • It can:

– choose the best resource to run your job

  • It enables:

– resources to scale transparently
– load balancing

  • It is not anything to do with the Storage Resource Broker (SRB)!

SLIDE 6

Before…

[Figure: Users connected directly to Nodes]

  • Direct interaction with nodes
  • Need to know resource addresses, capabilities

SLIDE 7

With a Resource Broker...

[Figure: Users connected to Nodes through the RB]

  • User doesn’t care where jobs are run
  • Faster results
  • Easier, more scalable – get benefit of new nodes

SLIDE 8

NGS Resource Broker

  • Based on gLite WMS-LB from EGEE
  • Can send jobs to other grids
  • Can be used from:

– web portals (P-GRADE, in future NGS Portal)
– a User Interface (UI) by command line

SLIDE 9

Enabling Grids for E-sciencE

Major components

[Figure: major components. Boxes: “User interface”, Resource Broker, Information Service, Author. & Authen., Logging & Book-keeping, Computing Element, Storage Element. Labelled flows: Input “sandbox”, Output “sandbox”, Job Submit Event, Job Query, Publish, Job Status.]

INFSO-RI-508833

SLIDE 10

Simple Workflow

1. Log onto User Interface box
2. Write job description in JDL, + required files. Specify resource requirements
3. Submit job with glite-job-submit

<RB chooses best resource matching description>

4. Check status with glite-job-status

Waiting → Ready → Scheduled → Running → Done

5. Retrieve output with glite-job-output
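
The status line in step 4 can be read as an ordered sequence of states. A minimal Python sketch, assuming only the five states shown on the slide (real gLite jobs can also report states such as Aborted or Cancelled, which are not modelled here). The names `STATES`, `is_finished` and `has_progressed` are hypothetical helpers, not part of any gLite client:

```python
# Ordered job states exactly as listed on the slide.
STATES = ["Waiting", "Ready", "Scheduled", "Running", "Done"]


def is_finished(state: str) -> bool:
    """Return True once the job has reached the final state on the slide."""
    return state == STATES[-1]


def has_progressed(previous: str, current: str) -> bool:
    """Return True if `current` comes later in the sequence than `previous`."""
    return STATES.index(current) > STATES.index(previous)


if __name__ == "__main__":
    # Walk through a hypothetical series of status checks.
    observed = ["Waiting", "Scheduled", "Running", "Done"]
    for before, after in zip(observed, observed[1:]):
        print(f"{before} -> {after}: progressed={has_progressed(before, after)}")
    print("finished:", is_finished(observed[-1]))
```

A script that polls glite-job-status could use a check like `is_finished` to decide when it is worth calling glite-job-output.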

SLIDE 11

Sample JDL file

Type = "Job";
JobType = "Normal";
Executable = "/usr/ngs/GAUSSIAN_G03_C02";
StdInput = "/usr/local/applications/chemistry/gaussian/g03_C02/g03/tests/com/test001.com";
StdOutput = "test001.out";
StdError = "test001.err";
OutputSandbox = {"test001.out", "test001.err"};
RetryCount = 4;
ShallowRetryCount = -1;
Requirements = Member("GAUSSIAN_G03_C02", other.GlueHostApplicationSoftwareRunTimeEnvironment) && Mds-Computer-platform == "i686";
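
To illustrate what the Requirements expression asks for, here is a minimal Python sketch of the same test applied to resource descriptions. The dictionaries, their keys and the three example resources are invented for illustration; the real broker evaluates the expression against attributes published by the Information Service.

```python
def member(item, collection):
    """Mimic the JDL Member() test: is `item` in the list attribute?"""
    return item in collection


def satisfies_requirements(resource):
    """Mirror the Requirements line of the sample JDL for one resource."""
    return (
        member("GAUSSIAN_G03_C02", resource["runtime_environment"])
        and resource["platform"] == "i686"
    )


# Hypothetical resource descriptions (not real NGS data).
resources = [
    {"name": "ce-a", "runtime_environment": ["GAUSSIAN_G03_C02"], "platform": "i686"},
    {"name": "ce-b", "runtime_environment": ["GAUSSIAN_G03_C02"], "platform": "x86_64"},
    {"name": "ce-c", "runtime_environment": [], "platform": "i686"},
]

# Only resources that pass both conditions remain candidates.
matching = [r["name"] for r in resources if satisfies_requirements(r)]
print(matching)
```

Both conditions must hold, so a resource that advertises the software but not the platform (or the reverse) is excluded.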

SLIDE 12

RB Resource matching

  • To see the resources matching your JDL:

$> glite-job-list-match --rank sample.jdl

******************************************************************
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have been found:

*CEId*                                                    *Rank*
grid-data.man.ac.uk:2119/jobmanager-pbs-router            0
grid-data.rl.ac.uk:2119/jobmanager-lsf-normal             -92160
grid-compute.leeds.ac.uk:2119/jobmanager-pbs-router       -428703
grid-compute.oesc.ox.ac.uk:2119/jobmanager-pbs-router     -4036455
******************************************************************
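
The broker's choice among matching CEs can be sketched as picking the entry with the highest rank. The snippet below uses the CE IDs and ranks from the listing above; the function name `best_ce` is hypothetical, and ties or any randomisation the real broker may apply are ignored.

```python
# CE IDs and ranks as shown in the list-match output.
matches = [
    ("grid-data.man.ac.uk:2119/jobmanager-pbs-router", 0),
    ("grid-data.rl.ac.uk:2119/jobmanager-lsf-normal", -92160),
    ("grid-compute.leeds.ac.uk:2119/jobmanager-pbs-router", -428703),
    ("grid-compute.oesc.ox.ac.uk:2119/jobmanager-pbs-router", -4036455),
]


def best_ce(candidates):
    """Return the CE ID with the highest rank."""
    ce_id, _rank = max(candidates, key=lambda pair: pair[1])
    return ce_id


print(best_ce(matches))
```

With these ranks, the entry with rank 0 is selected, since every other rank is negative.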

SLIDE 13

gLite-speak glossary

– WMS-LB ≈ Workload Management System and Logging and Bookkeeping System
– CE (Computing Element) ≈ Queue on NGS node
– SE (Storage Element) ≈ No equivalent yet on NGS
– UI (User Interface) ≈ Machine with client tools installed
– Information Service provides up-to-date status of resources (BDII)

SLIDE 14

Future work

  • MPI jobs support
  • Enable all NGS nodes to be RB-compliant
  • Make NGS Portal RB aware
  • Upgrade to gLite 3.1? WM-Proxy?? (parameter sweeps, collective jobs)
  • Standardize job execution across NGS - roll out NGS UEE (Uniform Execution Environment)

  • Roll out more User Interfaces across UK?

SLIDE 15

Summary

  • RB – smart & easy way of submitting jobs across resources
  • User interacts with RB by Portal or UI
  • RB chooses best resource for your job
  • Great for running existing applications, not ideal for developing your own applications (less interaction for debugging etc.)

SLIDE 16

Practical

  • Use GSI-SSH to connect to ngsui01.ngs.rl.ac.uk
  • Follow link in the agenda page (or jump to http://wiki.ngs.ac.uk/index.php?title=Resource_Broker_Tutorial )
