Grid Computing: NetSolve and the GrADS Project



SLIDE 1
Grid Computing: NetSolve and the GrADS Project

SLIDE 2
Innovative Computing Laboratory

Outline

SLIDE 3
Grid Computing

Why Grids?

SLIDE 4
Technology Trends: Microprocessor Capacity

2X transistors per chip every 1.5 years, a trend known as "Moore's Law."

Microprocessors have become smaller, denser, and more powerful. And it is not just processors: bandwidth, storage, and related technologies are improving as well.

Gordon Moore (co-founder of Intel) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months.

Network Bandwidth Growth

SLIDE 5
Bandwidth Won't Be A Problem Soon

Bisection Bandwidth (BB) Across the US

Grid Possibilities

SLIDE 6
Some Grid Usage Models

Grid Usage Models

SLIDE 7
Example Application Projects

Some Grid Requirements – Systems/Deployment Perspective

SLIDE 8
Some Grid Requirements – User Perspective

The Systems Challenges: Resource Sharing Mechanisms That…

SLIDE 9
The Security Problem

The Resource Management Problem

SLIDE 10
Grid Systems Technologies

The Programming Problem

SLIDE 11
Examples of Grid Programming Technologies

MPICH-G2: A Grid-Enabled MPI
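As a minimal illustration of a grid-enabled MPI, an ordinary MPI program such as the one below runs unchanged under MPICH-G2; the cross-site process startup is handled by Globus services outside the program. The code is a standard MPI sketch, not taken from the slides.

    /* A minimal sketch: an ordinary MPI program that could run unchanged
     * under a grid-enabled MPI such as MPICH-G2 (process placement across
     * sites is handled outside the program, e.g. by the Globus job startup). */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* who am I?       */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many of us? */
        printf("process %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }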

SLIDE 12
Grid Events

Useful References

SLIDE 13
Emergence of Grids

Grids Are Inevitable

SLIDE 14
Motivation for Grid Computing

In the past: isolation. Today: collaboration.

What is Grid Computing?

(Figure: imaging instruments, data acquisition and analysis, computational resources, large-scale databases, and advanced visualization linked together.)

SLIDE 15
The Grid Architecture Picture

(Figure: resources connected by high speed networks and routers.)

Globus Grid Services

SLIDE 16
Evolution of a Community Grid Model

(Figure: layered model with Grid Resources at the bottom, a Common Infrastructure layer (NMI, GGF standards, OGSA, etc.), user-focused grid middleware, tools, and services, and Applications on top.)

Maturation of Grid Computing

SLIDE 17
The Computational Grid is…

Computational Grids and Electric Power Grids

SLIDE 18
An Emerging Grid Community

Grids are Hot

SLIDE 19
Broad Acceptance of Grids as a Critical Platform for Computing

NSF's Cyberinfrastructure, NASA's Information Power Grid, DOE's Science Grid

On August 2, 2001, IBM announced a new corporate initiative to support and exploit Grid computing. AP reported that IBM was investing $4 billion into building 50 computer server farms around the world.

AVAKI

SLIDE 20
Grids Form the Basis of a National Information Infrastructure

August 9, 2001: NSF awarded $53,000,000 to SDSC/NPACI and NCSA/Alliance for TeraGrid.

SLIDE 21
"Grids Meet Peer-to-Peer"

Peer to Peer Computing

SLIDE 22
Internet On Everything

Distributed Computing

SLIDE 23
Examples of Distributed Computing

SETI@home: Global Distributed Computing

SLIDE 24
SETI@home

Grid Computing - from ET to Anthrax

SLIDE 25
Distributed and Parallel Systems

(Figure: a spectrum from heterogeneous distributed systems to homogeneous massively parallel systems.)

(Along the spectrum: grid-based computing such as SETI@home and Entropia/UD; networks of workstations; Beowulf clusters; clusters with special interconnect; massively parallel, distributed-memory machines such as the ASCI Tflop/s systems and the Earth Simulator.)

SLIDE 26
Motivation for NetSolve

NetSolve Network Enabled Server

SLIDE 27
NetSolve: The Big Picture

(Figure: a client (Matlab, Octave, Scilab, Mathematica, C, Fortran) talks to the NetSolve agent(s), which keep a schedule database for servers S1-S4; an IBP depot holds data. No knowledge of the grid is required; the interaction is RPC-like. The client first places its input data A, B in the IBP depot.)

SLIDE 28
NetSolve: The Big Picture (cont'd)

(Figure: the agent hands a server choice back to the client; the client sends the operation and handle to the chosen server S2; S2 computes Op(C, A, B) and the answer C comes back. No knowledge of the grid is required; the interaction is RPC-like.)
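From the client side the whole interaction is a single blocking, RPC-style call. A minimal C sketch follows, assuming a dense linear-solve service named dgesv (the problem description shown later) and an argument list chosen for illustration; the exact arguments of netsl() for that problem are an assumption, not taken from the slides.

    /* Sketch of the RPC-style client view of NetSolve from C.
     * netsl() appears in these slides; the argument list used here
     * for the "dgesv" problem is assumed for illustration only. */
    #include <stdio.h>

    int netsl(const char *problem, ...);   /* assumed prototype */

    int main(void)
    {
        int n = 1000;
        static double A[1000 * 1000];   /* dense matrix                  */
        static double b[1000];          /* right-hand side               */
        static double x[1000];          /* solution returned by the grid */

        /* ... fill A and b ... */

        /* Blocking request: the agent picks a server, the data are shipped,
           the server solves Ax = b, and the answer comes back in x. */
        if (netsl("dgesv()", n, A, b, x) < 0)
            fprintf(stderr, "NetSolve request failed\n");
        return 0;
    }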

SLIDE 29
Hiding the Parallel Processing

Basic Usage Scenarios

SLIDE 30
NetSolve Agent

SLIDE 31
NetSolve Client

In Matlab:

    A = netsolve('matmul', B, C);

Possible parallelisms are hidden from the user.

SLIDE 32
NetSolve Client (cont'd)

Generating New Services in NetSolve

A problem description file (written by hand or through a Java GUI) is processed by the NetSolve parser/compiler, and the new service is added to a server. For example:

    @PROBLEM dgesv
    @DESCRIPTION
    This is a linear solver for dense matrices from the LAPACK library. Solves Ax = b.
    @INPUT 2
    @OBJECT MATRIX DOUBLE A Double precision matrix
    @OBJECT VECTOR DOUBLE b Right hand side
    @OUTPUT 1
    @OBJECT VECTOR DOUBLE x …

New service added!

SLIDE 33
Task Farming: Multiple Requests To a Single Problem
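As a hedged sketch of the task-farming idea (one problem applied to many independent inputs), the loop below issues the requests with the blocking netsl() call used elsewhere in these slides. NetSolve's real farming interface hands the whole batch over in a single call and schedules the requests across servers concurrently; its exact function name and signature are not shown here, so this sketch only illustrates the pattern, and the problem name is a placeholder.

    /* Task farming sketch: the same problem applied to many independent
     * inputs, issued here one at a time with the blocking netsl() call. */
    #include <stdio.h>

    int netsl(const char *problem, ...);   /* assumed prototype */

    #define NTASKS 200

    int main(void)
    {
        static double in[NTASKS][64];    /* one input block per task  */
        static double out[NTASKS][64];   /* one result block per task */

        /* ... fill in[i] for each task ... */

        for (int i = 0; i < NTASKS; i++) {
            /* "filter()" is a placeholder problem name, not from the slides */
            if (netsl("filter()", 64, in[i], out[i]) < 0)
                fprintf(stderr, "task %d failed\n", i);
        }
        return 0;
    }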

Data Persistence

SLIDE 34
Data Persistence (cont'd)

Without sequencing, each call moves its inputs and result between client and server:

    netsl("command1", A, B, C);   /* command1(A, B) -> result C */
    netsl("command2", A, C, D);   /* command2(A, C) -> result D */
    netsl("command3", D, E, F);   /* command3(D, E) -> result F */

With sequencing, the chain runs as one server-side sequence(A, B, E); only the inputs A and E and the final result F involve the client, while the intermediate outputs C and D stay on the server side:

    netsl_begin_sequence( );
    netsl("command1", A, B, C);
    netsl("command2", A, C, D);
    netsl("command3", D, E, F);
    netsl_end_sequence(C, D);

NPACI Alpha Project - MCell: 3-D Monte-Carlo Simulation of Neurotransmitter Release Between Cells

UCSD (F. Berman, H. Casanova, M. Ellisman), Salk Institute (T. Bartol), CMU (J. Stiles), UTK (Dongarra, M. Miller, R. Wolski)

Study how neurotransmitters diffuse and activate receptors in synapses (blue: unbound; red: singly bound; green: doubly bound, closed; yellow: doubly bound, open).

SLIDE 35

(Figure: a Web interface and Web server act as a NetSolve client, dispatching requests to IPARS-enabled servers.)
SLIDE 36
NetSolve and SCIRun

SCIRun torso defibrillator application – Chris Johnson, U of Utah

University of Tennessee Deployment: Scalable Intracampus Research Grid (SInRG)

The Knoxville campus has two DS-3 commodity Internet connections and one DS-3 Internet2/Abilene connection. An OC-3 ATM link routes IP traffic between the Knoxville campus, the National Transportation Research Center, and Oak Ridge National Laboratory. UT participates in several national networking initiatives including Internet2 (I2), Abilene, the federal Next Generation Internet (NGI) initiative, Southern Universities Research Association (SURA) Regional Information Infrastructure (RII), and Southern Crossroads (SoX). The UT campus network consists of a meshed ATM OC-12 that is being migrated to switched Gigabit Ethernet by early 2002.

SLIDE 37
Resources: Grid Service Cluster

(Figure: grid service cluster network diagram, with links at 10 Mbps, 100 Mbps, and 155 Mbps.)

SLIDE 38
SInRG
UTK - SInRG
SLIDE 39
The Internet Backplane Protocol (IBP)

♦ Network middleware which makes distributed network storage available as a flexibly allocated resource.
♦ Storage buffers exposed to the network.
♦ A simple mechanism for experimenting with allocation and scheduling.
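As a concrete illustration of storage buffers exposed to the network, the sketch below allocates a remote byte array, writes into it, and reads it back. The function and type names (ibp_style_allocate, ibp_style_store, ibp_style_load, capability) and the depot host are hypothetical, chosen for illustration; they are not the actual IBP client API.

    /* Hypothetical sketch of an IBP-style interaction: allocate a byte-array
     * buffer on a remote depot, write into it, read from it later.
     * All names below are illustrative assumptions, not the real IBP API. */
    #include <stddef.h>

    typedef struct capability capability;   /* opaque handle to a remote byte array */

    capability *ibp_style_allocate(const char *depot_host, size_t bytes, int lifetime_sec);
    int         ibp_style_store(capability *cap, const void *data, size_t bytes);
    int         ibp_style_load(capability *cap, void *data, size_t bytes);

    int stage_data(const double *A, size_t n, double *copy_back)
    {
        /* Ask a depot for a time-limited allocation of n doubles. */
        capability *cap = ibp_style_allocate("depot.example.edu", n * sizeof *A, 3600);
        if (!cap)
            return -1;

        /* Push the bytes now; whoever holds the capability can pull them later. */
        if (ibp_style_store(cap, A, n * sizeof *A) < 0)
            return -1;

        return ibp_style_load(cap, copy_back, n * sizeof(double));
    }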

IBP's Unit of Storage

SLIDE 40
IBP Servers

Strategy #1: Keep data close to the sender (lazy transmission)

(Figure: sender, IBP buffer near the sender, network, receiver.)

SLIDE 41
Strategy #2: Place data close to the receiver

(Figure: sender, network, IBP buffer near the receiver, receiver.)

Strategy #3: Utilize transient storage throughout

(Figure: IBP buffers along the path between sender and receiver.)

SLIDE 42
Replicated Services

NetSolve: The Big Picture (complete)

(Figure: a client (Matlab, Mathematica, C, Fortran, Java, Excel) places A, B in the IBP depot, asks the agent(s) (schedule database) for a server, receives a handle for S2, sends the operation and handle, and gets back the answer C of Op(C, A, B). No knowledge of the grid required; RPC-like.)

SLIDE 43
State Management in NetSolve

For example:

    X = F(A, B);
    Y = G(X, B);

Two Logistical Scheduling Strategies

(Figure: in the baseline case the client ships A, B to Server 1 for F, receives X, then ships X, B to Server 2 for G and receives Y. With the two logistical strategies, caching and dependence flow, the intermediate X is kept in an IBP cache or forwarded between the servers, so only the final result Y returns to the client.)
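Using the sequencing calls shown on the Data Persistence slide, the dependent pair X = F(A, B); Y = G(X, B) can be grouped so that the intermediate X does not travel back to the client between requests. A minimal sketch, with the problem names taken from the example above and the argument order assumed:

    /* Sketch: group the dependent calls so the intermediate X can stay
     * server-side (or in the IBP cache) instead of returning to the client.
     * Prototypes and argument lists are assumptions for illustration. */
    int  netsl(const char *problem, ...);
    void netsl_begin_sequence(void);
    void netsl_end_sequence(void *intermediate, ...);

    void solve_pair(double *A, double *B, double *X, double *Y, int n)
    {
        netsl_begin_sequence();
        netsl("F()", n, A, B, X);      /* X = F(A, B) */
        netsl("G()", n, X, B, Y);      /* Y = G(X, B) */
        netsl_end_sequence(X);         /* X is only intermediate; only Y returns */
    }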

SLIDE 44
Stage Data Close to Server

NetSolve - Things Not Touched On

SLIDE 45
NSF/NGS GrADS - GrADSoft Architecture

(Figure: software components include the Source Application, a Whole-Program Compiler with Libraries, a Binder producing a Configurable Object Program, a Scheduler / Resource Negotiator, the Grid Runtime System, and a Real-time Performance Monitor that reports performance problems; the components exchange performance feedback and negotiation.)

SLIDE 46
ScaLAPACK

To Use ScaLAPACK a User Must:
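As a hedged sketch of the usual steps performed by hand (start the BLACS process grid, describe the 2-D block-cyclic distribution with an array descriptor, allocate and fill the local pieces, then call the parallel driver), a call from C might look roughly like this. The routine names are standard ScaLAPACK/BLACS entry points, but the surrounding details are assumptions, not taken from these slides.

    /* Hedged sketch of manual ScaLAPACK usage: BLACS grid, descriptors,
     * local storage, then PDGESV for Ax = b.  Prototypes are declared
     * locally because header names vary between installations. */
    #include <stdlib.h>

    extern void Cblacs_pinfo(int *mypnum, int *nprocs);
    extern void Cblacs_get(int icontxt, int what, int *val);
    extern void Cblacs_gridinit(int *icontxt, char *order, int nprow, int npcol);
    extern void Cblacs_gridinfo(int icontxt, int *nprow, int *npcol, int *myrow, int *mycol);
    extern void Cblacs_gridexit(int icontxt);
    extern int  numroc_(int *n, int *nb, int *iproc, int *isrcproc, int *nprocs);
    extern void descinit_(int *desc, int *m, int *n, int *mb, int *nb, int *irsrc,
                          int *icsrc, int *ictxt, int *lld, int *info);
    extern void pdgesv_(int *n, int *nrhs, double *a, int *ia, int *ja, int *desca,
                        int *ipiv, double *b, int *ib, int *jb, int *descb, int *info);

    int main(void)
    {
        int n = 8000, nb = 64, nrhs = 1;      /* problem size, block size */
        int nprow = 2, npcol = 2;             /* 2 x 2 process grid       */
        int izero = 0, ione = 1, info, ictxt;
        int me, nprocs, myrow, mycol;

        /* 1. Set up the BLACS process grid. */
        Cblacs_pinfo(&me, &nprocs);
        Cblacs_get(-1, 0, &ictxt);
        Cblacs_gridinit(&ictxt, "Row", nprow, npcol);
        Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);

        /* 2. Work out this process's share of the 2-D block-cyclic layout. */
        int locr = numroc_(&n, &nb, &myrow, &izero, &nprow);
        int locc = numroc_(&n, &nb, &mycol, &izero, &npcol);
        int lld  = locr > 1 ? locr : 1;

        int descA[9], descB[9];
        descinit_(descA, &n, &n,    &nb, &nb, &izero, &izero, &ictxt, &lld, &info);
        descinit_(descB, &n, &nrhs, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);

        double *A    = malloc((size_t)locr * locc * sizeof *A);
        double *B    = malloc((size_t)locr * nrhs * sizeof *B);
        int    *ipiv = malloc((size_t)(locr + nb) * sizeof *ipiv);

        /* ... fill the local pieces of A and B (block-cyclic mapping) ... */

        /* 3. Call the parallel driver. */
        pdgesv_(&n, &nrhs, A, &ione, &ione, descA, ipiv, B, &ione, &ione, descB, &info);

        Cblacs_gridexit(ictxt);
        free(A); free(B); free(ipiv);
        return info;
    }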

SLIDE 47
ScaLAPACK Grid Enabled

GrADS Numerical Library
SLIDE 48
Big Picture…

The user has a problem to solve (e.g. Ax = b). The natural data (A, b) goes to the middleware, which hands structured data (A', b') to an application library (e.g. LAPACK, ScaLAPACK, PETSc, …); the structured answer (x') comes back through the middleware as the natural answer (x).

GrADS Library Sequence

SLIDE 49
Resource Selector

Performance Model

SLIDE 50
Contract Development

Application Launcher

SLIDE 51
Resource Selector Input

ScaLAPACK Performance Model

    T(n, p) = C_f * t_f + C_v * t_v + C_m * t_m

    C_f = 2 n^3 / (3 p)
    C_v = (3 + (1/4) log p) * n^2 / sqrt(p)
    C_m = (6 + log p) * n

where t_f is the time per floating-point operation, t_v the time per data item communicated, and t_m the time per message.
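A small sketch that evaluates the reconstructed model. The logarithm is taken base 2 here, and the machine parameters t_f, t_v, t_m are illustrative placeholders (in practice they would come from measurements such as NWS data), not values from the slides.

    /* Evaluate T(n,p) = C_f*t_f + C_v*t_v + C_m*t_m for the LU model above. */
    #include <math.h>
    #include <stdio.h>

    double model_time(double n, double p, double tf, double tv, double tm)
    {
        double cf = 2.0 * n * n * n / (3.0 * p);                /* flops         */
        double cv = (3.0 + 0.25 * log2(p)) * n * n / sqrt(p);   /* words moved   */
        double cm = (6.0 + log2(p)) * n;                        /* messages sent */
        return cf * tf + cv * tv + cm * tm;
    }

    int main(void)
    {
        /* Illustrative parameters only: 300 Mflop/s per node, 10 MB/s links,
           1 ms latency; matrix of order 10000 on 16 processors. */
        double tf = 1.0 / 300e6, tv = 8.0 / 10e6, tm = 1e-3;
        printf("predicted time: %.1f s\n", model_time(10000, 16, tf, tv, tm));
        return 0;
    }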

SLIDE 52
Resource Selector/Performance Modeler

(Figure: a performance model, supplied by the library writer, is combined by an optimizer with coarse-grid resource information from MDS and NWS to select machines.)
Performance Model Validation

Speed = 60% of the peak.

Machine characteristics:

             Opus14   Opus13   Opus16   Opus15   Torc4   Torc6   Torc7
mem (MB)     215      214      227      215      233     479     479
speed        270      270      270      270      330     330     330
load         1        0.99     1        0.99     1       1.04    0.87

Bandwidth in Mb/s:

Bandwidth    Opus14   Opus13   Opus16   Opus15   Torc4   Torc6   Torc7
Opus14       -        248.83   247.31   246.38   2.83    2.83    2.83
Opus13       248.83   -        244.54   240.94   2.83    2.83    2.83
Opus16       247.31   244.54   -        247.54   2.83    2.83    2.83
Opus15       246.38   240.94   247.54   -        2.83    2.83    2.83
Torc4        2.83     2.83     2.83     2.83     -       81.96   56.47
Torc6        2.83     2.83     2.83     2.83     81.96   -       50.9
Torc7        2.83     2.83     2.83     2.83     56.47   50.9    -

Latency in msec:

Latency      Opus14   Opus13   Opus16   Opus15   Torc4   Torc6   Torc7
Opus14       -        0.24     0.29     0.26     83.78   83.78   83.78
Opus13       0.24     -        0.24     0.23     83.78   83.78   83.78
Opus16       0.29     0.24     -        0.23     83.78   83.78   83.78
Opus15       0.26     0.23     0.23     -        83.78   83.78   83.78
Torc4        83.78    83.78    83.78    83.78    -       0.31    0.31
Torc6        83.78    83.78    83.78    83.78    0.31    -       0.31
Torc7        83.78    83.78    83.78    83.78    0.31    0.31    -

This is for a refined grid.

SLIDE 53
Experimental Hardware / Software Grid

             TORC                           CYPHER                          OPUS
Type         Cluster, 8 Dual Pentium III    Cluster, 16 Dual Pentium III    Cluster, 8 Pentium II
OS           Red Hat Linux 2.2.15 SMP       Debian Linux 2.2.17 SMP         Red Hat Linux 2.2.16
Memory       512 MB                         512 MB                          128 or 256 MB
CPU speed    550 MHz                        500 MHz                         265 – 448 MHz
Network      Fast Ethernet (100 Mbit/s,     Gigabit Ethernet (SK-9843)      Myrinet (LANai 4.3)
             3Com 3C905B) with a 16-port    with a 24-port switch           with 16 ports each
             switch (BayStack 350T)         (Foundry FastIron II)

MacroGrid Testbed

Independent components being put together and interacting

PDGESV Time Breakdown

ScaLAPACK PDGESV, using a collapsed NWS query from UCSB; 42 machines available, using mainly the torc, cypher, and msc clusters at UTK [Jan 2002]. Times in seconds by matrix size and processor count:

Matrix size - nproc   5000-10   10000-12   15000-14   20000-14   25000-18   30000-18   35000-27
PDGESV                58.7      394.7      749.4      1686.7     2747.1     4472.7     5020.4
spawn                 92.2      105.9      154.1      124.7      135.6      181.0      264.5
NWS                   5.5       7.4        12.3       12.0       4.2        10.2       8.5
MDS                   13.0      11.0       10.0       11.0       14.0       73.0       12.0
Other                 4.7       5.3        1.0        2.3        7.6        1.1        4.7

SLIDE 54
ScaLAPACK across 3 Clusters

(Figure: time in seconds vs. matrix size 5000–20000 for runs on OPUS alone (5, 6, or 8 OPUS nodes), OPUS + CYPHER (8 OPUS + 6 CYPHER; 6 OPUS + 5 CYPHER), and OPUS + TORC + CYPHER combinations (8 OPUS + 2 TORC + 6 CYPHER; 2 OPUS + 4 TORC + 6 CYPHER; 8 OPUS + 4 TORC + 4 CYPHER).)

Largest Problem Solved

SLIDE 55
QR – Timing Breakdown

(Figure: execution of QR factorization over the grid; time in seconds vs. matrix size 4000–15000, broken down into application execution, startup time, performance modeling, NWS, and MDS; the runs used from 3 torcs + 3 mscs up to 4 torcs + 7 mscs.)

PDSYEVX – Timing Breakdown

(Figure: PDSYEVX on torcs and cyphers; time in seconds by matrix size and processor count (1000-1 through 10000-10), broken down into tridiagonal reduction, compute eigenvalues, compute eigenvectors, back transformation, pdsyevx driver overhead, MDS, NWS, performance-modeling time, and other grid overhead. The smaller runs use torcs only; the largest uses 5 torcs and 5 cyphers.)

SLIDE 56
Adhoc vs. Annealing Scheduling

(Figure: estimated execution time for PDGESV using the adhoc scheduler and the annealing scheduler, given 57 possible hosts (all GrADS x86 machines), as a function of matrix size N and the number of processors each scheduler selected.)

Rescheduling/Redistribution Experimental Results

(Figure: time in seconds for two runs without rescheduling: N = 13600 on 4 cyphers, and N = 13600 on 4 cyphers + 4 torcs.)

Scenario: start the application on 4 processors. Four minutes into the run, 4 additional processors become available. We want to reorganize the computation to take advantage of the extra computing capability. What's the additional cost?

SLIDE 57
Rescheduling/Redistribution Experimental Results (cont'd)

(Figure: time in seconds, broken down into application execution, redistribution time, checkpoint read+write, restart time, start time, grid overhead, and NWS, for: N = 13600 on 4 cyphers with no rescheduling; N = 13600 on 4 cyphers + 4 torcs with no rescheduling; and, with rescheduling, App1: N = 13600 on 4 torcs and App2: N = 13600 on 4 torcs + 4 cyphers introduced 4 minutes into the run.)

About 12% better performance even with redistribution and restart.

Major Challenge - Adaptivity

SLIDE 58
Conclusion

Collaborators

SLIDE 59
Major Challenge - Adaptivity

Futures for Numerical Algorithms and Software on Clusters and Grids

SLIDE 60
Conclusion

Vinny's Bad Day