
Plug-in Scheduler Design for a Distributed Environment

Eddy Caron Andreea Chis Frederic Desprez Alan Su ANR-05-CIGC-11

Outline

1 Grid and Grid-RPC
2 Related Work
3 DIET Overview
4 DIET Scheduling
5 Experimentation
6 Conclusion and future work


Grid and Grid-RPC

Computational Grids

Sharing, selection, and aggregation of geographically distributed computational resources, presenting them as a single unified resource for solving large-scale compute- and data-intensive applications.

Grid Platforms

  • Heterogeneous computational resources
  • Irregular network topologies
  • Dynamic resource performance


Grid and Grid-RPC

Resource management: a crucial aspect for efficient Grid environments

Application Service Provider (ASP) model

  • Semi-transparent access to computing servers
  • Coarse granularity
  • Easy to use, even for non-experts
  • Close to the Remote Procedure Call (RPC) paradigm

Middleware to facilitate clients' access to remote resources

  • Network Enabled Servers (NES): Ninf, NetSolve, OmniRPC, DIET


The Grid-RPC Paradigm

  • Elaborated by the Global Grid Forum (now the OGF)
  • Standardizes the API of the RPC used by NES
  • Based on 5 entities:

      » Client: user's interface and request submission to servers
      » Server: receives clients' requests and executes software modules on their behalf
      » Database: static and dynamic information about hardware and software resources
      » Scheduler: receives requests and makes mapping decisions based on information from the database
      » Monitor: observes resource status and stores the information in the database
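The division of labour among the five entities can be illustrated with a small Python sketch. Every class and method name below is hypothetical and chosen only for illustration; this is not the GridRPC API nor the code of any NES, and the "least-loaded server" policy is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Static and dynamic information about resources (hypothetical layout)."""
    load: dict = field(default_factory=dict)  # server name -> last observed load


class Monitor:
    """Observes resource status and stores it in the database."""
    def __init__(self, database):
        self.database = database

    def report(self, server_name, load):
        self.database.load[server_name] = load


class Server:
    """Receives client requests and executes software modules on their behalf."""
    def __init__(self, name, services):
        self.name = name
        self.services = services  # service name -> callable

    def execute(self, service, *args):
        return self.services[service](*args)


class Scheduler:
    """Receives requests and maps them to a server using database information."""
    def __init__(self, database, servers):
        self.database = database
        self.servers = servers

    def choose(self, service):
        candidates = [s for s in self.servers if service in s.services]
        if not candidates:
            raise LookupError(f"no server offers {service!r}")
        # Assumed policy for this sketch: pick the least-loaded candidate.
        return min(candidates, key=lambda s: self.database.load.get(s.name, 0.0))


class Client:
    """User-side interface: asks the scheduler for a server, then calls it."""
    def __init__(self, scheduler):
        self.scheduler = scheduler

    def call(self, service, *args):
        server = self.scheduler.choose(service)
        return server.name, server.execute(service, *args)


db = Database()
monitor = Monitor(db)
servers = [Server("s1", {"add": lambda a, b: a + b}),
           Server("s2", {"add": lambda a, b: a + b})]
monitor.report("s1", 0.9)  # illustrative load values
monitor.report("s2", 0.2)
client = Client(Scheduler(db, servers))
chosen, result = client.call("add", 2, 3)
```

Here the scheduler never runs the computation itself: it only maps the request to a server, which is what lets the mapping policy be swapped without touching clients or servers.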


The Grid-RPC Paradigm

[Figure: interactions among Client, Agent, and Server, labelled Registering, Request, Identifier, Call, and Results.]


The Grid-RPC Paradigm

Agent (Registry or Resource Management System)

  • Central component of Grid-RPC systems
  • Chooses servers able to solve a request on behalf of clients
  • Main task: load balancing between servers
  • Gets information about available servers
  • Asks the performance database for information

Some scalability problems may occur

  • Unique scheduler
  • Unique resource management system
  • Centralized (or duplicated) in NetSolve or Ninf
  • Distributed in DIET


Related Work

Few middleware systems allow scheduling internals to be tuned for a specific application

  • APST: allows choosing the scheduling heuristic (Max-min, Min-min, X-sufferage)
  • GrADS & AppLeS: scheduling heuristics for different application classes
  • GrADS: application-specific performance models

      » Program Preparation System
      » Program Execution System
      » Binder

Recent work towards coping with dynamic platform performance at runtime

  • Condor: ClassAds language


DIET Overview

  • DIET stands for Distributed Interactive Engineering Toolbox
  • Hierarchical architecture for improved scalability
  • Implemented in CORBA, thus benefiting from standardized, stable services provided by freely available, high-performance CORBA implementations


DIET Overview

DIET Platform Architecture

  • Client: an application that uses the DIET infrastructure to solve problems
  • Servers (SeDs): perform computations on data sent by a client
  • Agents: facilitate the service location and invocation interactions of clients and SeDs

[Figure: DIET platform hierarchy, with Master Agents (MA) at the top, Local Agents (LA) below them, SeDs at the leaves, and a Client contacting an MA.]


DIET Overview

Progress of a DIET call

[Figure: a Client, an MA, an LA, and three SeDs.]

  • The client requests a service from a Master Agent (MA)
  • The MA propagates the client request through its subtrees
  • Each SeD responds with a profile and a performance estimation
  • LAs sort their children's responses and forward them up the hierarchy
  • The MA returns a list of candidate SeDs to the client
  • The client sends input data and the SeD launches the service
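The steps of a call can be sketched in a few lines of Python. This is a toy model under stated assumptions, not DIET code: the `SeD`, `Agent`, and `submit` names are hypothetical, each SeD's estimation is reduced to a single predicted time where lower is better, and one `Agent` class stands in for both LAs and the MA.

```python
class SeD:
    """Leaf of the hierarchy: answers requests for the services it offers."""
    def __init__(self, name, services, estimate):
        self.name = name
        self.services = set(services)
        self.estimate = estimate  # assumed: predicted time, lower is better

    def handle(self, service):
        # A SeD responds only if it offers the requested service.
        if service in self.services:
            return [(self.estimate, self.name)]
        return []


class Agent:
    """Stands for both an LA and an MA: forwards requests down, sorts replies."""
    def __init__(self, children):
        self.children = children

    def handle(self, service):
        responses = []
        for child in self.children:
            responses.extend(child.handle(service))
        # Sort the children's responses before forwarding them up.
        return sorted(responses)


def submit(master_agent, service):
    """Client side: obtain the candidate SeDs from the MA, best first."""
    return [name for _, name in master_agent.handle(service)]


la1 = Agent([SeD("sed1", ["dgemm"], 4.0), SeD("sed2", ["dgemm"], 1.5)])
la2 = Agent([SeD("sed3", ["fft"], 0.5), SeD("sed4", ["dgemm"], 2.0)])
ma = Agent([la1, la2])
candidates = submit(ma, "dgemm")
```

The client would then send its input data directly to the first SeD in `candidates`; the agents take part only in locating and ranking servers.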


DIET Scheduling

  • First version: FIFO principle
  • Mono-criteria scheduling based on application-specific performance predictions
  • Round-robin scheduling scheme
  • Plug-in schedulers


DIET Scheduling

SeD level

Performance estimation vector: a dynamic collection of performance estimation values (modular design)

  • Performance measures available through DIET

      » FAST-NWS performance metrics
      » Time elapsed since the last execution
      » CoRI-Easy

  • Developer-defined values

Performance Estimation Function

Standard estimation tags for accessing the fields of a performance estimation vector:

  » EST_FREEMEM
  » EST_TCOMP
  » EST_TIMESINCELASTSOLVE
  » EST_FREECPU


DIET Scheduling

Standard Estimation Tags

Information tag        Multi-value   Explanation
(starts with EST_)
TCOMP                                the predicted time to solve a problem
TIMESINCELASTSOLVE                   time since last execution start (sec)
FREECPU                              amount of free CPU (between 0 and 1)
LOADAVG                              CPU load average
FREEMEM                              amount of free memory
NBCPU                                number of available processors
CPUSPEED               x             frequency of CPUs (MHz)
TOTALMEM                             total memory size (Mb)
BOGOMIPS               x             the BogoMips
CACHECPU               x             cache size of CPUs (Kb)
TOTALSIZEDISK                        size of the partition (Mb)
FREESIZEDISK                         amount of free space on the partition (Mb)
DISKACCESSREAD                       average time to read from disk (Mb/s)
DISKACCESSWRITE                      average time to write to disk (Mb/s)
ALLINFOS               x             [empty] fill all possible fields
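To make the idea of a tagged, partly multi-valued estimation vector concrete, here is a minimal Python sketch. The `EstimationVector` class and its `set`/`get` methods are hypothetical and are not DIET's actual interface; only the tag names come from the slides. The sketch assumes that a multi-value tag holds one entry per CPU and leaves EST_ALLINFOS out.

```python
# Per-CPU tags marked as multi-value in the slides (EST_ALLINFOS is omitted here).
MULTI_VALUE_TAGS = {"EST_CPUSPEED", "EST_BOGOMIPS", "EST_CACHECPU"}


class EstimationVector:
    """Hypothetical container: maps EST_ tags to one value or a list of values."""

    def __init__(self):
        self._fields = {}

    def set(self, tag, value, index=0):
        if tag in MULTI_VALUE_TAGS:
            values = self._fields.setdefault(tag, [])
            # Grow the list so that `index` (e.g. a CPU number) is addressable.
            values.extend([None] * (index + 1 - len(values)))
            values[index] = value
        else:
            self._fields[tag] = value

    def get(self, tag, index=0, default=None):
        if tag not in self._fields:
            return default
        if tag in MULTI_VALUE_TAGS:
            values = self._fields[tag]
            return values[index] if index < len(values) else default
        return self._fields[tag]


vec = EstimationVector()
vec.set("EST_FREECPU", 0.75)              # illustrative values, not measurements
vec.set("EST_BOGOMIPS", 4800.0, index=0)
vec.set("EST_BOGOMIPS", 4790.0, index=1)
```

Because fields are looked up by tag and may be absent, a scheduler reading such a vector has to decide what to do when a value is missing; the `default` argument makes that choice explicit.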


DIET Scheduling

Aggregation Methods

  • Define how SeD responses are sorted: associated with the service and defined at the SeD level
  • Tunable comparison/aggregation routines for scheduling

Priority Scheduler

  • Performs pairwise server estimation comparisons, returning a sorted list of server responses
  • Can minimize or maximize based on SeD estimations, taking into consideration the order in which the requests for those performance estimations were specified at the SeD level
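A priority scheduler of this kind can be sketched as an ordered list of (tag, direction) criteria driving a pairwise comparison. The Python below is an illustration under assumptions, not DIET's aggregator code: the function names, the dictionary representation of a SeD response, and the rule that a missing value ranks last are all hypothetical.

```python
from functools import cmp_to_key

# Ordered criteria, as a service might declare them. The tags come from the
# slides; this particular list is an assumed example.
PRIORITIES = [("EST_FREECPU", "max"), ("EST_TCOMP", "min")]


def compare(a, b, priorities=PRIORITIES):
    """Pairwise comparison of two SeD estimation dicts.

    Returns a negative number if `a` should be ranked before `b`.
    Criteria are tried in declaration order; later ones only break ties.
    """
    for tag, direction in priorities:
        va, vb = a.get(tag), b.get(tag)
        if va == vb:
            continue  # equal (or both missing): try the next criterion
        if va is None:
            return 1  # assumed rule: a missing value ranks last
        if vb is None:
            return -1
        better_a = va > vb if direction == "max" else va < vb
        return -1 if better_a else 1
    return 0


def sort_responses(responses, priorities=PRIORITIES):
    """Return the responses sorted from most to least preferred."""
    return sorted(responses, key=cmp_to_key(lambda a, b: compare(a, b, priorities)))


responses = [
    {"name": "sed1", "EST_FREECPU": 0.5, "EST_TCOMP": 10.0},
    {"name": "sed2", "EST_FREECPU": 0.9, "EST_TCOMP": 30.0},
    {"name": "sed3", "EST_FREECPU": 0.5, "EST_TCOMP": 5.0},
]
ranked = [r["name"] for r in sort_responses(responses)]
```

In this example `sed2` wins on the first criterion despite its long predicted time, and `EST_TCOMP` only separates `sed3` from `sed1`, which shows why the declaration order of the criteria matters.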


DIET Scheduling

Collector of Resource Information (CoRI)

  • Platform performance subsystem
  • Enables easy interfacing with third-party performance monitoring and prediction tools

Aims

  • Provide basic measurements that are available regardless of the state of the system
  • Manage the simultaneous use of different performance prediction systems within a single heterogeneous platform

  • CoRI-Easy module
  • CoRI Manager


DIET Scheduling: CoRI Manager

  • Access to different collectors
  • Modular design
  • Great deal of extensibility

[Figure: the CoRI Manager connected to the CoRI-Easy collector, the FAST collector, and other collectors such as Ganglia.]
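The manager/collector split can be pictured as a registry of interchangeable collectors queried through one entry point. The Python sketch below is hypothetical throughout: the class names, the `collect` method, and the "first collector that answers wins" rule are assumptions for illustration and do not describe the real CoRI interfaces.

```python
import os


class EasyLikeCollector:
    """Stand-in for a basic collector; uses only the Python standard library."""

    def collect(self, tag):
        if tag == "EST_NBCPU":
            return os.cpu_count()  # may be None if it cannot be determined
        return None  # this collector cannot provide the tag


class StaticCollector:
    """Stand-in for a third-party tool: returns values supplied in advance."""

    def __init__(self, values):
        self.values = values

    def collect(self, tag):
        return self.values.get(tag)


class CollectorManager:
    """Asks registered collectors in order and returns the first answer."""

    def __init__(self):
        self.collectors = []

    def register(self, collector):
        self.collectors.append(collector)

    def collect(self, tag):
        for collector in self.collectors:
            value = collector.collect(tag)
            if value is not None:
                return value
        return None


manager = CollectorManager()
manager.register(EasyLikeCollector())
manager.register(StaticCollector({"EST_TCOMP": 12.5}))  # illustrative value
predicted = manager.collect("EST_TCOMP")
```

Adding support for another monitoring tool then means writing one more class with a `collect` method and registering it, without changing the code that asks for measurements.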


CoRI collector - FAST

[Martin Quinson. PhD thesis. 2003.]


CoRI collector: CoRI-Easy

  • Resource collector that provides basic performance measurements of the SeD
  • Extensible, like the CoRI Manager
  • Information available

      » CPU evaluation (number, frequency, cache size, BogoMips, load average, utilization)
      » Memory capacity (total size, available)
      » Disk performance and capacity (read-write speed, capacity, free capacity)


Experimentation

[Figure: experimental DIET hierarchy with MA, LA, and SeD nodes.]

Toy platform (ENS-Lyon, France):

  • 7 servers: P4 2.4 GHz, 256 MB memory
  • 2 servers: Intel P4 Xeon 2.4 GHz, 1 GB memory


CPU Experimentation

  • CPU Scheduler: priority scheduler that maximizes the ratio BOGOMIPS / (1 + load average)
  • RR Scheduler (Round Robin): priority scheduler that maximizes the time elapsed since the last execution start
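The two policies compared in the experiments differ only in the quantity they maximize, which a short Python sketch makes explicit. It assumes the CPU ratio is BOGOMIPS / (1 + load average); the function names, server names, and numeric values are invented for illustration and are not measurements from the experiments.

```python
def cpu_score(est):
    """Ratio maximized by the CPU scheduler: BOGOMIPS / (1 + load average)."""
    return est["EST_BOGOMIPS"] / (1.0 + est["EST_LOADAVG"])


def rr_score(est):
    """Value maximized by the RR scheduler: time since the last execution start."""
    return est["EST_TIMESINCELASTSOLVE"]


def pick(servers, score):
    """Return the name of the server with the highest score."""
    return max(servers, key=lambda name: score(servers[name]))


# Illustrative estimation values only.
servers = {
    "fast_busy": {"EST_BOGOMIPS": 4800.0, "EST_LOADAVG": 3.0,
                  "EST_TIMESINCELASTSOLVE": 5.0},
    "slow_idle": {"EST_BOGOMIPS": 3000.0, "EST_LOADAVG": 0.0,
                  "EST_TIMESINCELASTSOLVE": 2.0},
}
cpu_choice = pick(servers, cpu_score)
rr_choice = pick(servers, rr_score)
```

With these values the CPU policy prefers the idle machine even though it is nominally slower, while the round-robin policy picks whichever server has waited longest, regardless of its load.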


The CPU Scheduler

[Figure: CPU scheduler, request inter-arrival time of 5 seconds.]

[Figure: Round Robin scheduler, request inter-arrival time of 5 seconds.]

[Figure: CPU scheduler, request inter-arrival time of 10 seconds.]

[Figure: Round Robin scheduler, request inter-arrival time of 10 seconds.]

[Figure: CPU scheduler, request inter-arrival time of 1 minute.]

[Figure: RR scheduler, request inter-arrival time of 1 minute.]

[Figure: CPU vs RR scheduler, total computation time, request inter-arrival time of 1 minute.]

[Figure: CPU scheduler, request inter-arrival time of 1 minute, average task time ~2 minutes.]

[Figure: CPU scheduler, request inter-arrival time of 1 minute, average task time ~3 minutes.]


I/O Experimentation

  • I/O Scheduler: priority scheduler that maximizes the disk write speed
  • RR Scheduler (Round Robin): priority scheduler that maximizes the time elapsed since the last execution start


[Figure: I/O scheduler, request inter-arrival time of 25 seconds.]

[Figure: RR scheduler, request inter-arrival time of 25 seconds.]

[Figure: RR scheduler, request inter-arrival time of 25 seconds, 100 requests.]

[Figure: I/O scheduler, request inter-arrival time of 35 seconds, 100 requests.]

[Figure: RR scheduler, request inter-arrival time of 35 seconds, 100 requests.]

[Figure: I/O vs RR scheduler, total computation time, request inter-arrival time of 35 seconds.]


Conclusion and Future Work

Conclusion

  • New functionality added to DIET

      » Plug-in schedulers
      » CoRI collectors: CoRI-Easy, FAST

  • 2 plug-in schedulers & proof of concept

Future Work

  • Improve CPU scheduler sensitivity
  • Integration of data-aware scheduling
  • Use of plug-ins with workflow management
  • Integration of new collectors (Ganglia…)
  • Improve plug-ins
  • New plug-ins for real applications


Questions?

http://graal.ens-lyon.fr/DIET