Uniprocessor Scheduling: Basic Concepts, Scheduling Criteria, Scheduling Algorithms - PowerPoint PPT Presentation



SLIDE 1

Uniprocessor Scheduling

  • Basic Concepts
  • Scheduling Criteria
  • Scheduling Algorithms
SLIDE 2

Three-level scheduling


SLIDE 3

Types of Scheduling


SLIDE 4

Long- and Medium-Term Schedulers

Long-term scheduler

  • Determines which programs are admitted to the system (i.e., become processes)
  • Requests can be denied, e.g., in case of thrashing or overload

Medium-term scheduler

  • decides when/which processes to suspend/resume
  • Both control the degree of multiprogramming

– More processes, smaller percentage of time each process is executed


SLIDE 5

Short-Term Scheduler

  • Decides which process will be dispatched; invoked upon
    – interrupts, operating system calls, signals, ...
  • Dispatch latency – the time it takes for the dispatcher to stop one process and start another running; the dominating factors are:
    – switching context
    – selecting the new process to dispatch


SLIDE 6

CPU–I/O Burst Cycle

  • Process execution consists of a cycle of CPU execution and I/O wait.
  • A process may be
    – CPU-bound
    – I/O-bound


SLIDE 7

Scheduling Criteria: Optimization Goals

CPU utilization – keep the CPU as busy as possible
Throughput – number of processes that complete their execution per time unit
Response time – amount of time from when a request was submitted until the first response is produced (execution + waiting time in the ready queue)
Turnaround time – amount of time to execute a particular process (execution + all the waiting); involves the I/O schedulers also
Fairness – watch priorities, avoid starvation, ...
Scheduler efficiency – overhead (e.g. context switching, computing priorities, ...)
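These criteria can be made concrete with a small calculation. A minimal sketch, assuming a hypothetical three-process timeline (all arrival, start, finish, and burst values are invented for illustration):

```python
# Hypothetical example: turnaround, waiting, and response time for three
# processes under one given non-preemptive schedule.
# arrival = request submitted, start = first time on the CPU,
# finish = completion time, burst = total CPU execution time.
jobs = {
    "A": {"arrival": 0, "start": 0,  "finish": 8,  "burst": 8},
    "B": {"arrival": 1, "start": 8,  "finish": 12, "burst": 4},
    "C": {"arrival": 2, "start": 12, "finish": 21, "burst": 9},
}

for name, j in jobs.items():
    turnaround = j["finish"] - j["arrival"]   # execution + all waiting
    waiting = turnaround - j["burst"]         # time spent in the ready queue
    response = j["start"] - j["arrival"]      # until the first response
    print(name, turnaround, waiting, response)

avg_turnaround = sum(j["finish"] - j["arrival"] for j in jobs.values()) / len(jobs)
print("average turnaround:", avg_turnaround)
```

Note how waiting time falls out of turnaround minus burst; the same bookkeeping works for any of the algorithms on the following slides.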


SLIDE 8

Decision Mode

Nonpreemptive
  • Once a process is in the running state, it will continue until it terminates or blocks itself for I/O
Preemptive
  • Currently running process may be interrupted and moved to the Ready state by the operating system
  • Allows for better service since no single process can monopolize the processor for very long


SLIDE 9

First-Come-First-Served (FCFS)

(Gantt chart: processes A–E served in arrival order on a 0–20 time axis)

  • Non-preemptive
  • Favors CPU-bound processes
  • A short process may have to wait very long before it can execute (convoy effect)
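The convoy effect can be seen with a short simulation. A minimal sketch, assuming all processes arrive at t=0 and using hypothetical burst times:

```python
# Minimal FCFS sketch showing the convoy effect: one long CPU-bound job
# at the head of the queue makes every short job behind it wait.
def fcfs_waiting_times(bursts):
    """Per-process waiting times under FCFS, all arrivals at t=0."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)   # each process waits for all earlier ones
        clock += burst
    return waits

long_first = fcfs_waiting_times([24, 3, 3])   # long job arrives first
short_first = fcfs_waiting_times([3, 3, 24])  # short jobs arrive first

print(long_first, sum(long_first) / 3)    # [0, 24, 27] -> average 17.0
print(short_first, sum(short_first) / 3)  # [0, 3, 6] -> average 3.0
```

The same total work gives a much worse average wait when the long job leads the queue.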

SLIDE 10

Round-Robin

(Gantt chart: processes A–E sharing the CPU in time slices on a 0–20 time axis)

  • Preemption based on clock (interrupts on time slice or quantum q, usually 10–100 msec)
  • Fairness: for n processes, each gets 1/n of the CPU time in chunks of at most q time units
  • Performance
    – q large ⇒ FIFO
    – q small ⇒ overhead can be high due to context switches
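The clock-driven preemption can be sketched as a queue simulation. The burst times and the quantum below are hypothetical:

```python
from collections import deque

# Round-robin sketch: all processes arrive at t=0; a process runs for at
# most one quantum q, then is preempted and sent to the back of the queue.
def round_robin_completion(bursts, q):
    """Completion time per process index for RR with quantum q."""
    ready = deque((i, b) for i, b in enumerate(bursts))
    clock, done = 0, {}
    while ready:
        i, remaining = ready.popleft()
        run = min(q, remaining)          # one quantum, or less if it finishes
        clock += run
        if remaining - run > 0:
            ready.append((i, remaining - run))  # preempted: back of the queue
        else:
            done[i] = clock
    return done

print(round_robin_completion([6, 3, 1], q=2))  # {2: 5, 1: 8, 0: 10}
```

With a very large q the loop degenerates into FIFO; with a tiny q every iteration is a context switch, which is exactly the overhead trade-off named above.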

SLIDE 11

Shortest Process First

(Gantt chart: processes A–E on a 0–20 time axis, shortest first)

  • Non-preemptive
  • A short process jumps ahead of longer processes
  • Avoids the convoy effect
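The benefit can be sketched numerically, assuming all processes arrive at t=0 and hypothetical burst times:

```python
# Shortest-process-first sketch (non-preemptive, all arrivals at t=0):
# serving the shortest burst next minimizes the average waiting time.
def average_wait(bursts, shortest_first):
    order = sorted(bursts) if shortest_first else list(bursts)
    waits, clock = [], 0
    for burst in order:
        waits.append(clock)   # wait = sum of bursts scheduled before it
        clock += burst
    return sum(waits) / len(waits)

bursts = [24, 3, 3]
print(average_wait(bursts, shortest_first=True))   # SPF: 3.0
print(average_wait(bursts, shortest_first=False))  # FCFS order: 17.0
```

Sorting by burst length is the whole algorithm; the hard part in a real OS is knowing the burst lengths, which is the estimation problem discussed on the following slides.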


SLIDE 12

Shortest Remaining Time First

(Gantt chart: processes A–E on a 0–20 time axis)

  • Preemptive (at arrival) version of shortest process first


SLIDE 13

On SPF Scheduling

  • Gives high throughput
  • Gives minimum (optimal) average response (waiting) time for a given set of processes
    – Proof (non-preemptive): analyze the summation giving the waiting time
  • Must estimate the processing time (next CPU burst)
    – Can be done automatically (exponential averaging)
    – If the estimated time for a process (given by the user in a batch system) is not correct, the operating system may abort it
  • Possibility of starvation for longer processes


SLIDE 14

Determining Length of Next CPU Burst

  • Can be done by using the length of previous CPU bursts, with exponential averaging.

1. tn = actual length of the n-th CPU burst
2. τn+1 = predicted value for the next CPU burst
3. α, 0 ≤ α ≤ 1
4. Define: τn+1 = α tn + (1 − α) τn
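The recurrence can be run directly over an observed burst sequence. A minimal sketch; the initial estimate τ0, the value of α, and the burst sequence are all hypothetical:

```python
# Exponential-averaging sketch: predict the next CPU burst from past bursts.
def predict_bursts(bursts, alpha=0.5, tau0=10.0):
    """Predictions tau_1..tau_n given actual bursts t_0..t_{n-1}."""
    tau = tau0
    predictions = []
    for t in bursts:
        tau = alpha * t + (1 - alpha) * tau  # tau_{n+1} = a*t_n + (1-a)*tau_n
        predictions.append(tau)
    return predictions

print(predict_bursts([6, 4, 6, 4], alpha=0.5, tau0=10.0))
# [8.0, 6.0, 6.0, 5.0]
```

Each prediction blends the most recent measured burst with all earlier history; α controls how fast old history is forgotten, as the next slide discusses.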


SLIDE 15

On Exponential Averaging

  • α = 0
    – τn+1 = τn
    – history does not count, only the initial estimate counts
  • α = 1
    – τn+1 = tn
    – only the actual last CPU burst counts
  • If we expand the formula, we get:
    τn+1 = α tn + (1 − α) α tn−1 + … + (1 − α)^j α tn−j + … + (1 − α)^(n+1) τ0
  • Since both α and (1 − α) are less than or equal to 1, each successive term has less weight than its predecessor.


SLIDE 16

Prediction of the Length of the Next CPU Burst

SLIDE 17

Priority Scheduling: General Rules

  • Scheduler can choose a process of higher priority over one of lower priority
    – can be preemptive or non-preemptive
    – can have multiple ready queues to represent multiple levels of priority
  • Example priority scheduling: SPF, where the priority is the predicted next CPU burst time.
  • Problem ≡ starvation – low-priority processes may never execute.
  • A solution ≡ aging – as time progresses, increase the priority of the process.


SLIDE 18

Priority Scheduling Cont.: Multilevel Queue

  • Ready queue is partitioned into separate queues, e.g.
    – foreground (interactive)
    – background (batch)
  • Each queue has its own scheduling algorithm, e.g.
    – foreground – RR
    – background – FCFS
  • Scheduling must be done between the queues.
    – Fixed priority, e.g. serve all from foreground, then from background. Possible starvation.
    – Another solution: time slice – each queue gets a fraction of CPU time to divide amongst its processes, e.g. 80% to foreground in RR, 20% to background in FCFS

SLIDE 19

Multilevel Queue Scheduling

SLIDE 20

Multilevel Feedback Queue

  • A process can move between the various queues; aging can be implemented this way.
  • Scheduler parameters:
    – number of queues
    – scheduling algorithm for each queue
    – method to upgrade a process
    – method to demote a process
    – method to determine which queue a process will enter first
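These parameters can be made concrete in a small simulation. A minimal sketch, assuming (all choices hypothetical) two queues, round robin with a larger quantum on the lower level, fixed priority between queues, entry at the top queue, and demotion when a process still needs the CPU after its quantum:

```python
from collections import deque

# Multilevel-feedback-queue sketch with the parameters listed on the slide.
def mlfq(bursts, quanta=(2, 4)):
    """Completion time per process index under a 2-level MLFQ."""
    queues = [deque(), deque()]
    for i, b in enumerate(bursts):
        queues[0].append((i, b))           # entry rule: topmost queue
    clock, done = 0, {}
    while any(queues):
        level = 0 if queues[0] else 1      # inter-queue rule: fixed priority
        i, remaining = queues[level].popleft()
        run = min(quanta[level], remaining)
        clock += run
        if remaining - run > 0:            # still needs CPU: demote one level
            queues[min(level + 1, 1)].append((i, remaining - run))
        else:
            done[i] = clock
    return done

print(mlfq([7, 2]))  # {1: 4, 0: 9}
```

The long job (index 0) burns its top-level quantum, drops to the slow queue, and the short interactive-style job finishes first, which is the behaviour MLFQ is designed to produce.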

SLIDE 21

Multilevel Feedback Queues

SLIDE 22

Thread Scheduling

  • Many-to-one and many-to-many models: the thread library schedules user-level threads to run on an LWP
    – Known as process-contention scope (PCS) since the scheduling competition is within the process
  • A kernel thread scheduled onto an available CPU is system-contention scope (SCS) – competition among all threads in the system
  • E.g. the Pthreads scheduling API allows specifying either PCS or SCS during thread creation

SLIDE 23

Fair-Share Scheduling

  • Extension of multi-level queues with feedback + priority recomputation
    – an application runs as a collection of processes (threads)
    – concern: the performance of the application, user groups, ... (i.e. a group of processes/threads)
    – scheduling decisions are based on process sets rather than individual processes
  • e.g. "traditional" Unix scheduling


SLIDE 24

Real-Time Scheduling

SLIDE 25

Real-Time Systems

  • Tasks or processes attempt to interact with outside-world events, which occur in "real time"; the process must be able to keep up, e.g.
    – control of laboratory experiments, robotics, air traffic control, drive-by-wire systems, tele/data communications, military command and control systems
  • Correctness of the RT system depends not only on the logical result of the computation but also on the time at which the results are produced, i.e. tasks or processes come with a deadline (for starting or completion). Requirements may be hard or soft.


SLIDE 26

Periodic Real-Time Tasks: Timing Diagram


SLIDE 27

E.g. Multimedia Process Scheduling


A movie may consist of several files

SLIDE 28

E.g. Multimedia Process Scheduling (cont)

  • Periodic processes displaying a movie
  • Frame rates and processing requirements may be different for each movie (or other process that requires time guarantees)

SLIDE 29

Scheduling in Real-Time Systems: Schedulable Real-Time System

  • Given
    – m periodic events
    – event i occurs within period Pi and requires Ci seconds
  • Then the load can only be handled if

    Utilization = Σ(i=1..m) Ci / Pi ≤ 1
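The condition can be checked mechanically. A minimal sketch, using a hypothetical task set of (Ci, Pi) pairs:

```python
# Utilization-based schedulability check from the slide:
# sum of C_i / P_i over the m periodic events must be at most 1.
def utilization(tasks):
    """tasks: list of (C_i, P_i) pairs; returns total CPU utilization."""
    return sum(c / p for c, p in tasks)

# Hypothetical task set: (compute time, period), e.g. in ms.
tasks = [(50, 100), (30, 200), (100, 500)]
u = utilization(tasks)
print(round(u, 2), u <= 1)  # 0.85 True -> the load can be handled
```

Adding another event with C=100, P=500 would push utilization to 1.05 and make the set unschedulable by this test.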


SLIDE 30

Scheduling with Deadlines: Earliest Deadline First

A set of tasks with deadlines is schedulable (i.e. can be executed in a way that no process misses its deadline) iff EDF yields a schedulable (aka feasible) sequence. Example sequences:
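A unit-time EDF simulation can test feasibility directly. A minimal sketch, with a hypothetical task set given as (release, burst, deadline) triples:

```python
# Earliest-deadline-first sketch: at every time unit, run the released,
# unfinished task whose deadline is nearest; report whether any deadline
# is missed.
def edf_feasible(tasks):
    """tasks: list of (release, burst, deadline); True iff no miss under EDF."""
    remaining = [list(t) for t in tasks]
    clock = 0
    while any(r[1] > 0 for r in remaining):
        ready = [r for r in remaining if r[0] <= clock and r[1] > 0]
        if not ready:
            clock += 1                        # idle until the next release
            continue
        job = min(ready, key=lambda r: r[2])  # earliest deadline first
        job[1] -= 1                           # run it for one time unit
        clock += 1
        if job[1] == 0 and clock > job[2]:
            return False                      # finished after its deadline
    return True

print(edf_feasible([(0, 2, 4), (1, 1, 3), (2, 2, 8)]))  # True
print(edf_feasible([(0, 3, 2)]))                        # False
```

By the iff property on the slide, a False here means no schedule at all can meet every deadline for that task set.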


SLIDE 31

Rate Monotonic Scheduling

  • Assigns priorities to tasks on the basis of their periods
  • The highest-priority task is the one with the shortest period


SLIDE 32

EDF or RMS? (1)


SLIDE 33

EDF or RMS? (2)

Another example of real-time scheduling with RMS and EDF

SLIDE 34

EDF or RMS? (3)

  • RMS "accommodates" task sets with less utilization:

    Σ(i=1..m) Ci / Pi ≤ m(2^(1/m) − 1), which tends to about 0.7 for large m

    – (recall: for EDF the bound is up to 1)
  • RMS is often used in practice;
    – main reason: stability is easier to meet with RMS; priorities are static, hence, under transient periods with deadline misses, critical tasks can be "saved" by being assigned higher (static) priorities
    – it is OK for combinations of hard and soft RT tasks
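The RMS utilization bound m(2^(1/m) − 1) can be evaluated directly. A minimal sketch of this classic sufficient (not necessary) test; the task set below is hypothetical:

```python
# RMS utilization-bound sketch: a periodic task set is guaranteed
# RMS-schedulable if sum(C_i / P_i) <= m * (2**(1/m) - 1).
# The bound tends to ln 2 ~ 0.693 as m grows, hence the ~0.7 on the slide.
def rms_bound(m):
    return m * (2 ** (1 / m) - 1)

def rms_guaranteed(tasks):
    """tasks: list of (C_i, P_i); sufficient (not necessary) RMS test."""
    u = sum(c / p for c, p in tasks)
    return u <= rms_bound(len(tasks))

print(rms_bound(1), round(rms_bound(3), 3), round(rms_bound(100), 3))
print(rms_guaranteed([(20, 100), (40, 150), (100, 350)]))  # True
```

Because the test is only sufficient, a set that fails it may still be RMS-schedulable; the EDF bound of 1 from the previous slides is both necessary and sufficient.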

SLIDE 35

Multiprocessor Systems

Scheduling

SLIDE 36

Multiprocessors

Definition: a computer system in which two or more CPUs share full access to a common RAM


SLIDE 37

Multiprocessor/Multicore Hardware (ex.1): Bus-based multiprocessors


SLIDE 38

Multiprocessor/Multicore Hardware (ex.2): UMA (uniform memory access)

Not/hardly scalable

  • Bus-based architectures -> saturation
  • Crossbars too expensive (wiring constraints)

Possible solutions

  • Reduce network traffic by caching
  • Clustering -> non-uniform memory latency behaviour (NUMA)

SLIDE 39

Multiprocessor/Multicore Hardware (ex.3): NUMA (non-uniform memory access)

  • Single address space visible to all CPUs
  • Access to remote memory slower than to local

  • Cache controller/MMU determines whether a reference is local or remote
  • When caching is involved, it's called CC-NUMA (cache-coherent NUMA)

SLIDE 40

Hyperthreading

HW gives the image of multiple processors per physical processor. The OS can be oblivious, but will benefit from knowing that it runs on such HW.

  • logical CPU = state only (not an execution unit)
SLIDE 41

On multicores

Reason for multicores: physical limitations cause significant heat dissipation at high clock rates; instead, parallelize within the same chip! In addition to operating system (OS) support, adjustments to existing software are required to maximize utilization of the computing resources provided by multi-core processors. The virtual machine approach is again in focus.

(Figure: Intel Core 2 dual-core processor, with CPU-local Level 1 caches + shared on-die Level 2 cache)

SLIDE 42

On multicores (cont)

  • Also possible (figure from www.microsoft.com/licensing/highlights/multicore.mspx)

SLIDE 43

OS Design Issues (1): Who executes the OS/scheduler(s)?

  • Master/slave architecture: key kernel functions always run on a particular processor
    – Master is responsible for scheduling; a slave sends service requests to the master
    – Disadvantages
      • Failure of the master brings down the whole system
      • The master can become a performance bottleneck
  • Peer architecture: the operating system can execute on any processor
    – Each processor does self-scheduling
    – New issues for the operating system
      • Make sure two processors do not choose the same process

SLIDE 44

Master-Slave Multiprocessor OS

SLIDE 45

Non-symmetric Peer Multiprocessor OS

Each CPU has its own operating system

SLIDE 46

Symmetric Peer Multiprocessor OS

  • Symmetric Multiprocessors
    – SMP multiprocessor model

SLIDE 47

Scheduling in Multiprocessors

Recall: tightly coupled multiprocessing (SMPs)
  – Processors share main memory
  – Controlled by the operating system

Different degrees of parallelism
  – Independent and coarse-grained parallelism
    • no or very limited synchronization
    • can be supported on a multiprocessor with little change (and a grain of salt ☺)
  – Medium-grained parallelism
    • collection of threads; they usually interact frequently
  – Fine-grained parallelism
    • highly parallel applications; a specialized and fragmented area

SLIDE 48

Design Issues (2): Assignment of Processes to Processors

Per-processor ready queues vs. a global ready queue
  • Permanently assign a process to a processor
    – Less overhead
    – A processor could be idle while another processor has a backlog
  • Have a global ready queue and schedule to any available processor
    – can become a bottleneck
    – task migration is not cheap (cf. NUMA and scheduling)
  • Processor affinity – a process has affinity for the processor on which it is currently running
    – soft affinity
    – hard affinity

SLIDE 49

Multiprocessor Scheduling: per-processor or per-partition ready queues

  • Space sharing
    – multiple threads run at the same time across multiple CPUs

SLIDE 50

Multiprocessor Scheduling: Load sharing / Global ready queue

  • Timesharing
    – note the use of a single data structure for scheduling

SLIDE 51

Multiprocessor Scheduling, Load Sharing: a problem

  • Problem with communication between two threads
    – both belong to process A
    – both running out of phase

SLIDE 52

Design Issues (3): Multiprogramming on processors?

Experience shows:
  – Threads running on separate processors (to the extent of dedicating a processor to a thread) yield dramatic gains in performance
  – Allocating processors to threads ~ allocating pages to processes (can use the working-set model?)
  – The specific scheduling discipline is less important with more than one processor; the decision of "distributing" tasks is more important

SLIDE 53

Gang Scheduling

  • Approach to address the previous problem:
    1. Groups of related threads are scheduled as a unit (a gang)
    2. All members of a gang run simultaneously on different timeshared CPUs
    3. All gang members start and end time slices together
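The "one gang per time slice" idea can be sketched as a table builder. The process names, thread counts, and the 4-CPU machine are hypothetical, and each gang is assumed to fit on the available CPUs:

```python
# Gang-scheduling sketch: each row is one time slice; the threads of one
# process fill the row across the CPUs, so gang members start and end
# their slices together. Assumes n_threads <= cpus for every gang.
def gang_schedule(gangs, cpus):
    """gangs: {process_name: n_threads}; returns rows of per-CPU assignments."""
    rows = []
    for process, n_threads in gangs.items():
        row = [f"{process}{t}" for t in range(n_threads)]
        row += ["idle"] * (cpus - len(row))   # unused CPUs idle this slice
        rows.append(row)
    return rows

for row in gang_schedule({"A": 4, "B": 3, "C": 2}, cpus=4):
    print(row)
```

The idle slots in rows B and C show the cost of the approach: simultaneity is bought by leaving CPUs unused within a slice.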


SLIDE 54

Gang Scheduling: another option


SLIDE 55

Multiprocessor Thread Scheduling: Dynamic Scheduling

  • The number of threads in a process is altered dynamically by the application
  • Programs (through thread libraries) give info to the OS to manage parallelism
    – The OS adjusts the load to improve use
  • Or the OS gives info to the run-time system about available processors, to adjust the number of threads
  • i.e. a dynamic version of partitioning

SLIDE 56

Multithreaded Multicore System

Solution by architecture: hyperthreading. It needs OS awareness, though, to get the corresponding efficiency.

SLIDE 57

Summary: Multiprocessor Thread Scheduling

Load sharing: processors/threads are not assigned to particular processors
  • load is distributed evenly across the processors
  • needs a central queue; may be a bottleneck
  • preempted threads are unlikely to resume execution on the same processor; cache use is less efficient

Gang scheduling: assigns threads to particular processors (simultaneous scheduling of the threads that make up a process)
  • Useful where performance severely degrades when any part of the application is not running (due to synchronization)
  • Extreme version: dedicated processor assignment (no multiprogramming of processors)

SLIDE 58

Operating System Examples

  • Solaris scheduling
  • Windows XP scheduling
  • Linux scheduling
SLIDE 59

Solaris Scheduling

  • À la multilevel feedback queues
  • Kernel preemptible by RT tasks in multiprocessors (unless interrupts are disabled)

SLIDE 60

Solaris Dispatch Table

SLIDE 61

Windows XP Priorities

SLIDE 62

Linux Scheduling

  • Two priority ranges: time-sharing and real-time
  • One queue per processor/core

SLIDE 63

List of Tasks Indexed According to Priorities