
CALTECH CS137 Fall2005 -- DeHon 1

CS137: Electronic Design Automation

Day 20: November 23, 2005 Scheduling Variants and Approaches


Today

  • Scheduling

– Force-Directed
– SAT/ILP
– Branch-and-Bound

Last Time

  • Resources aren’t free
  • Share to reduce costs
  • Schedule operations on resources
  • Greedy approximation algorithm


Force-Directed

  • Problem: how to exploit schedule freedom (slack) to minimize instantaneous resources

– Directly solves the time-constrained problem
– Trying to minimize resources

Force-Directed

  • Given a node, can schedule anywhere between ASAP and ALAP schedule time

– Between latest scheduled predecessor and ALAP
– Between ASAP and already scheduled successors

  • N.b.: Scheduling a node will limit the freedom of nodes in its path

Single Resource Challenge

[Figure: task graph with nodes A1–A13 and B1–B11]

Force-Directed

  • If everything were scheduled except for the target node, we would:

– examine resource usage in all timeslots allowed by precedence
– place in the timeslot which gives the least increase in maximum resources

Force-Directed

  • Problem: don’t know resource utilization during scheduling
  • Strategy: estimate resource utilization

Force-Directed Estimate

  • Assume a node is uniformly distributed within its slack region

– between earliest and latest possible schedule time

  • Use this estimate to identify the most used timeslots
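The estimate above can be sketched in a few lines of Python. This is a minimal illustration, assuming one resource type, unit-latency nodes, and timeslots numbered from 0; the function name `distribution` and the example ranges are hypothetical, not taken from the slides.

```python
from fractions import Fraction

def distribution(ranges, num_slots):
    """Estimated resource usage per timeslot.

    ranges: dict mapping node name -> (asap, alap), inclusive slot indices.
    Each node contributes 1/(alap - asap + 1) to every slot in its range,
    i.e. it is treated as uniformly distributed over its slack region.
    """
    usage = [Fraction(0)] * num_slots
    for asap, alap in ranges.values():
        width = alap - asap + 1
        for t in range(asap, alap + 1):
            usage[t] += Fraction(1, width)
    return usage

# Hypothetical example: three nodes, four timeslots.
ranges = {"a": (0, 0), "b": (0, 2), "c": (1, 3)}
usage = distribution(ranges, 4)
# Index of the most used timeslot under the estimate.
peak = max(range(4), key=lambda t: usage[t])
```

Exact fractions are used so the per-slot estimates can be compared without rounding; the estimates always sum to the number of nodes.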

Single Resource Challenge

[Figure sequence (5 frames): the task graph with each node annotated 2 or 8; the last frame adds the estimates 2 3/9 and 2 1/9.]

Force-Directed

  • Scheduling a node will shift distribution

– all of scheduled node’s cost goes into one timeslot
– predecessors/successors may have freedom limited, so their contributions shift

  • Want to shift distribution to minimize maximum resource utilization (estimate)

Single Resource Challenge

[Figure sequence (25 frames): the same task graph, each node annotated 2 or 8, repeated across animation frames. The first frames are labeled “Repeat” and the last is labeled “Many steps…”; estimates shown along the way include 2 3/9, 2 1/9, 3 4/9, 3 2/9, 3, and 2 13/18.]

Force-Directed Algorithm

1. ASAP/ALAP schedule to determine range of times for each node
2. Compute estimated resource usage
3. Pick most constrained node (in largest time slot…)

– Evaluate effects of placing in feasible time slots (compute forces)
– Place in minimum cost slot and update estimates
– Repeat until done
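Step 1 of the algorithm above could be sketched as follows. This is only an illustration under stated assumptions: unit-latency operations, a fixed latency bound, and timeslots numbered from 0. The name `asap_alap` and the example graph are hypothetical.

```python
def asap_alap(preds, latency_bound):
    """Return dict node -> (asap, alap) for a DAG of unit-latency operations.

    preds: dict node -> list of predecessor nodes.
    latency_bound: number of timeslots available (the time constraint).
    """
    # Invert the predecessor lists to get successors.
    succs = {n: [] for n in preds}
    for n, ps in preds.items():
        for p in ps:
            succs[p].append(n)

    asap = {}
    def earliest(n):
        # One slot after the latest predecessor; slot 0 if there is none.
        if n not in asap:
            asap[n] = max((earliest(p) + 1 for p in preds[n]), default=0)
        return asap[n]

    alap = {}
    def latest(n):
        # One slot before the earliest successor; last slot if there is none.
        if n not in alap:
            alap[n] = min((latest(s) - 1 for s in succs[n]),
                          default=latency_bound - 1)
        return alap[n]

    return {n: (earliest(n), latest(n)) for n in preds}

# Hypothetical example: a three-node chain plus one independent node.
preds = {"a": [], "b": ["a"], "c": ["b"], "d": []}
ranges = asap_alap(preds, latency_bound=4)
```

The width of each returned range is the node's slack; here the chain nodes can each move by one slot while the independent node can go anywhere.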


Time

  • Evaluate force of putting in timeslot: O(NT)

– Potentially perturbing slack on net prefix/postfix for this node: N
– Each node potentially in T slots

  • Evaluate all timeslots can put in: O(NT^2)
  • N nodes to place
  • O(N^2 T^2)

– Loose bound: don’t get both T slots and N perturbations
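Written out as a worked step, the bound is simply the product of the three factors listed on this slide (one force evaluation, the candidate timeslots per node, and the nodes to place):

```latex
\underbrace{O(NT)}_{\text{one force evaluation}}
\times \underbrace{T}_{\text{candidate timeslots}}
= O(NT^{2}) \ \text{per node},
\qquad
\underbrace{N}_{\text{nodes to place}} \times O(NT^{2}) = O(N^{2}T^{2}) \ \text{overall}.
```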


SAT/ILP (Integer-Linear Programming)


Two Constraint Challenge

  • Processing elements have limited memory

– Instruction memory (data memory)

  • Tasks have different requirements for compute and instruction memory

– i.e., run length is not correlated with code length

Task

  • Task: schedule tasks onto PEs obeying both memory and compute capacity limits

Example and ILP solution from Plishker et al., NSCD2004

Task

  • Task: schedule tasks onto PEs obeying both memory and compute capacities
  • Two-capacity partitioning problem

– …actually, didn’t say anything about communication…

  • Two-capacity bin packing problem
  • Task i: <C_i, I_i> (compute and instruction-memory requirements)

SAT Packing

  • A_{i,j}: task i assigned to resource j

Constraints

  • Coverage constraints
  • Uniqueness constraints
  • Cardinality constraints

– PE compute
– PE memory
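One plausible reading of these constraints, sketched as a checker over a 0/1 assignment matrix: coverage means every task is on at least one PE, uniqueness means at most one, and the cardinality constraints cap each PE's summed compute and memory. That reading, the function name `check_assignment`, and all numbers below are assumptions for illustration, not the encoding from Plishker et al.

```python
def check_assignment(A, compute, imem, pe_compute_cap, pe_imem_cap):
    """Return the list of violated constraints for assignment matrix A.

    A[i][j] is 1 if task i is assigned to PE j, else 0.
    compute[i], imem[i] are task i's requirements (C_i, I_i).
    """
    num_tasks, num_pes = len(A), len(A[0])
    violations = []
    for i, row in enumerate(A):
        if sum(row) < 1:
            violations.append(("coverage", i))    # task placed nowhere
        if sum(row) > 1:
            violations.append(("uniqueness", i))  # task placed twice
    for j in range(num_pes):
        if sum(A[i][j] * compute[i] for i in range(num_tasks)) > pe_compute_cap:
            violations.append(("pe_compute", j))
        if sum(A[i][j] * imem[i] for i in range(num_tasks)) > pe_imem_cap:
            violations.append(("pe_memory", j))
    return violations

# Hypothetical data: three tasks, two PEs.
compute = [50, 100, 50]
imem = [30, 20, 40]
good = [[1, 0], [0, 1], [1, 0]]
bad = [[1, 0], [1, 0], [0, 0]]
```

A SAT or ILP solver searches for a matrix with no violations; this checker only verifies a candidate, which is useful for testing an encoding.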


Allow Code Sharing

  • Two tasks of same type can share code
  • Instead of memory capacity

– Vector of memory usage

  • Compute PE Imem vector

– As OR of task vectors assigned to it

  • Compute mem space as sum of non-zero vector entries

Allow Code Sharing

  • Two tasks of same type can share code
  • Task has vector of memory usage

– Task i needs set of instructions k: T_{i,k}

  • Compute PE Imem vector

– OR (all i): PE.Imem_{j,k} += A_{i,j} * T_{i,k}

  • PE Mem space

– PE.Total_Imem_j = Σ_k (PE.Imem_{j,k} * Instrs(k))
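The code-sharing computation above can be sketched directly. This is an illustrative version assuming A, T, and Instrs are given as plain Python lists; the name `pe_imem_usage` and the example sizes are hypothetical.

```python
def pe_imem_usage(A, T, instrs):
    """Total instruction memory per PE when tasks of the same type share code.

    A[i][j]: 1 if task i is assigned to PE j.
    T[i][k]: 1 if task i needs instruction block k.
    instrs[k]: size of instruction block k.
    """
    num_tasks, num_pes, num_blocks = len(A), len(A[0]), len(instrs)
    totals = []
    for j in range(num_pes):
        # OR over all tasks on PE j: a block is resident once,
        # however many of the PE's tasks use it.
        resident = [any(A[i][j] and T[i][k] for i in range(num_tasks))
                    for k in range(num_blocks)]
        totals.append(sum(instrs[k] for k in range(num_blocks) if resident[k]))
    return totals

# Hypothetical data: tasks 0 and 1 are the same type (block 0); task 2 needs block 1.
T = [[1, 0], [1, 0], [0, 1]]
instrs = [40, 25]
shared = [[1, 0], [1, 0], [0, 1]]   # same-type tasks together on PE 0
split = [[1, 0], [0, 1], [0, 1]]    # same-type tasks on different PEs
```

With `shared`, PE 0 holds block 0 once rather than twice; with `split`, block 0 has to be resident on both PEs.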


Symmetries

  • As with partitioning, many symmetries
  • Speedup with symmetry breaking

– Tasks in same class are equivalent
– PEs indistinguishable
– Total ordering on tasks and PEs
– Add constraints to force tasks to be assigned to PEs by ordering
– Plishker claims “significant runtime speedup”
– Using GALENA [DAC 2003] pseudo-Boolean SAT solver

Plishker Task Example


Results

[Figure: results comparing SAT/ILP Solve with Greedy (first-fit) binpack; annotated “Solutions in < 1 second”]
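For reference, a greedy first-fit baseline for the two-capacity packing could look like the sketch below. It is a generic first-fit, not necessarily the exact baseline used in the comparison; the name `first_fit` and the task data are hypothetical, and every task is assumed to fit in an empty PE.

```python
def first_fit(tasks, compute_cap, imem_cap):
    """Pack tasks onto PEs, opening a new PE only when no existing one fits.

    tasks: list of (compute, imem) requirement pairs.
    Returns a list of PEs, each a list of task indices.
    """
    pes = []     # task indices per PE
    loads = []   # [compute_used, imem_used] per PE
    for i, (c, m) in enumerate(tasks):
        for pe, load in zip(pes, loads):
            # Both capacities must hold for the task to fit.
            if load[0] + c <= compute_cap and load[1] + m <= imem_cap:
                pe.append(i)
                load[0] += c
                load[1] += m
                break
        else:
            # No existing PE had room on both dimensions.
            pes.append([i])
            loads.append([c, m])
    return pes

# Hypothetical tasks as (compute, instruction memory).
tasks = [(50, 30), (100, 20), (50, 40), (30, 60)]
packing = first_fit(tasks, compute_cap=100, imem_cap=80)
```

First-fit commits to the first feasible PE and never revisits a choice, which is why an exact SAT/ILP formulation can find packings it misses.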


Why can they do this?

  • Ignore precedence?

– feed forward, buffered

  • Ignore Interconnect?

– Through shared memory, not dominant?


Interconnect Buffers

  • Allow “Software Pipelining”

[Figure label: Each data item]

Spatially, we would pipeline, running all three at once.

Think of each schedule instance as one timestep in a spatial pipeline.

Interconnect Buffer

[Figure: tasks A, B, C annotated 50, 100, 50 and mapped onto PE0 and PE1, with the A B C pattern repeated across successive schedule instances]

Add Precedence to SAT/ILP?

  • Assign start time to each task
  • Precedence: constrain start of each task to be greater than start+run of each predecessor
  • Time Exclusivity: constrain non-overlap of start to start+run-1 on nodes on same PE

– Maybe formulate as order on PE
– And make PE order predecessor like a task predecessor?

Untested conjecture
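The two added constraints can at least be stated as a checker over a candidate schedule. This sketch reads the precedence rule as "start no earlier than the predecessor's start+run", which is consistent with a task occupying slots start through start+run-1; that reading, the name `check_schedule`, and the example values are assumptions.

```python
def check_schedule(start, run, pe, preds):
    """Return the violated precedence and time-exclusivity constraints.

    start[t], run[t], pe[t]: start time, run length, and PE of task t.
    preds[t]: list of predecessors of task t.
    """
    violations = []
    for t, ps in preds.items():
        for p in ps:
            # Assumption: a successor may begin right after the
            # predecessor's last occupied slot (start[p] + run[p] - 1).
            if start[t] < start[p] + run[p]:
                violations.append(("precedence", p, t))
    tasks = sorted(start)
    for idx, a in enumerate(tasks):
        for b in tasks[idx + 1:]:
            if pe[a] != pe[b]:
                continue
            # Intervals overlap unless one ends before the other begins.
            if start[a] < start[b] + run[b] and start[b] < start[a] + run[a]:
                violations.append(("exclusivity", a, b))
    return violations

# Hypothetical three-task chain, with A and C sharing PE 0.
run = {"A": 50, "B": 100, "C": 50}
preds = {"A": [], "B": ["A"], "C": ["B"]}
pe = {"A": 0, "B": 1, "C": 0}
ok = {"A": 0, "B": 50, "C": 150}
bad = {"A": 0, "B": 40, "C": 40}
```

In an ILP the same two conditions become linear inequalities over the start-time variables; the pairwise exclusivity test is the part that needs an ordering decision per pair of tasks on a PE.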


Memory Schedule Variants

  • Persistent: holds memory whole time

– E.g. task state, instructions

  • Task temporary: only uses memory space while task running
  • Intra-Task: use memory between point of production and consumption

– E.g. Def-Use chains

Memory Schedule Variants

  • Persistent:

– Binpacking in memory

  • Task temporary:

– Co-schedule memory slot with execution

  • Intra-Task:

– Lifetime in memory depends on scheduling def and last use
– Phase Ordered: Register coloring

Branch-and-Bound


Brute-Force

  • Try all schedules
  • Branching/Backtracking Search
  • Start w/ nothing scheduled (ready queue)
  • At each move (branch) pick:

– available resource time slot
– ready task (predecessors completed)
– schedule task on resource

Example

[Figure: search tree over tasks T1–T6 with a target of 2 FUs; first-level branches T1→time 1, T2→time 1, T3→time 1, idle→time 1; next-level branches T3→time 2, T4→time 2, idle→time 2]

Branching Search

  • Explores entire state space

– finds optimum schedule

  • Exponential work

– O(N^(resources*time-slots))

  • Many schedules completely uninteresting

Reducing Work

  • 1. Canonicalize “equivalent” schedule configurations
  • 2. Identify “dominating” schedule configurations
  • 3. Prune partial configurations which will lead to worse (or unacceptable) results

“Equivalent” Schedules

  • If multiple resources of same type

– assignment of task to particular resource at a particular timeslot is not distinguishing

[Figure: two schedules of T1, T2, T3 with T1 and T2 swapped]

Keep track of resource usage by capacity at time-slot.
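A minimal way to canonicalize such schedules is to forget which identical resource ran which task and keep only the set of tasks per timeslot. The sketch below assumes a single resource type and represents a schedule as a list of timeslots; the name `canonical` and the example schedules are hypothetical.

```python
def canonical(schedule):
    """Return a hashable form that ignores resource identity.

    schedule: list of timeslots, each a list of tasks indexed by resource,
    with None marking an idle resource.
    """
    return tuple(tuple(sorted(t for t in slot if t is not None))
                 for slot in schedule)

s1 = [["T1", "T2"], ["T3", None]]
s2 = [["T2", "T1"], [None, "T3"]]   # same as s1 up to resource naming
s3 = [["T1", "T3"], ["T2", None]]   # genuinely different timing

# A search can skip any configuration whose canonical form was already seen.
seen = {canonical(s1)}
```

Storing canonical forms in a set lets the search explore one representative per equivalence class.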

“Equivalent” Schedule Prefixes

[Figure: two schedule prefixes containing T1, T2, T3, T4, shown against the task graph T1–T6]

“Non-Equivalent” Schedule Prefixes

[Figure: two schedule prefixes of T1, T2, T3 in different arrangements, shown against the task graph T1–T6]

Pruning Prefixes?

  • I’m not sure there is an efficient (general) way?
  • Keep track of schedule set

– walk through state-graph of scheduled prefixes
– unfortunately, set is power-set so 2^N
– …but not all feasible, so shape of graph may simplify

Dominant Schedules

  • A strictly shorter schedule

– scheduling the same or more tasks
– will always be superior to the longer schedule

[Figure: alternative schedules of tasks T1–T5]

Pruning

  • If we can establish that a particular schedule path will be worse than one we’ve already seen

– we can discard it w/out further exploration

  • In particular:

– LB = current schedule time + lower_bound_estimate
– if LB greater than existing solution, prune
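The pruning rule above can be sketched as a small branch-and-bound search. This version makes several simplifying assumptions that are not from the slides: unit-latency tasks, one type of functional unit, every timeslot filled with as many ready tasks as fit (safe for unit-latency tasks on identical units), and the resource bound as the lower-bound estimate. It also prunes when LB equals the best known length, since an equal-length schedule is no improvement. The task graph is hypothetical.

```python
from itertools import combinations
from math import ceil

def branch_and_bound(preds, num_fus):
    """Minimum number of timeslots for unit-latency tasks on num_fus identical FUs.

    preds: dict task -> list of predecessor tasks (a DAG).
    Returns (length, schedule), where schedule lists the tasks run in each slot.
    """
    tasks = sorted(preds)
    # Start with a bound worse than any real schedule (serial takes len(tasks)).
    best = {"length": len(tasks) + 1, "schedule": None}

    def search(done, time, schedule):
        remaining = len(tasks) - len(done)
        if remaining == 0:
            if time < best["length"]:
                best["length"], best["schedule"] = time, schedule
            return
        # LB = current schedule time + lower-bound estimate (resource bound).
        lb = time + ceil(remaining / num_fus)
        if lb >= best["length"]:
            return  # cannot beat the best complete schedule seen so far
        ready = [t for t in tasks
                 if t not in done and all(p in done for p in preds[t])]
        width = min(num_fus, len(ready))
        # Branch on which ready tasks occupy the next timeslot.
        for chosen in combinations(ready, width):
            search(done | set(chosen), time + 1, schedule + [chosen])

    search(frozenset(), 0, [])
    return best["length"], best["schedule"]

# Hypothetical six-task graph, scheduled on 2 FUs.
preds = {"T1": [], "T2": [], "T3": [],
         "T4": ["T1"], "T5": ["T2", "T3"], "T6": ["T4", "T5"]}
length, schedule = branch_and_bound(preds, 2)
```

A tighter lower bound (critical path, or the critical chain discussed later) prunes more branches without changing the answer.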


Pruning Techniques

Establish Lower Bound on schedule time

  • Critical Path (ASAP schedule)
  • Resource Bound
  • Critical Chain
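The first two bounds are simple to compute; a sketch for unit-latency tasks and a single resource type follows (the critical chain bound is not attempted here). Function names and the example graph are hypothetical.

```python
from math import ceil

def critical_path_bound(preds):
    """Length of the longest chain of unit-latency tasks (ASAP schedule length)."""
    depth = {}
    def d(t):
        # Depth is one more than the deepest predecessor.
        if t not in depth:
            depth[t] = 1 + max((d(p) for p in preds[t]), default=0)
        return depth[t]
    return max((d(t) for t in preds), default=0)

def resource_bound(preds, num_fus):
    """Total work divided by capacity, rounded up."""
    return ceil(len(preds) / num_fus)

# Hypothetical six-task graph.
preds = {"T1": [], "T2": [], "T3": [],
         "T4": ["T1"], "T5": ["T2", "T3"], "T6": ["T4", "T5"]}
cp = critical_path_bound(preds)
rb = resource_bound(preds, 2)
lower_bound = max(cp, rb)
```

Neither bound is tight for this graph: both give 3, but on 2 FUs T1, T2, and T3 cannot all run in the first slot, so T4 and T5 cannot both finish by the second and the best schedule needs 4 slots. Gaps like this are what a coupled resource-and-latency bound is meant to close.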


“Critical Chain” Lower Bound

  • Bottleneck resource presents a coupled resource and latency bound

Single red resource

Critical path: 5
Resource Bound (1,1): 4
Critical Chain (1,1): 7

Alpha-Beta Search

  • Generalization

– keep both upper and lower bound estimates on partial schedule

  • Lower bounds from CP, RB, CC
  • Upper bounds with List Scheduling

– expand most promising paths

  • (least upper bound, least lower bound)

– prune based on lower bounds exceeding known upper bound
– (technique typically used in games/Chess)

Alpha-Beta

  • Each scheduling decision will tighten

– lower/upper bound estimates

  • Can choose to expand

– least current time (breadth first)
– least lower bound remaining (depth first)
– least lower bound estimate
– least upper bound estimate

  • Can control greediness

– weighting lower/upper bound
– selecting “most promising”

Note

  • Aggressive pruning and ordering

– can sometimes make polynomial time in practice
– often cannot prove will be polynomial time
– usually represents problem structure we still need to understand

Multiple Resources

  • Works for multiple resource case
  • Computing lower-bounds per resource

– resource constrained

  • Sometimes deal with resource coupling

– e.g. must have 1 A and 1 B simultaneously, or in fixed time slot relation
– e.g. bus and memory port

Summary

  • Resource estimates and Refinement
  • SAT/ILP Schedule
  • Software Pipelining
  • Branch-and-bound search

– “equivalent” states
– dominators
– estimates/pruning

Admin

  • Class

– Friday (no holiday)
– Next week: Monday, Friday; no Wed.

  • Next week’s reading all online

Big Ideas:

  • Estimate Resource Usage
  • Use dominators to reduce work
  • Techniques:

– Force-Directed
– SAT/ILP
– Coloring
– Search

  • Branch-and-Bound
  • Alpha-Beta