CSEE 3827: Fundamentals of Computer Systems, Lectures 21 and 22

SLIDE 1

CSEE 3827: Fundamentals of Computer Systems

Lectures 21 and 22
April 22 and 27, 2009
Martha Kim
martha@cs.columbia.edu

SLIDE 2

CSEE 3827, Spring 2009 Martha Kim

Amdahl’s Law

Be aware when optimizing...

$$T_{\text{improved}} = \frac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$$

Example: On machine A, multiplication accounts for 80s out of 100s total CPU time. How much improvement in multiplication performance is needed to get a 5x speedup overall?

Corollary: make the common case fast
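The multiplication example can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `improved_time` and the list of trial factors are illustrative assumptions. A 5x overall speedup would need a total of 20s, which equals the unaffected time alone, so no finite improvement factor can reach it.

```python
def improved_time(t_affected, t_unaffected, factor):
    """Amdahl's Law: run time after speeding up only the affected part."""
    return t_affected / factor + t_unaffected

# Numbers from the example: 80 s of multiplication out of 100 s total.
t_affected, t_unaffected = 80.0, 20.0
target = (t_affected + t_unaffected) / 5   # a 5x speedup means 20 s total

# Hypothetical trial factors; even a huge one leaves the 20 s unaffected part.
trial_factors = (2, 10, 100, 1_000_000)
for factor in trial_factors:
    print(factor, improved_time(t_affected, t_unaffected, factor))

# The target needs the affected time to shrink to exactly 0, so it is
# never reached for any finite improvement factor.
reachable = any(improved_time(t_affected, t_unaffected, f) <= target
                for f in trial_factors)
print("5x reachable:", reachable)
```

This is why the corollary matters: the part that is not improved bounds the achievable speedup.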

SLIDE 3

Single-Cycle CPU Performance Issues

  • Longest delay determines clock period
  • Critical path: load instruction
  • instruction memory → register file → ALU → data memory → register file
  • Not feasible to vary clock period for different instructions
  • We will improve performance by pipelining

SLIDE 4

Pipelining Laundry Analogy

SLIDE 5

MIPS Pipeline

  • Five stages, one step per stage
  • IF: Instruction fetch from memory
  • ID: Instruction decode and register read
  • EX: Execute operation or calculate address
  • MEM: Access memory operand
  • WB: Write result back to register

SLIDE 6

MIPS Pipeline Illustration 1

SLIDE 7

MIPS Pipeline Illustration 2

SLIDE 8

Pipeline Performance 1

  • Assume time for stages is
  • 100ps for register read or write
  • 200ps for other stages
  • Compare pipelined datapath to single-cycle datapath

Instr     IF   ID   EX   MEM  WB   Total (ps)
lw        200  100  200  200  100  800
sw        200  100  200  200       700
R-format  200  100  200       100  600
beq       200  100  200            500

SLIDE 9

Pipeline Performance 2

  • Single-cycle Tclock = 800ps
  • Pipelined Tclock = 200ps
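Both clock periods follow from the stage latencies assumed earlier (100ps for register read or write, 200ps for the other stages). The sketch below is an illustration, not code from the lecture; the function names and the choice of three back-to-back `lw` instructions are assumptions made for the example.

```python
# Stage latencies in picoseconds, as assumed on the slides.
STAGE_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

def single_cycle_clock(stage_ps):
    # The clock must fit the slowest instruction (lw), which uses every stage.
    return sum(stage_ps.values())

def pipelined_clock(stage_ps):
    # Every stage gets one cycle, so the slowest stage sets the period.
    return max(stage_ps.values())

def single_cycle_time(n_instr, stage_ps):
    return n_instr * single_cycle_clock(stage_ps)

def pipelined_time(n_instr, stage_ps):
    # The first instruction needs len(stages) cycles; each later one adds a cycle.
    return (n_instr + len(stage_ps) - 1) * pipelined_clock(stage_ps)

print(single_cycle_clock(STAGE_PS), pipelined_clock(STAGE_PS))      # 800 200
print(single_cycle_time(3, STAGE_PS), pipelined_time(3, STAGE_PS))  # 2400 1400
```

For long instruction streams the ratio approaches 800/200 = 4 rather than 5, because the stages are not balanced.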

SLIDE 10

Pipeline Speedup

  • Speedup due to increased throughput.
  • If all stages are balanced (i.e., all take the same time):
    Pipeline instr. completion rate = Single-cycle instr. completion rate × Number of stages
  • If not balanced, speedup is less

SLIDE 11

Hazard

  • A hazard is a situation that prevents starting the next instruction in the next cycle
  • Structure hazards occur when a required resource is busy
  • Data hazards occur when an instruction needs to wait for an earlier instruction to complete its data write
  • Control hazards occur when the control action (i.e., next instruction to fetch) depends on a value that is not yet ready

SLIDE 12

Structure Hazard

  • Conflict for use of a resource
  • In a MIPS pipeline with a single memory
  • Load/store requires memory access
  • Instruction fetch would have to stall for that cycle
  • This introduces a pipeline bubble
  • Hence, pipelined datapaths require separate instruction and data memories (or separate instruction and data caches)

SLIDE 13

Data Hazards

  • An instruction depends on completion of data access by a previous instruction

add $s0, $t0, $t1
sub $t2, $s0, $t3

SLIDE 14

Forwarding (aka Bypassing)

  • Use result when it is computed
  • Don’t wait for it to be stored in a register
  • Requires extra connections in the datapath

SLIDE 15

Load-Use Data Hazard

  • Can’t always avoid stalls by forwarding
  • If value not computed when needed
  • Can’t forward backward in time!

SLIDE 16

Code Scheduling to Avoid Stalls

  • Reorder code to avoid use of load result in the next instruction

MIPS assembly code for A = B + E; C = B + F;

Original order (13 cycles):

lw  $t1, 0($t0)
lw  $t2, 4($t0)
    (stall)
add $t3, $t1, $t2
sw  $t3, 12($t0)
lw  $t4, 8($t0)
    (stall)
add $t5, $t1, $t4
sw  $t5, 16($t0)

Reordered (11 cycles):

lw  $t1, 0($t0)
lw  $t2, 4($t0)
lw  $t4, 8($t0)
add $t3, $t1, $t2
sw  $t3, 12($t0)
add $t5, $t1, $t4
sw  $t5, 16($t0)
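The two cycle counts can be reproduced with a small counting model. This is a sketch under stated assumptions, not the lecture's method: it assumes a 5-stage pipeline with full forwarding, so the only stall is one cycle when an instruction uses the result of the `lw` immediately before it. The helper names `parse` and `count_cycles` are hypothetical, and the parser only handles the instruction forms used in this example.

```python
import re

def parse(line):
    """Return (opcode, destination, sources) for lw, sw and R-format forms."""
    op, rest = line.split(None, 1)
    regs = re.findall(r"\$\w+", rest)
    if op == "sw":                  # sw rt, off(rs): writes no register
        return op, None, regs
    return op, regs[0], regs[1:]    # lw rt, off(rs) or op rd, rs, rt

def count_cycles(program, stages=5):
    """Cycles to run the program, counting one stall per load-use pair."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = parse(prev)
        _, _, cur_srcs = parse(cur)
        if prev_op == "lw" and prev_dest in cur_srcs:
            stalls += 1
    return len(program) + (stages - 1) + stalls

original = ["lw $t1, 0($t0)", "lw $t2, 4($t0)", "add $t3, $t1, $t2",
            "sw $t3, 12($t0)", "lw $t4, 8($t0)", "add $t5, $t1, $t4",
            "sw $t5, 16($t0)"]
reordered = ["lw $t1, 0($t0)", "lw $t2, 4($t0)", "lw $t4, 8($t0)",
             "add $t3, $t1, $t2", "sw $t3, 12($t0)", "add $t5, $t1, $t4",
             "sw $t5, 16($t0)"]

print(count_cycles(original), count_cycles(reordered))  # 13 11
```

Moving the third load up puts an independent instruction between each load and its first use, which removes both stalls.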

SLIDE 17

Control Hazards

  • Branch determines flow of control
  • Fetching next instruction depends on branch outcome
  • Pipeline can’t always fetch correct instruction
  • Still working on ID stage of branch
  • In MIPS pipeline
  • Need to compare registers and compute target early in the pipeline
  • Add hardware to do it in ID stage (See Sec. 4.8)

SLIDE 18

Stall on Branch

  • Wait until branch outcome determined before fetching next instruction

SLIDE 19

Branch Prediction

  • Longer pipelines can’t readily determine branch outcome early
  • Stall penalty becomes unacceptable
  • Predict outcome of branch
  • Only stall if prediction is wrong
  • In MIPS pipeline
  • Can predict branches not taken
  • Fetch instruction after branch, with no delay

SLIDE 20

MIPS with Predict Not Taken

[Figure: pipeline timing when the prediction is correct and when the prediction is incorrect]

SLIDE 21

More-Realistic Branch Prediction

  • Static branch prediction
  • Based on typical branch behavior
  • Example: loop and if-statement branches
  • Predict backward branches taken
  • Predict forward branches not taken
  • Dynamic branch prediction
  • Hardware measures actual branch behavior
  • e.g., record recent history of each branch
  • Assume future behavior will continue the trend
  • When wrong, stall while re-fetching, and update history
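Dynamic prediction can be sketched with a small per-branch history table. The slides only say that hardware records recent history for each branch; the 2-bit saturating counter used below is one common scheme chosen here as an assumption, and the class name, the initial counter value, and the branch address `0x400` are all illustrative.

```python
class TwoBitPredictor:
    """Per-branch 2-bit saturating counters: 0-1 predict not taken, 2-3 taken."""

    def __init__(self):
        self.counters = {}   # branch address -> counter value in 0..3

    def predict(self, pc):
        # Unseen branches start at 1 (weakly not taken), an arbitrary choice.
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(3, c + 1) if taken else max(0, c - 1)

def mispredictions(predictor, pc, outcomes):
    """Count wrong predictions, updating the history after each branch."""
    wrong = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            wrong += 1
        predictor.update(pc, taken)
    return wrong

# A loop branch: taken 9 times, then not taken on exit; the loop runs twice.
loop_outcomes = ([True] * 9 + [False]) * 2
print(mispredictions(TwoBitPredictor(), 0x400, loop_outcomes))  # 3
```

After warming up, the counter mispredicts only the loop exit; a single not-taken outcome does not flip the prediction for the next pass through the loop.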

SLIDE 22

Pipeline Summary

  • Pipelining improves performance by increasing instruction throughput
  • Executes multiple instructions in parallel
  • Each instruction has the same latency
  • Subject to hazards
  • Structure, data, control
  • Instruction set design affects complexity of pipeline implementation

SLIDE 23

MIPS Pipelined Datapath

SLIDE 24

MIPS Pipelined Datapath

  • Right-to-left flow (MEM and WB stages) leads to hazards

SLIDE 25

Pipeline registers

  • Need registers between stages, to hold information produced in previous cycle

SLIDE 26

IF for Load

SLIDE 27

ID for Load

SLIDE 28

EX for Load

SLIDE 29

MEM for Load

SLIDE 30

WB for Load

wrong register number!

SLIDE 31

Corrected Datapath for Load

(A single-cycle pipeline diagram)

SLIDE 32

Pipeline Operation

  • Cycle-by-cycle flow of instructions through the pipelined datapath
  • “Single-clock-cycle” pipeline diagram
  • Shows pipeline usage in a single cycle
  • Highlight resources used
  • cf. “multi-clock-cycle” diagram
  • Graph of operation over time
  • We’ll look at “single-clock-cycle” diagrams for load

SLIDE 33

Multi-Cycle Pipeline Diagram 1

  • Form showing resource usage over time

SLIDE 34

Multi-Cycle Pipeline Diagram 2

  • Traditional form
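A traditional multi-cycle diagram lists one instruction per row and one clock cycle per column, with each instruction starting one cycle after the previous one. The sketch below generates such a diagram as text; the function name `pipeline_diagram` and the column layout are assumptions made for illustration, and the model ignores stalls. The two instructions are the `add`/`sub` pair from the data hazard example.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def pipeline_diagram(instructions):
    """Return text rows; instruction i occupies stage s in cycle i + s + 1."""
    n_cycles = len(instructions) + len(STAGES) - 1
    width = max(len(instr) for instr in instructions)
    header = " " * width + "".join(f" CC{c:<3}" for c in range(1, n_cycles + 1))
    rows = [header]
    for i, instr in enumerate(instructions):
        cells = [""] * n_cycles
        for s, stage in enumerate(STAGES):
            cells[i + s] = stage          # shift each row right by one cycle
        rows.append(instr.ljust(width) + "".join(f" {c:<5}" for c in cells))
    return rows

rows = pipeline_diagram(["add $s0, $t0, $t1", "sub $t2, $s0, $t3"])
for row in rows:
    print(row)
```

Reading down a column shows which stage each in-flight instruction occupies in that cycle, which is the information a single-clock-cycle diagram presents for one cycle only.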

SLIDE 35

Single-Cycle Pipeline Diagram

  • State of pipeline in a given cycle

SLIDE 36

Pipelined Control (Simplified)

SLIDE 37

Pipelined Control Scheme

  • Control signals derived from instruction
  • As in single-cycle implementation

SLIDE 38

Pipeline Control Values

  • Control signals are conceptually the same as they were in the single-cycle CPU.
  • ALU Control is the same.
  • Main control also unchanged. The slide's table shows the same control signals grouped by pipeline stage.

SLIDE 39

Controlled Pipelined CPU

SLIDE 40

Data Hazards in ALU Instructions

  • Consider this instruction sequence:

sub $2,$1,$3
and $12,$2,$5
or  $13,$6,$2
add $14,$2,$2
sw  $15,100($2)

  • We can resolve hazards with forwarding
  • How do we detect when to forward?

SLIDE 41

Dependencies & Forwarding

SLIDE 42

Detecting the Need to Forward

  • Pass register numbers along pipeline
  • e.g., ID/EX.RegisterRs = register number for Rs sitting in ID/EX pipeline register
  • ALU operand register numbers in EX stage are given by ID/EX.RegisterRs, ID/EX.RegisterRt
  • Data hazards when
  • 1a. EX/MEM.RegisterRd = ID/EX.RegisterRs  (forward from EX/MEM pipeline register)
  • 1b. EX/MEM.RegisterRd = ID/EX.RegisterRt  (forward from EX/MEM pipeline register)
  • 2a. MEM/WB.RegisterRd = ID/EX.RegisterRs  (forward from MEM/WB pipeline register)
  • 2b. MEM/WB.RegisterRd = ID/EX.RegisterRt  (forward from MEM/WB pipeline register)

SLIDE 43

Detecting the Need to Forward 2

  • But only if forwarding instruction will write to a register other than $zero!
  • EX/MEM.RegWrite, MEM/WB.RegWrite
  • EX/MEM.RegisterRd ≠ 0, MEM/WB.RegisterRd ≠ 0

SLIDE 44

Simplified Pipeline w. No Forwarding

SLIDE 45

Simplified Pipeline w. Forwarding Paths

SLIDE 46

Simplified Pipeline w. Forwarding Paths 1

Keep track of register sources/targets for in-flight instructions

SLIDE 47

Simplified Pipeline w. Forwarding Paths 2

Option of routing previously calculated values directly to ALU

SLIDE 48

Simplified Pipeline w. Forwarding Paths 3

Operand forwarding (aka register bypass) controlled by forwarding unit

SLIDE 49

Forwarding Conditions

  • EX hazard
  • if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 10
  • if (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
        and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 10
  • MEM hazard
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01

SLIDE 50

Double Data Hazard

  • Consider the sequence:

add $1,$1,$2
add $1,$1,$3
add $1,$1,$4

  • Both hazards occur
  • Want to use the most recent
  • Revise MEM hazard condition
  • Only fwd if EX hazard condition isn’t true

SLIDE 51

Revised Forwarding Condition

  • MEM hazard
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRs))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRs))
      ForwardA = 01
  • if (MEM/WB.RegWrite and (MEM/WB.RegisterRd ≠ 0)
        and not (EX/MEM.RegWrite and (EX/MEM.RegisterRd ≠ 0)
                 and (EX/MEM.RegisterRd = ID/EX.RegisterRt))
        and (MEM/WB.RegisterRd = ID/EX.RegisterRt))
      ForwardB = 01
  • The "and not (...)" term is the condition for an EX hazard on RegisterRs (first rule) or on RegisterRt (second rule)
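The revised conditions amount to a priority rule: forward from EX/MEM if it holds the register, otherwise from MEM/WB. The sketch below models that rule in software; it is an illustration of the conditions on the slides, not a hardware description, and the `PipeReg` type and `forward_select` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class PipeReg:
    reg_write: bool   # RegWrite control bit carried in the pipeline register
    rd: int           # destination register number (RegisterRd)

def forward_select(ex_mem, mem_wb, src):
    """Return the 2-bit select (ForwardA or ForwardB) for ALU source register src."""
    ex_hazard = ex_mem.reg_write and ex_mem.rd != 0 and ex_mem.rd == src
    mem_hazard = mem_wb.reg_write and mem_wb.rd != 0 and mem_wb.rd == src
    if ex_hazard:
        return 0b10   # take the ALU result from EX/MEM (the most recent value)
    if mem_hazard:
        return 0b01   # only reached when the EX hazard condition is false
    return 0b00       # no forwarding: use the register file value

# Double data hazard: add $1,$1,$4 is in EX while add $1,$1,$3 is in MEM
# and add $1,$1,$2 is in WB. Both older instructions write $1.
ex_mem = PipeReg(reg_write=True, rd=1)
mem_wb = PipeReg(reg_write=True, rd=1)
forward_a = forward_select(ex_mem, mem_wb, src=1)   # Rs = $1
forward_b = forward_select(ex_mem, mem_wb, src=4)   # Rt = $4
print(format(forward_a, "02b"), format(forward_b, "02b"))  # 10 00
```

Checking the EX hazard first is what guarantees the most recent value of `$1` is used when both older instructions write the same register.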

SLIDE 52

Datapath with Forwarding

SLIDE 53

Load-Use Data Hazard

  • Need to stall for one cycle

SLIDE 54

Load-Use Hazard Detection

  • Check when using instruction is decoded in ID stage
  • ALU operand register numbers in ID stage are given by IF/ID.RegisterRs, IF/ID.RegisterRt
  • Load-use hazard when
      ID/EX.MemRead and
        ((ID/EX.RegisterRt = IF/ID.RegisterRs) or
         (ID/EX.RegisterRt = IF/ID.RegisterRt))
  • If detected, stall and insert bubble
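The detection condition is a pure function of four pipeline-register fields, so it can be sketched directly. The function name and argument names below are assumptions made for illustration. The example uses the load-use pair `lw $t2, 4($t0)` followed by `add $t3, $t1, $t2`, with the standard MIPS register numbers `$t1` = 9 and `$t2` = 10.

```python
def load_use_hazard(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when the instruction in ID needs the value a load in EX will produce."""
    return id_ex_mem_read and (id_ex_rt == if_id_rs or id_ex_rt == if_id_rt)

T1, T2 = 9, 10   # MIPS register numbers for $t1 and $t2

# lw $t2, 4($t0) is in EX (MemRead set, Rt = $t2);
# add $t3, $t1, $t2 is in ID (Rs = $t1, Rt = $t2).
stall = load_use_hazard(True, T2, T1, T2)
print(stall)   # True: hold PC and IF/ID, and zero the controls in ID/EX

# If the instruction in EX is not a load, forwarding alone is enough.
print(load_use_hazard(False, T2, T1, T2))   # False
```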

SLIDE 55

How to Stall the Pipeline

  • Force control values in ID/EX register to 0
  • Prevent update of PC and IF/ID register
  • Using instruction is decoded again
  • Following instruction is fetched again
  • 1-cycle stall allows MEM to read data for lw
  • Can subsequently forward to EX stage

SLIDE 56

Stall/Bubble in the Pipeline

  • Stall inserted here

SLIDE 57

Stall/Bubble in the Pipeline

  • Stall inserted here

SLIDE 58

Datapath with Hazard Detection

SLIDE 59

Stalls and Performance

  • Stalls reduce performance
  • But are required to get correct results
  • Compiler can arrange code to avoid hazards and stalls
  • Requires knowledge of the pipeline structure

SLIDE 60

Branch Hazards

  • Determine branch outcome and target as early as possible
  • Move hardware to determine outcome to ID stage
  • Target address adder
  • Register comparator

SLIDE 61

Branch Taken 1

SLIDE 62

Branch Taken 2

SLIDE 63

Data Hazards for Branches 1

  • If a comparison register is a destination of the 2nd or 3rd preceding ALU instruction → can resolve using forwarding

add $1, $2, $3        IF ID EX MEM WB
add $4, $5, $6           IF ID EX MEM WB
…                           IF ID EX MEM WB
beq $1, $4, target             IF ID EX MEM WB

SLIDE 64

Data Hazards for Branches 2

  • If a comparison register is a destination of the preceding ALU instruction or the 2nd preceding load instruction → need 1 stall cycle

lw  $1, addr          IF ID EX MEM WB
add $4, $5, $6           IF ID EX MEM WB
beq $1, $4, target          IF ID ID EX MEM WB    (beq stalled for one cycle in ID)

SLIDE 65

Data Hazards for Branches 3

  • If a comparison register is a destination of the immediately preceding load instruction → need 2 stall cycles

lw  $1, addr          IF ID EX MEM WB
beq $1, $0, target       IF ID ID ID EX MEM WB    (beq stalled for two cycles in ID)
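The three branch cases follow one rule: the branch compares registers in ID, so the producer's result must already exist in the cycle before the branch's ID stage. A sketch of that rule, assuming an ALU result is ready at the end of EX and a load result at the end of MEM; the function and table names are hypothetical.

```python
# Cycles after the producer's IF at which its result exists:
# an ALU result at the end of EX (+2), a load result at the end of MEM (+3).
RESULT_READY = {"alu": 2, "load": 3}

def branch_stall_cycles(producer, distance):
    """Stall cycles for a branch resolved in ID.

    producer: "alu" or "load", the kind of instruction writing the register.
    distance: 1 if the producer immediately precedes the branch, 2 for the
              2nd preceding instruction, and so on.
    """
    # Without stalls the branch is in ID at offset distance + 1, and the
    # result must be ready by the end of the cycle before that.
    return max(0, RESULT_READY[producer] - distance)

for producer, distance in [("alu", 3), ("alu", 2), ("alu", 1),
                           ("load", 2), ("load", 1)]:
    print(producer, distance, branch_stall_cycles(producer, distance))
```

The printed values match the three slides: 0 stalls for the 2nd or 3rd preceding ALU instruction, 1 stall for the preceding ALU instruction or the 2nd preceding load, and 2 stalls for the immediately preceding load.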

SLIDE 66

Exceptions and Interrupts

  • “Unexpected” events requiring change in flow of control
  • Exception
  • Arises within the CPU (e.g., undefined opcode, overflow, syscall, …)
  • Interrupt
  • From an external I/O controller
  • Dealing with them without sacrificing performance is hard

SLIDE 67

Handling Exceptions

  • In MIPS, exceptions managed by a System Control Coprocessor (CP0)
  • Save PC of offending (or interrupted) instruction
  • In MIPS: Exception Program Counter (EPC)
  • Save indication of the problem
  • In MIPS: Cause register
  • We’ll assume 1-bit
  • 0 for undefined opcode, 1 for overflow
  • Jump to handler at 8000 0180 (hex)

SLIDE 68

Handler Actions

  • Read cause, and transfer to relevant handler
  • Determine action required
  • If restartable
  • Take corrective action
  • Use EPC to return to program
  • Otherwise
  • Terminate program
  • Report error using EPC, cause, …

SLIDE 69

Exceptions in a Pipeline

  • Another form of control hazard
  • Consider overflow on add in EX stage
  • add $1, $2, $1
  • Prevent $1 from being clobbered
  • Complete previous instructions
  • Flush add and subsequent instructions
  • Set Cause and EPC register values
  • Transfer control to handler
  • Similar to mispredicted branch
  • Use much of the same hardware
