Complexity-Effective Issue Queue Design Under Load-Hit Speculation



SLIDE 1

Complexity-Effective Issue Queue Design Under Load-Hit Speculation

Tali Moreshet and R. Iris Bahar Brown University Division of Engineering

SLIDE 2

Brown University

WCED 2002

Motivation

  • Pipelines are getting deeper
      • Higher clock frequencies
      • Increased architectural complexity
  • Speculatively issued instructions are particularly sensitive to pipeline depth
      • Branch prediction
      • Load hit prediction

SLIDE 3

Pipeline

[Figure: pipeline block diagram. Stages: Fetch, Decode, Issue, Execute. Blocks: Instruction Cache, Register Rename Unit, Issue Queue, Register File, Functional Units, Data Cache. Annotations: forwarding path, load resolution loop.]

SLIDE 4

Load Hit Prediction

  • Issue instructions dependent on a load as soon as possible
      • Assume the load hits in DL1

BUT…

  • Load hit status is known only after dependent instructions may already have issued
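
The timing problem above can be made concrete with a small model. This is a minimal sketch under stated assumptions, not the authors' simulator: `issue_to_exec` (pipeline stages between issue and execute), `dl1_hit_latency`, and both function names are hypothetical, and the model assumes the hit/miss status becomes known in the same cycle in which hit data would be forwarded.

```python
def dependent_issue_cycle(load_issue, issue_to_exec, dl1_hit_latency, speculate):
    """Earliest cycle at which an instruction dependent on the load can issue."""
    # Cycle in which hit data (and, by assumption, hit/miss status) is available.
    data_ready = load_issue + issue_to_exec + dl1_hit_latency
    if speculate:
        # Issue early so the dependent reaches execute just as data is forwarded.
        return data_ready - issue_to_exec
    # Without speculation, wait until the hit is confirmed.
    return data_ready


def speculative_window(load_issue, issue_to_exec, dl1_hit_latency):
    """Cycles in which dependents may issue before the load outcome is known."""
    first = dependent_issue_cycle(load_issue, issue_to_exec, dl1_hit_latency, True)
    resolved = load_issue + issue_to_exec + dl1_hit_latency
    return range(first, resolved)
```

In this model the window is `issue_to_exec` cycles long, so it grows as the pipeline between issue and execute gets deeper.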

SLIDE 5

Example

[Figure: pipeline timing diagram over cycles 1 to 8 showing the Issue and Exec stages of LOAD, MULT, SUB, and ADD, with the speculative window marked.]

SLIDE 6

Example

[Figure: pipeline timing diagram over cycles 1 to 9 showing the Issue and Exec stages of LOAD, MULT, SUB, and ADD, with the speculative window marked.]

SLIDE 7

Example

[Figure: pipeline timing diagram over cycles 1 to 10 showing the Issue and Exec stages of LOAD, MULT, SUB, and ADD, with the speculative window marked.]

SLIDE 8

What Happens On a Load Miss?

  • Re-issue the instructions in the speculative window after a load miss
  • Keep post-issue instructions in the issue queue long enough to ensure that re-issuing will not be necessary
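
A rough feel for the cost of holding post-issue instructions can be had from Little's law (average entries = arrival rate × time held). The sketch below is a back-of-the-envelope illustration, not a result from the talk: the function names, the example issue rate, and the assumption that every issued instruction is held for `issue_to_exec + dl1_hit_latency` cycles are all hypothetical.

```python
def hold_cycles(issue_to_exec, dl1_hit_latency):
    """Assumed time an issued instruction stays in the queue: until a load
    issued in the same cycle would have resolved as a hit or a miss."""
    return issue_to_exec + dl1_hit_latency


def post_issue_occupancy(issue_rate, cycles_held):
    """Little's law: average number of post-issue entries in the queue."""
    return issue_rate * cycles_held
```

For example, at an assumed sustained issue rate of 2 instructions per cycle and a 2-cycle DL1 hit latency, `post_issue_occupancy(2.0, hold_cycles(1, 2))` gives 6 entries, while `post_issue_occupancy(2.0, hold_cycles(7, 2))` gives 18.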

SLIDE 9

Complexity-Effective Load Hit Speculation

  • As pipeline depth increases:
      • Retain performance benefit
      • Consider complexity of re-issue and prediction policies
      • Consider impact on issue queue design

SLIDE 10

Re-Issue Policies

  • 4 different load hit speculation policies:
      1) No load hit speculation
      2) Perfect load hit speculation
      3) Replay only instructions dependent on the load that missed
      4) Replay all instructions in the speculative window
  • Load hit/miss predictor to limit re-issuing
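
The difference between policies 3 and 4 can be sketched as follows. This is an illustrative model, not the authors' implementation: the `Instr` record, the register names, and the dependences among MULT, SUB, and ADD are hypothetical, and destination registers are assumed to be renamed (unique within the window).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    dest: str      # renamed destination register (assumed unique)
    srcs: tuple    # source registers


def replay_all(window):
    """Policy 4: replay every instruction issued in the speculative window."""
    return [i.name for i in window]


def replay_dependents(window, missed_dest):
    """Policy 3: replay only instructions that depend, directly or
    transitively, on the destination register of the load that missed."""
    tainted = {missed_dest}
    replay = []
    for instr in window:  # program order
        if any(src in tainted for src in instr.srcs):
            replay.append(instr.name)
            tainted.add(instr.dest)  # its result is now invalid too
    return replay


# Hypothetical window: the missed LOAD writes r1.
window = [
    Instr("MULT", "r2", ("r1", "r3")),  # depends on the load
    Instr("SUB", "r4", ("r5", "r6")),   # independent
    Instr("ADD", "r7", ("r2", "r4")),   # depends on MULT, hence on the load
]
```

With this window, `replay_dependents(window, "r1")` selects MULT and ADD, while `replay_all(window)` also replays the independent SUB. Policy 4 needs no dependence tracking but re-issues more instructions.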

SLIDE 11

Performance Impact

[Figure: bar chart. Y-axis: Performance Increase from No Load Speculation, -5% to 45%. X-axis: Exe1, Exe3, Exe5, Exe7. Series: Perfect_Int, Dep_Int, Dep_Pred_Int, Seq_Int, Seq_Pred_Int, Perfect_FP, Dep_FP, Dep_Pred_FP, Seq_FP, Seq_Pred_FP.]

SLIDE 12

Impact on Issue Queue Occupancy

[Figure: bar chart. Axis: Average Number of Instructions in the Issue Queue, 5 to 40, split into pre-issue and post-issue. Bars: No Load Speculation (Integer Benchmarks), No Load Speculation (Floating Point Benchmarks), Dependent Load Speculation (Integer Benchmarks), Dependent Load Speculation (Floating Point Benchmarks).]

SLIDE 13

Impact on Issue Queue Occupancy

[Figure: chart. Y-axis: Percentage of Post-Issue Instructions in the Issue Queue, 0% to 70%. X-axis: Exe1, Exe3, Exe5, Exe7. Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

SLIDE 14

Impact on Issue Queue Occupancy

As pipeline depth increases:

  • Issue queue gets cluttered with post-issue instructions (average 55%)
  • Limits the available ILP
  • Inefficient use of complexity in instruction bid/grant arbitration logic

SLIDE 15

The Bid / Grant Loop

[Figure: the bid/grant loop. Entries of the M-entry issue queue send requests (req) to bid for an issue slot; the N-wide Prioritize & Select logic broadcasts grants back to the selected entries.]
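
The bid/grant loop can be sketched as a selection over the queue entries. This is a simplified illustration: the oldest-first priority, the dictionary fields `ready` and `issued`, and the function name `select` are assumptions, since the slide does not specify the priority scheme.

```python
def select(entries, width):
    """One bid/grant cycle over an issue queue.

    entries: list of dicts in age order (oldest first), each with boolean
             'ready' and 'issued' flags.
    width:   N, the maximum number of grants per cycle.
    Returns the indices of the granted entries and marks them as issued.
    """
    # Bid: only ready, not-yet-issued entries raise a request.
    requests = [idx for idx, e in enumerate(entries)
                if e["ready"] and not e["issued"]]
    # Grant: assumed oldest-first priority, at most `width` grants.
    grants = requests[:width]
    for idx in grants:
        entries[idx]["issued"] = True  # entry stays in the queue as post-issue
    return grants
```

Granted entries remain in `entries` as post-issue instructions: the select logic still spans all M entries, although those entries no longer bid.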

SLIDE 16

Issue Queue Utilization Problem

  • Complexity of bid/grant arbitration logic increases with size of the IQ
  • IQ consists largely of post-issue instructions
      • Limiting the available ILP that a large IQ is supposed to provide
  • Not a complexity-effective design

SLIDE 17

IQ Design Options

  • Increase the IQ size
      ☺ Improve performance – increase available ILP
      ☹ Increase complexity
  • Simplify arbitration logic – use slower circuitry
      ☺ Reduce complexity
      ☹ Hurt performance
  • Reduce IQ size
      ☺ Reduce complexity
      ☹ Hurt performance

SLIDE 18

Double Latency of Issue Queue

[Figure: chart. Y-axis: Performance Increase From a 64 Entry Issue Queue, Dependent Load Speculation, -70% to 0%. X-axis: Exe1, Exe3, Exe5, Exe7. Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

SLIDE 19

Smaller IQ (48 Entry)

[Figure: chart. Y-axis: Performance Increase From a 64 Entry Issue Queue, Dependent Load Speculation, -25% to 5%. X-axis: Exe1, Exe3, Exe5, Exe7. Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

SLIDE 20

Complexity-Effective Issue Queue

Goal

  • Reduce complexity
  • Do not degrade performance

Solution: The Dual Issue Queue

  • Move post-issue instructions from main queue to separate replay queue
      • Increase available ILP
      • Reduce size of main IQ

SLIDE 21

Dual Issue Queue

[Figure: dual issue queue block diagram. Blocks: Register Rename Unit, Main Issue Queue (MIQ), Replay Issue Queue (RIQ), Register File, Functional Units, Data Cache. Labels: from Fetch unit, Replay_req.]
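
The dual-queue idea can be sketched in a few lines. This is a deliberately simplified model, not the design evaluated in the talk: the class and method names are hypothetical, instructions issue in order without readiness checks, and a single outstanding load is assumed so that one hit/miss outcome covers everything in the replay queue.

```python
from collections import deque


class DualIssueQueue:
    def __init__(self, miq_size, riq_size):
        self.miq_size = miq_size
        self.riq_size = riq_size
        self.miq = deque()  # main issue queue: pre-issue instructions
        self.riq = deque()  # replay issue queue: post-issue instructions

    def dispatch(self, instr):
        """Insert a renamed instruction into the MIQ; False if it is full."""
        if len(self.miq) >= self.miq_size:
            return False
        self.miq.append(instr)
        return True

    def issue(self, width):
        """Issue up to `width` instructions, moving each from MIQ to RIQ."""
        issued = []
        while self.miq and len(issued) < width and len(self.riq) < self.riq_size:
            instr = self.miq.popleft()
            self.riq.append(instr)  # held here in case a replay is needed
            issued.append(instr)
        return issued

    def resolve(self, hit):
        """Load outcome known: on a hit, free the RIQ; on a miss, return the
        held instructions as replay requests (they stay in the RIQ)."""
        if hit:
            self.riq.clear()
            return []
        return list(self.riq)
```

Because issued instructions leave the MIQ immediately, its entries are available to new pre-issue instructions, which is how a smaller main queue can expose as much ILP as a larger single queue.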

SLIDE 22

Dual Issue Queue Performance

[Figure: chart. Y-axis: Performance Increase From Standard Issue Queue, Dependent Load Speculation, -8% to 10%. X-axis: Exe1, Exe3, Exe5, Exe7. Series: compress, ijpeg, bzip, Int_avg, apsi, swim, art, wupwise, FP_avg.]

SLIDE 23

Conclusion

  • Load hit speculation is critical for high performance in deeper pipelines
  • Larger percentage of post-issue instructions in issue queue
  • Complexity-effective issue queue scheme addresses utilization problem
  • For deepest pipelines, overall performance improves while reducing complexity of IQ