Kyle C. Hale, Boris Grot, Stephen W. Keckler, December 12, 2009

SLIDE 1

Kyle C. Hale, Boris Grot, Stephen W. Keckler

December 12, 2009

Department of Computer Science The University of Texas at Austin

SLIDE 2

• Static energy consumption due to leakage is threatening to become dominant
• Network constitutes a substantial (up to 36%) portion of total chip energy [Kim et al., ISLPED ’03]
• Abundant on-chip bandwidth is often underutilized due to low injection rates

SLIDE 3

• Background
• Segment Gating
• Methodology
• Evaluation
  • Optimal Gating Scheme
  • Static Scheme with Random Segment Selection
  • Static Scheme with Intelligent Segment Selection
  • Dynamic Gating Scheme
• Future Work

SLIDE 4

• Gated-VDD (power gating) at various granularities
• Power-aware buffer designs [Chen & Peh, ISLPED ’03]
• Slow-Silent VCs (DVFS applied to links, buffers power-gated) [Matsutani et al., NOCS ’08]

SLIDE 5

• Aggressive gating of idle resources
• Link
  • Driver, repeaters
• Router
  • All VC buffers and management logic
  • Xbar ports
    • Line drivers & switching elements
  • Allocators
    • 2-level stateful allocators

SLIDE 6

[Figure: an upstream router and a downstream router, each drawn with N, S, E, W ports and R.C., V.A., and S.A. blocks]

SLIDE 7

• Background
• Segment Gating
• Methodology
• Evaluation
  • Optimal Gating Scheme
  • Static Scheme with Random Segment Selection
  • Static Scheme with Intelligent Segment Selection
  • Dynamic Gating Scheme
• Future Work

SLIDE 8

• Three types of gating schemes
  • Optimal Gating (oracle, no cost)
    • Upper bound on energy savings
  • Static Segment Gating
    • Off-line decision on which segments to gate
  • Dynamic Segment Gating
    • “On-line” decisions based on dynamic workload
• Evaluated via analytical approaches
  • Not simulation based (for now)
  • Effects of contention ignored

SLIDE 9

• No power-down/wake-up overheads (latency, energy)
• Segment shuts down during any period of inactivity and instantly wakes up on demand
• Used as a baseline measurement for static schemes
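The oracle scheme can be illustrated with a few lines of Python. This is only a sketch: the per-cycle activity list (1 = segment carries traffic, 0 = idle) and the function name `leakage_cycles` are assumptions made for illustration, not part of the original analysis.

```python
def leakage_cycles(activity):
    """Return (ungated, oracle) counts of cycles in which a segment leaks.

    activity: hypothetical per-cycle log, truthy when the segment is busy.
    Ungated, the segment leaks every cycle. Under the oracle it leaks only
    while busy, because power-down and wake-up are free and instantaneous.
    """
    ungated = len(activity)
    oracle = sum(1 for busy in activity if busy)
    return ungated, oracle
```

Multiplying either count by a static energy figure per cycle gives the leakage energy; the gap between the two is the upper bound on savings that the other schemes are measured against.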

SLIDE 10

• Objective: turn off a certain number of segments before the workload runs
• Measure impact of gating through its effect on hop counts
• Static analysis tool:
  • Represents mesh as a directed graph
  • Hop counts derived from shortest path lengths between communicating nodes
  • Shortest paths may be longer than minimal Manhattan routes
• Segments turned off via a stochastic process
  • Invariant: full connectivity maintained
  • Take multiple samples to generate a distribution
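A minimal sketch of such a static analysis, under stated assumptions: each segment is modeled as an undirected mesh link (a simplification of the directed graph mentioned above), hop counts come from breadth-first search, and every name here (`mesh_links`, `hop_counts`, `gate_random`, `mean_hops`, `sample_mean_hops`) is hypothetical rather than taken from the authors' tool.

```python
import random
from collections import deque

def mesh_links(k):
    """Undirected links of a k x k mesh; nodes are numbered row-major."""
    links = []
    for r in range(k):
        for c in range(k):
            n = r * k + c
            if c + 1 < k:
                links.append((n, n + 1))
            if r + 1 < k:
                links.append((n, n + k))
    return links

def hop_counts(num_nodes, links, src):
    """BFS hop count from src to each reachable node."""
    adj = {n: [] for n in range(num_nodes)}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def gate_random(k, num_to_gate, seed=0):
    """Gate links in random order, skipping any whose removal disconnects the mesh."""
    rng = random.Random(seed)
    n = k * k
    active = mesh_links(k)
    candidates = active[:]
    rng.shuffle(candidates)
    gated = []
    for link in candidates:
        if len(gated) == num_to_gate:
            break
        trial = [l for l in active if l != link]
        if len(hop_counts(n, trial, 0)) == n:  # invariant: full connectivity
            active = trial
            gated.append(link)
    return active, gated

def mean_hops(k, links):
    """Mean shortest-path hop count over all ordered pairs of distinct nodes."""
    n = k * k
    total = sum(sum(hop_counts(n, links, s).values()) for s in range(n))
    return total / (n * (n - 1))

def sample_mean_hops(k, num_to_gate, samples):
    """One mean hop count per random gating sample, to build a distribution."""
    return [mean_hops(k, gate_random(k, num_to_gate, seed)[0])
            for seed in range(samples)]
```

Calling `sample_mean_hops(8, 20, 10)`, for example, would give ten mean hop counts for a 64-node mesh with 20 links gated; the spread across seeds is the distribution the last bullet refers to.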

SLIDE 11

• Select segments at random to power down
• Limits of static segment gating:
  • 161 segments (links) out of 224 gated in a 64-node mesh
  • A gated segment remains in that state for the rest of the workload
• For certain traffic patterns, a random decision could lead to a bad choice

SLIDE 12

• Random selection ignores characteristics of traffic patterns
• Instead, pick segments based on link utilization
  • For applications with communication regularity
  • Requires us to know the communication pattern a priori
• Two-stage approach
  • Stage 1: Pick segments (links) with utilization zero
    • 92 for bit-complement
    • 100 for transpose
  • Stage 2 (iterative):
    • Turn off least-utilized segment
    • Recompute utilization based on new traffic flow
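The two-stage selection might be sketched as follows. Several things here are assumptions rather than details from the slides: flows are given as hypothetical `(src, dst, packets)` tuples, each flow follows a single BFS shortest path with a fixed tie-break, stage 1 gates every zero-utilization link even if that exceeds the requested count, and the only invariant enforced is that every known flow stays routable.

```python
from collections import deque

def mesh_links(k):
    """Directed links of a k x k mesh (two per neighbouring pair), row-major ids."""
    links = set()
    for r in range(k):
        for c in range(k):
            n = r * k + c
            if c + 1 < k:
                links |= {(n, n + 1), (n + 1, n)}
            if r + 1 < k:
                links |= {(n, n + k), (n + k, n)}
    return links

def shortest_path(links, src, dst):
    """BFS shortest path as a list of directed links, or None if unreachable."""
    adj = {}
    for a, b in sorted(links):
        adj.setdefault(a, []).append(b)
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                queue.append(v)
    if dst not in parent:
        return None
    path, node = [], dst
    while parent[node] is not None:
        path.append((parent[node], node))
        node = parent[node]
    return path[::-1]

def utilization(links, flows):
    """Packets carried per link, or None if some flow can no longer be routed."""
    use = {l: 0 for l in links}
    for src, dst, packets in flows:
        path = shortest_path(links, src, dst)
        if path is None:
            return None
        for l in path:
            use[l] += packets
    return use

def select_segments(links, flows, num_to_gate):
    active = set(links)
    use = utilization(active, flows)
    # Stage 1: gate every link that carries no traffic.
    gated = sorted(l for l, u in use.items() if u == 0)
    active -= set(gated)
    # Stage 2: gate the least-utilized removable link, then recompute.
    while len(gated) < num_to_gate:
        use = utilization(active, flows)
        for link in sorted(active, key=lambda l: (use[l], l)):
            if utilization(active - {link}, flows) is not None:
                active.remove(link)
                gated.append(link)
                break
        else:
            break  # nothing left that can be gated without cutting a flow
    return active, gated
```

Because utilization is recomputed after every removal, traffic displaced from a gated link shows up on its detour before the next candidate is chosen, which is the point of making stage 2 iterative.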

SLIDE 13

• Objective: dynamically gate segments to accommodate a changing workload
• PARSEC traces run through a cycle-accurate network simulator
  • Log each link’s idle/active periods
• Off-line analysis of activity logs
  • Combine with power model
  • Gate idle links
  • Wake up links on demand
  • Ignore contention and wake-up delays
  • Segments must be gated long enough to amortize the energy cost of wake-up & power-down
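As a small illustration of the first analysis step, the sketch below turns a link's activity log into a list of idle-period lengths. The log format (one entry per cycle, 1 = active, 0 = idle) and the name `idle_periods` are assumptions; the simulator's actual log format is not described here.

```python
from itertools import groupby

def idle_periods(activity):
    """Lengths of the maximal idle runs in a per-cycle activity log."""
    return [len(list(run)) for busy, run in groupby(activity) if not busy]
```

For instance, the log `[1, 0, 0, 0, 1, 1, 0, 1, 0, 0]` contains three idle periods of 3, 1, and 2 cycles. Only periods long enough to amortize the power-down and wake-up energy are worth gating.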

SLIDE 14

• Derived using ORION 2.0
• Allocator energy: energy from 1st- & 2nd-level switch and VC allocators combined
• Note: Buffers only account for 55% of leakage energy!

Component      Static (nJ/cycle)   Dynamic (nJ/flit)
Flit buffers   1.5                 7.23
Crossbar       0.491               10.3
Allocators     0.215               0.7
Link           0.556               8.1
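To show how these figures might be combined, here is a hedged sketch. The numbers are copied from the table; the names `POWER_MODEL`, `static_share`, and `segment_energy_nj` are hypothetical, and the accounting (every powered cycle pays all four static costs, every flit pays all four dynamic costs once) is an assumption, since the granularity of the figures is not stated.

```python
# (static nJ/cycle, dynamic nJ/flit), copied from the table
POWER_MODEL = {
    "Flit buffers": (1.5, 7.23),
    "Crossbar": (0.491, 10.3),
    "Allocators": (0.215, 0.7),
    "Link": (0.556, 8.1),
}

def static_share(component):
    """Fraction of total static (leakage) energy attributed to one component."""
    total = sum(static for static, _ in POWER_MODEL.values())
    return POWER_MODEL[component][0] / total

def segment_energy_nj(powered_cycles, flits):
    """Energy in nJ for a segment powered for some cycles and carrying some flits."""
    static = sum(s for s, _ in POWER_MODEL.values()) * powered_cycles
    dynamic = sum(d for _, d in POWER_MODEL.values()) * flits
    return static + dynamic
```

With these values, `static_share("Flit buffers")` comes to roughly 0.54, in line with the note that buffers account for only about 55% of leakage, so gating buffers alone leaves much of the leakage untouched.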

SLIDE 15

Topology             64-node mesh
Channels             128 bits wide, 1 cycle/link
Synthetic workloads  Uniform random, transpose, bit-complement. Each workload comprises 1,000 packets injected by each node
PARSEC traces        Blackscholes, bodytrack, fluidanimate, vips, x264. Sim-medium datasets
Router details       2-stage speculative pipeline, 5 ports, 4 VCs/port, 5 flits/VC
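For reference, the two deterministic synthetic patterns can be written as destination functions. This sketch uses the conventional definitions (bit-complement inverts the node address bits; transpose swaps row and column) and assumes row-major node numbering on an 8x8 mesh; the function names are illustrative.

```python
def bit_complement(src, num_nodes=64):
    """Destination whose address is the bitwise complement of src.

    Assumes num_nodes is a power of two.
    """
    return ~src & (num_nodes - 1)

def transpose(src, k=8):
    """Destination at the mirrored (row, col) position of a k x k mesh."""
    row, col = divmod(src, k)
    return col * k + row
```

Under transpose, nodes on the diagonal map to themselves and inject no network traffic, which is one reason some links see zero utilization under these patterns.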

SLIDE 16

• Background
• Segment Gating
• Methodology
• Evaluation
  • Optimal Gating Scheme
  • Static Scheme with Random Segment Selection
  • Static Scheme with Intelligent Segment Selection
  • Dynamic Gating Scheme
• Future Work

SLIDE 17

[Figure: total energy (µJ) for the uniform random, transpose, and bit-complement workloads, split into dynamic and leakage components]

SLIDE 18

• ~20 edges can be removed with a negligible hop count increase
• 2-4x increase in hop count with the maximum number of segments gated

[Figure: hop count (mean, max, min) versus number of gated segments]

SLIDE 19

[Figure: total energy (µJ), split into dynamic and leakage components, versus number of gated segments (0, 40, 80, 120, 161), for series labeled 0.1%, 1.0%, 5.0%, and 10.0%]

SLIDE 20

SLIDE 21

• Two policies for idle period and break-even point
  • Aggressive: 2 cycles idle, 10 cycles to break even
  • Conservative: 10 cycles idle, 50 cycles to break even
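One way to read the two policies is sketched below. The interpretation is an assumption: a link is gated once it has been idle for `idle_wait` cycles, and because the analysis is off-line, the remainder of an idle period is gated only when it lasts at least `break_even` cycles. The names `gated_cycles`, `AGGRESSIVE`, and `CONSERVATIVE` are illustrative.

```python
AGGRESSIVE = {"idle_wait": 2, "break_even": 10}
CONSERVATIVE = {"idle_wait": 10, "break_even": 50}

def gated_cycles(idle_lengths, idle_wait, break_even):
    """Total cycles a link spends gated, given the lengths of its idle periods."""
    gated = 0
    for length in idle_lengths:
        remaining = length - idle_wait  # cycles left once the wait has elapsed
        if remaining >= break_even:     # gate only if the savings amortize the cost
            gated += remaining
    return gated
```

For idle periods of 5, 12, 40, and 200 cycles, `gated_cycles([5, 12, 40, 200], **AGGRESSIVE)` gives 246 gated cycles while the conservative settings give 190, since only the 200-cycle period clears the longer wait and break-even thresholds.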

SLIDE 22

SLIDE 23

• Detailed simulation and analysis
• Consider performance impact and contention
• Other network configurations
• Clearly establish regimes that should use Segment Gating
• Explore application to fault-tolerant systems

SLIDE 24

• Real applications show sparse communication, so the potential for static energy savings is high
• Using link utilization, we can minimize the dynamic energy incurred from gating segments statically
• Aggressive dynamic policy gives us static energy savings for up to 99% of cycles
