Hardware-Software Codesign: 11. Thermal-Aware Design (Iuliana Bacivarov & Lothar Thiele)

SLIDE 1

Swiss Federal Institute of Technology, Computer Engineering and Networks Laboratory

Hardware-Software Codesign

11. Thermal-Aware Design

Iuliana Bacivarov & Lothar Thiele

SLIDE 2

Contents

  • Why is it important to consider temperature in system design?
  • Power and temperature models
  • Thermal simulation
  • Thermal-aware scheduling

SLIDE 3

Power/Thermal Wall

The power/thermal wall is recognized as the most significant barrier to high performance.

SLIDE 4

Multi-Cores Face the Power/Thermal Wall Too

[Figure: 72-core Intel Xeon Phi platform]
[Figure source: Loh, 3D-Stacked Memory Architectures for Multi-Core Processors, 2008]

SLIDE 5

Some Solutions

VLSI design and cooling solutions

  • Thermal-aware design, materials, reduced leakage and switching, ...
  • Better heat sinks, fans, air cooling, liquid cooling

Thermal management

  • Voltage/frequency scaling
  • Stop-go execution: completely turn off components to allow for cooling
  • Migration of tasks from hot to cool areas

[Figure: MJPEG decoder on 25-core processor] [source: Wikipedia]

SLIDE 6

But scheduling of jobs and thermal management techniques affect both timing and thermal properties.

Thermal and performance objectives must be considered simultaneously during design.

SLIDE 7

Some Design Questions

  • How can we consider both timing and temperature simultaneously during design?
  • What is the worst-case peak temperature of the chip?
  • What is an optimal temperature-aware mapping scheme?
  • What are temperature-aware scheduling techniques with low overhead (simple control, no temperature sensors)?

Thermal and performance objectives must be considered simultaneously during design

SLIDE 8

Contents

  • Why is it important to consider temperature in system design?
  • Power and temperature models
  • Thermal simulation
  • Thermal-aware scheduling

SLIDE 9

Single Source Power Model

Frequently used power model for constant voltage (silicon chip, single power source):

$$P(t) = \phi \cdot T(t) + \psi(t)$$

$$P(t) = \begin{cases} P_a(t) = \phi_a \cdot T(t) + \psi_a, & \text{for active processing (including both dynamic and leakage power)} \\ P_i(t) = \phi_i \cdot T(t) + \psi_i, & \text{for idle mode (just leakage power)} \end{cases}$$

[Figure: power as a function of temperature for active processing and idle mode]
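The piecewise linear model can be sketched in a few lines of Python. Only the form P = φ·T + ψ comes from the slide; the coefficient values and the names `MODES` and `power` are hypothetical, chosen so that the numbers are easy to follow.

```python
# Sketch of the linear power model P(t) = phi * T(t) + psi.
# All coefficients are hypothetical illustration values, not lecture data.

MODES = {
    # mode: (phi in W/K, psi in W)
    "active": (0.02, -4.0),  # dynamic + leakage power, P(300 K) = 2.0 W
    "idle": (0.02, -5.9),    # leakage power only,      P(300 K) = 0.1 W
}


def power(temperature_k: float, mode: str) -> float:
    """Return the power in watts drawn at the given temperature and mode."""
    phi, psi = MODES[mode]
    return phi * temperature_k + psi


if __name__ == "__main__":
    for temp in (300.0, 320.0, 340.0):
        print(temp, power(temp, "active"), power(temp, "idle"))
```

In this sketch the gap between the two modes is the constant ψ_a − ψ_i, since both modes share the same temperature slope φ.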

SLIDE 10

Power Model

Dynamic power

  • Independent of temperature
  • Different power consumption for every code segment
  • Separate power consumption for each component (core, cache, memory, …)

Leakage power

  • Independent of the load
  • Depends on the current temperature of the component
  • Model [Skadron et al. 2004]: $P_{Leak} \sim T^2 \cdot e^{C/T}$
  • For the remaining lecture: we use a linear approximation.

$$P(t) = \phi \cdot T(t) + \psi(t)$$

$$P(t) = \begin{cases} P_a(t) = \phi_a \cdot T(t) + \psi_a, & \text{for active processing (including both dynamic and leakage power)} \\ P_i(t) = \phi_i \cdot T(t) + \psi_i, & \text{for idle mode (just leakage power)} \end{cases}$$

SLIDE 11

Static Power / Dynamic Power Ratio

SLIDE 12

Static Power and Dynamic Power

Dynamic power consumption depends on:

  • Total capacity
  • Supply voltage
  • Clock frequency
  • Switching activity: a factor between 0 and 1

Static power consumption:

  • 20% or more in the sub-micron era
  • Mostly leakage, i.e., the power dissipated by a transistor whose gate is intended to be off
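The formula on this slide did not survive extraction. The four quantities listed for dynamic power match the standard CMOS relation P_dyn = α · C · V² · f, which the sketch below assumes. The function name, argument names, and example values are hypothetical.

```python
# Standard CMOS dynamic power relation, assumed here because the slide's
# own formula is missing: P_dyn = alpha * C * V^2 * f.


def dynamic_power(alpha: float, capacitance_f: float,
                  supply_v: float, clock_hz: float) -> float:
    """Return the dynamic power in watts.

    alpha         switching activity, between 0 and 1
    capacitance_f total switched capacitance in farads
    supply_v      supply voltage in volts
    clock_hz      clock frequency in hertz
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return alpha * capacitance_f * supply_v ** 2 * clock_hz


if __name__ == "__main__":
    # Hypothetical chip: 1 nF switched capacitance, 1 V, 1 GHz, 20% activity.
    print(dynamic_power(0.2, 1e-9, 1.0, 1e9))
    # Halving the supply voltage alone quarters the dynamic power.
    print(dynamic_power(0.2, 1e-9, 0.5, 1e9))
```

The quadratic dependence on supply voltage is why voltage/frequency scaling is effective as a thermal management technique.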

SLIDE 13

Single Power Source Model

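Electrical equivalent of the thermal model (silicon chip with cooling, single power source):

  • Current I ≅ power P
  • Voltage V ≅ temperature T
  • Source voltage V0 ≅ environment temperature Tamb
  • C: thermal capacity
  • G: thermal conductance

Power parameters φ and ψ:

$$P(t) = \phi \cdot T(t) + \psi(t)$$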

SLIDE 14

Solution of the Thermal Equation

Explicit solution

Steady state temperature
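The formulas on this slide were lost in extraction. As a hedged reconstruction, the sketch below assumes the usual first-order RC equation behind the electrical analogy, C · dT/dt = P(t) − G · (T(t) − Tamb), together with the linear power model P = φ·T + ψ and a constant ψ. Under those assumptions the steady state is T∞ = (ψ + G·Tamb) / (G − φ) and the explicit solution is T(t) = T∞ + (T(0) − T∞) · exp(−(G − φ)·t / C). All parameter values are hypothetical.

```python
import math

# Hypothetical parameters, not taken from the lecture.
PHI = 0.02     # W/K, temperature slope of the power model
PSI = -4.0     # W,   offset of the power model (active mode)
G = 0.5        # W/K, thermal conductance
C = 0.05       # J/K, thermal capacity
T_AMB = 300.0  # K,   environment temperature


def steady_state(phi, psi, g, t_amb):
    """Steady-state temperature of the assumed model (requires g > phi)."""
    if g <= phi:
        raise ValueError("no stable steady state: g must exceed phi")
    return (psi + g * t_amb) / (g - phi)


def temperature(t, t0, phi, psi, g, c, t_amb):
    """Explicit solution T(t) for constant psi, starting from T(0) = t0."""
    t_inf = steady_state(phi, psi, g, t_amb)
    return t_inf + (t0 - t_inf) * math.exp(-(g - phi) * t / c)


if __name__ == "__main__":
    print(steady_state(PHI, PSI, G, T_AMB))
    for t in (0.0, 0.1, 0.5):
        print(t, temperature(t, T_AMB, PHI, PSI, G, C, T_AMB))
```

The condition G > φ matters: if the leakage slope φ reached the conductance G, the assumed model would have no finite steady state (thermal runaway).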

SLIDE 15

Temperature Profile

  • Active state: dynamic and static (leakage) power
  • Temperature increase: based on linear thermal model

[Figure: task execution schedule and resulting temperature]

SLIDE 16

Temperature Profile

  • Idle state: static (leakage) power
  • Temperature decrease: based on linear thermal model

[Figure: task execution schedule and resulting temperature]

SLIDE 17

Temperature Profile

  • Peak temperature

[Figure: task execution schedule and resulting temperature, with the peak marked]
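A temperature profile like the one on these slides can be sketched by chaining the closed-form response of a first-order model over the segments of a schedule. The sketch assumes the RC equation C · dT/dt = P − G · (T − Tamb) with P = φ·T + ψ; the schedule, the parameter values, and the function names are hypothetical. Within one segment the temperature moves monotonically toward that segment's steady state, so the peak lies at a segment boundary.

```python
import math


def advance(t0, duration, phi, psi, g, c, t_amb):
    """Temperature after `duration` seconds in one mode, starting at t0."""
    t_inf = (psi + g * t_amb) / (g - phi)
    return t_inf + (t0 - t_inf) * math.exp(-(g - phi) * duration / c)


def temperature_profile(schedule, modes, t_start, g, c, t_amb):
    """Return the temperatures at the boundaries of the schedule segments."""
    temps = [t_start]
    for mode, duration in schedule:
        phi, psi = modes[mode]
        temps.append(advance(temps[-1], duration, phi, psi, g, c, t_amb))
    return temps


# Hypothetical values: (phi in W/K, psi in W) per mode, durations in seconds.
MODES = {"active": (0.02, -4.0), "idle": (0.02, -5.9)}
SCHEDULE = [("active", 0.2), ("idle", 0.1), ("active", 0.3)]

profile = temperature_profile(SCHEDULE, MODES, 300.0, g=0.5, c=0.05, t_amb=300.0)
peak = max(profile)

if __name__ == "__main__":
    print(profile)
    print("peak temperature [K]:", peak)
```

With these numbers the temperature rises during the first active segment, falls during the idle segment, and reaches its peak at the end of the second active segment.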

SLIDE 18

Multi Source Models

[Figure: layout and RC equivalent model] [Barcella et al., U. Virginia]

$$\frac{dT(t)}{dt} = A \cdot T(t) + B \cdot u(t)$$

  • A and B are matrices
  • T is an N-dimensional temperature vector
  • u is the input vector
SLIDE 19

Multi Source Models – Solution

Explicit solution:

  • H is the impulse response matrix: Hij(t) is the impulse response between power injected at source j and temperature variation at location i
  • A, H, and B are matrices
  • T is an N-dimensional temperature vector

SLIDE 20

Multi-Core Effect

[Figure: temperature responses over time. Self-heating effect: no delay. Neighboring effects #1, #2, and #3: delayed.]

SLIDE 21

The Impulse Response

  • Temperature rises with power at the same location (without delay)
  • Temperature rises with power at some other location after a delay

SLIDE 22

Multi-Core Effect – Heat Transfer (I)

Thermal model, with:

  • u(t) = input vector
  • C = thermal capacitance matrix
  • G = thermal conductance matrix
  • K = thermal ground conductance matrix
  • P = power dissipation vector
  • Tamb = ambient temperature vector, Tamb = Tamb · [1, . . . , 1]'

Power dissipated by component l in 'active' (a) and 'idle' (i) processing modes
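The model equations on this slide were lost in extraction. A common formulation that uses exactly these symbols is C · dT/dt = (P + K·Tamb) − (G + K)·T with P = φ·T + ψ, which gives the state-space form dT/dt = A·T + B·u with A = −C⁻¹·(G + K − φ), B = C⁻¹, and u = ψ + K·Tamb. The sketch below assumes that formulation, with diagonal C, K, and φ, and builds A, B, and u for a hypothetical two-core chip. The function name and all values are illustrative.

```python
# Assumed formulation (not confirmed by the extracted slide):
#   C * dT/dt = (P + K * T_amb) - (G + K) * T,   P = phi * T + psi
# which yields dT/dt = A * T + B * u with
#   A = -C^-1 (G + K - phi),  B = C^-1,  u = psi + K * T_amb.


def build_model(c, g, k, phi, psi, t_amb):
    """Build A, B, u from diagonal c, k, phi (lists) and a full matrix g."""
    n = len(c)
    a = [[0.0] * n for _ in range(n)]
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            diag = (k[i] - phi[i]) if i == j else 0.0
            a[i][j] = -(g[i][j] + diag) / c[i]
        b[i][i] = 1.0 / c[i]
    u = [psi[i] + k[i] * t_amb for i in range(n)]
    return a, b, u


if __name__ == "__main__":
    # Hypothetical two-core chip: core 0 active, core 1 idle.
    A, B, U = build_model(
        c=[0.05, 0.05],                  # J/K
        g=[[0.2, -0.2], [-0.2, 0.2]],    # W/K, coupling between the cores
        k=[0.5, 0.5],                    # W/K, conductance to ambient
        phi=[0.02, 0.02],                # W/K
        psi=[-4.0, -5.9],                # W
        t_amb=300.0,                     # K
    )
    print(A, B, U)
```

The off-diagonal entries of A are what produce the neighboring effect: power dissipated in one core raises the temperature of the other.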

SLIDE 23

Multi-Core Effect – Heat Transfer (II)

Closed-form solution of the temperature

  • Hkl = impulse response between nodes l and k
  • Hkk = self-impulse response

SLIDE 24

Multi-Core Effect – Heat Transfer (III)

Closed-form solution of the temperature

  • Temperature of node k
  • Convolution between the impulse response Hkl and the input ul
  • Workload of component l
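The convolution can be approximated numerically. The sketch below evaluates the input-driven part of the node temperature, the sum over l of the integral of Hkl(t − τ) · ul(τ) over τ from 0 to t, with a midpoint rule. It treats temperatures as a rise above the initial state, which is an assumption of this example. The impulse responses are hypothetical: the self-response starts at its maximum (no delay), while the neighbor response starts at zero and builds up (delayed).

```python
import math


def node_temperature(k, t, impulse, inputs, steps=2000):
    """Approximate sum_l integral_0^t H_kl(t - tau) * u_l(tau) dtau."""
    d_tau = t / steps
    total = 0.0
    for l, u_l in enumerate(inputs):
        for n in range(steps):
            tau = (n + 0.5) * d_tau  # midpoint of the n-th sub-interval
            total += impulse(k, l, t - tau) * u_l(tau) * d_tau
    return total


def example_impulse(k, l, t):
    """Hypothetical impulse responses for a two-node system."""
    if k == l:
        return math.exp(-2.0 * t)                      # self-heating: no delay
    return 0.5 * (math.exp(-t) - math.exp(-2.0 * t))   # neighbor: delayed


if __name__ == "__main__":
    # Node 0 dissipates constant unit input, node 1 dissipates nothing.
    inputs = [lambda tau: 1.0, lambda tau: 0.0]
    print(node_temperature(0, 1.0, example_impulse, inputs))  # self-heating
    print(node_temperature(1, 1.0, example_impulse, inputs))  # neighbor effect
```

For the self-heating case the integral has the closed form (1 − e^(−2t)) / 2, which gives a simple check on the numerical result.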

SLIDE 25

Multi-Core Effect

self-impulse response Hkk(τ-t)

SLIDE 26

Solving the Differential Equations

What's happening in numerical simulations?

Simulators

  • HotSpot http://lava.cs.virginia.edu/HotSpot/index.htm
  • 3D-Ice http://esl.epfl.ch/3d-ice.html

$$\frac{dT(t)}{dt} = A \cdot T(t) + B \cdot u(t)$$

$$\frac{\Delta T(t)}{\Delta t} = A \cdot T(t_{k-1}) + B \cdot u(t_{k-1}), \qquad \Delta T(t) = T(t_k) - T(t_{k-1})$$

$$T(t_k) = T(t_{k-1}) + \left[ A \cdot T(t_{k-1}) + B \cdot u(t_{k-1}) \right] \cdot \Delta t$$

$$T(t_0) = T_{amb}$$

  • $T(t_k)$: temperature of interest
  • $\Delta t$: constant time interval
  • With P(t) = P = const. for 0 ≤ t ≤ Δt (and therefore u(t) = const.)
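The update rule above is the forward Euler method. A minimal pure-Python sketch follows; the two-node matrices and the input are hypothetical, and temperatures are expressed as a rise above ambient so that the initial state T(t0) = Tamb becomes the zero vector (an assumption of this example, not something the slide states).

```python
def euler_simulate(a, b, u, t_init, dt, steps):
    """Forward Euler: T(t_k) = T(t_{k-1}) + [A T(t_{k-1}) + B u(t_{k-1})] * dt.

    a, b   matrices as nested lists
    u      function mapping time to the input vector
    t_init initial temperature vector
    """
    n = len(t_init)
    temps = [list(t_init)]
    for step in range(steps):
        prev = temps[-1]
        u_prev = u(step * dt)
        nxt = []
        for i in range(n):
            rate = sum(a[i][j] * prev[j] for j in range(n))
            rate += sum(b[i][j] * u_prev[j] for j in range(len(u_prev)))
            nxt.append(prev[i] + rate * dt)
        temps.append(nxt)
    return temps


# Hypothetical two-node example: only node 0 dissipates power.
A = [[-3.0, 1.0], [1.0, -3.0]]
B = [[1.0, 0.0], [0.0, 1.0]]


def constant_input(_t):
    return [2.0, 0.0]


trace = euler_simulate(A, B, constant_input, [0.0, 0.0], dt=0.001, steps=10000)

if __name__ == "__main__":
    print(trace[-1])  # approaches the steady state [0.75, 0.25]
```

Node 1 heats up although it dissipates nothing, which is the neighboring effect. Forward Euler needs a time step that is small compared with the fastest time constant of A; otherwise the iteration becomes inaccurate or unstable.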

SLIDE 27

Contents

  • Why is it important to consider temperature in system design?
  • Power and temperature models
  • Thermal simulation
  • Thermal-aware scheduling

SLIDE 28

Thermal Simulation Tool-Chain

[Figure: tool chain with the boxes Application, Power / Performance Simulator (power models of HW components), and Temperature Simulator (modeling of physical structure), plus a plot of temperature [K] over time [s]]

Low-level power/performance simulation/emulation

  • Software: [Benini'05], [Brooks'00]
  • Hardware: [Atienza'07]

Temperature simulation

  • HotSpot: [Huang'06]
  • 3DICE: [Sridhar'10]

There are other possibilities as well, e.g. model identification and reduction.

SLIDE 29

High-Level Power Simulation

Power annotations per code segment:

  • Computing: Δt = 20ms, ΔP = 25mW
  • Reading: Δt = 45ms, ΔP = 32mW
  • Computing: Δt = 80ms, ΔP = 38mW
  • Writing: Δt = 60ms, ΔP = 34mW
  • Computing: Δt = 120ms, ΔP = 41mW

    int fire () {
        /* computing */
        float i = 0;
        float j = 0;

        /* reading */
        read (PORT_IN, &i);

        /* computing */
        j = i*i;
        j += 2;

        /* writing */
        write (PORT_OUT, &j);

        /* computing */
        printf("Wrote: %f\n", j);
        return 0;
    }

Open questions:

  • How do we consider computation, communication, and memory?
  • How do we link power consumption, time, and temperature?
  • How do we consider scheduling?

SLIDE 30

High-Level Power Simulation (I)

Process Model

    procedure FIRE(Process p)
        read(INPUT, size, buf)
        manipulate
        write(OUTPUT, size, buf)
    end procedure

[Figure: Power Consumption]

SLIDE 31

High-Level Power Simulation (II)

Process Model

    procedure FIRE(Process p)
        read(INPUT, size, buf)
        manipulate
        write(OUTPUT, size, buf)
    end procedure

[Figure: Power Consumption]

SLIDE 32

High-Level Power Simulation (III)

Process Model

    procedure FIRE(Process p)
        read(INPUT, size, buf)
        manipulate
        write(OUTPUT, size, buf)
    end procedure

[Figure: Power Consumption]

SLIDE 33

High-Level Power Simulation (IV)

Process Model

    procedure FIRE(Process p)
        read(INPUT, size, buf)
        manipulate
        write(OUTPUT, size, buf)
    end procedure

[Figure: Power Consumption]

SLIDE 34

Thermal Evaluation

  • P(t) = P = const, 0 ≤ t ≤ Δt
  • Δt = const
  • Temperature of interest: T(Δt)
  • Calculate E, F once at the beginning

Steps: Scheduling Creation, Power Annotation, Power Model, Thermal Model

Scheduling creation:

    Time   Tile 1   Tile 2
    5ms    s1,p2    s1,p1
    10ms
    15ms   Idle
    20ms   s2,p2    s1,p3
    25ms

Power annotation:

    Time   Tile 1   Tile 2
    5ms    26mW     29mW
    10ms
    15ms   5mW
    20ms   32mW     23mW
    25ms
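The slide does not define E and F. One plausible reading, assumed here, is the exact discretization of dT/dt = A·T + B·u for an input that is constant over each interval: T(t_k) = E · T(t_{k−1}) + F · u with E = e^(A·Δt) and F = A⁻¹ · (E − I) · B. Both depend only on A, B, and Δt, so they can be computed once before the simulation starts. The sketch below shows the scalar case with hypothetical parameters.

```python
import math


def discretize(a, b, dt):
    """Scalar exact discretization: return (E, F) with T_next = E*T + F*u."""
    e = math.exp(a * dt)
    f = (e - 1.0) / a * b
    return e, f


def simulate(e, f, inputs, t_init):
    """Apply T(t_k) = E * T(t_{k-1}) + F * u_k for each per-interval input."""
    temps = [t_init]
    for u in inputs:
        temps.append(e * temps[-1] + f * u)
    return temps


# Hypothetical single-node parameters (same form as the single-source model).
PHI, G, C, T_AMB = 0.02, 0.5, 0.05, 300.0
A = -(G - PHI) / C          # 1/s
B = 1.0 / C                 # K/J
U_ACTIVE = -4.0 + G * T_AMB  # psi_active + G * T_amb, in W
DT = 0.005                  # 5 ms slots

E, F = discretize(A, B, DT)  # computed once at the beginning
trace = simulate(E, F, [U_ACTIVE] * 40, T_AMB)

if __name__ == "__main__":
    print(E, F)
    print(trace[-1])
```

Unlike the forward Euler update, this step is exact for piecewise-constant inputs regardless of how large Δt is, which is what makes a fast high-level simulation possible.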

SLIDE 35

Model Data

    Entity                    Parameter [Unit]                    Source
    Code segment              Execution time [sec / iteration]    Low-level sim.
                              Power consumption [W]               Low-level sim.
    Communication queue       Token size [bytes / access]         Functional sim.
                              Write rate, Read rate [1]           Functional sim.
    Processing unit           Clock frequency [cycles / sec]      Hardware datasheet
    Architecture floor-plan   Capacitance matrix [J/K]            Low-level phy. sim.
                              Conductivity matrix [W/K]           Low-level phy. sim.

SLIDE 36

Calibration Tool Chain

[Figure: calibration tool chain]

  • Stages: Sample Mappings, Software Synthesis, Low-Level Power/Timing Simulator, Thermal Architecture Analysis
  • Timing parameters: execution trace, timing characterization
  • Thermal parameters: power characterization; thermal platform model (conductivity matrix, capacitance matrix)

SLIDE 37

(High-Level) Abstract Thermal Simulation

[Figure: schedule with Application and Scheduling Overhead segments]

  • Idle task?
  • Store?
  • Restore?

SLIDE 38

Data for High-Level (Abstract) Thermal Simulation

Thermal / timing parameters per code segment:

  • Computing: Δt = 20ms, ΔP = 25mW
  • Reading: Δt = 45ms, ΔP = 32mW
  • Computing: Δt = 80ms, ΔP = 38mW
  • Writing: Δt = 60ms, ΔP = 34mW
  • Computing: Δt = 120ms, ΔP = 41mW

    int fire () {
        /* computing */
        float i;
        float j;

        /* reading */
        read (PORT_IN, &i);

        /* computing */
        j = i*i;
        j += 2;

        /* writing */
        write (PORT_OUT, &j);

        /* computing */
        printf("Wrote: %f\n", j);
        return 0;
    }

Analyze hundreds of design alternatives very quickly!

SLIDE 39

Thermal Simulation Tool Chains

[Figure: two tool chains, each producing a plot of temperature [K] over time [s]]

  • Detailed: Application, Power / Performance Simulator (power models of HW components), Temperature Simulator (modeling of physical structure)
  • Black-box: Application, Architecture, Mapping

SLIDE 40

Contents

  • Why is it important to consider temperature in system design?
  • Power and temperature models
  • Thermal simulation
  • Thermal-aware scheduling

SLIDE 41

Multiple Power States

Power states: trade-off between power consumption and performance

  1. Mobile consumer devices
  2. Server-grade hardware

Source: Dell Power Solutions, Feb. 2007
Source: Windows 7 power management

How to reduce chip temperature without sacrificing performance/timing requirements?

SLIDE 42

Thermal Control Loop

Reactive Speed Scaling (RSS)

  • Speed of the processor is feedback-controlled based on temperature
  • Higher temperature ⇒ lower speed
  • Joint analysis of temperature and timing is complex (but possible)
  • Temperature sensor can be replaced or extended by a load sensor
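A reactive speed scaling loop can be sketched as a feedback controller wrapped around a simple thermal simulation. Everything numeric here is hypothetical: the thresholds, the thermal parameters, and in particular the power model, which assumes temperature-independent power with a dynamic part that grows with the cube of the speed. The thermal step is a forward Euler update of C · dT/dt = P − G · (T − Tamb).

```python
def speed_for(temperature, t_low=303.0, t_high=306.0, s_min=0.5):
    """Map temperature to relative speed: higher temperature, lower speed."""
    if temperature <= t_low:
        return 1.0
    if temperature >= t_high:
        return s_min
    frac = (temperature - t_low) / (t_high - t_low)
    return 1.0 - frac * (1.0 - s_min)


def simulate(controller, steps=2000, dt=0.001, c=0.05, g=0.5,
             t_amb=300.0, p_static=0.1, p_dyn=4.0):
    """Return (peak temperature, processed work) for a speed controller."""
    temp = t_amb
    peak = temp
    work = 0.0
    for _ in range(steps):
        speed = controller(temp)                   # sensor reading -> speed
        power = p_static + p_dyn * speed ** 3      # assumed power model
        temp += dt / c * (power - g * (temp - t_amb))
        peak = max(peak, temp)
        work += speed * dt                         # work done at this speed
    return peak, work


peak_full, work_full = simulate(lambda temp: 1.0)  # always full speed
peak_rss, work_rss = simulate(speed_for)           # reactive speed scaling

if __name__ == "__main__":
    print("full speed: peak", peak_full, "work", work_full)
    print("RSS:        peak", peak_rss, "work", work_rss)
```

The run illustrates the trade-off named on the slide: the controller keeps the peak temperature below its upper threshold, but less work is processed in the same time, so deadlines have to be checked in a joint analysis of temperature and timing.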

SLIDE 43

Summary

Timely behavior is important in embedded systems, such as avionics, automotive, or media processing

  • Tasks must finish execution within specified deadlines

The thermal wall is recognized as a significant barrier to high performance

  • High chip temperatures lead to reliability issues, even higher power consumption, and lower performance

System-level design solutions

  • Use Dynamic Thermal Management (DTM) to reduce chip temperatures; examples: speed scaling, stop-go scheduling, mapping and migration
  • But without sacrificing timing requirements