Design and Architectures for Embedded Systems, Prof. Dr. J. Henkel



slide-1
SLIDE 1
  • J. Henkel, Univ. of Karlsruhe, WS04/05, 2005

http://ces.univ-karlsruhe.de

Design and Architectures for Embedded Systems

Prof. Dr. J. Henkel
CES - Chair for Embedded Systems
University of Karlsruhe, Germany

Today: Designing Low Power Embedded Systems

slide-2
SLIDE 2

Where are we ?

  • Emb. Software

Optimization for:

  • low power
  • Performance
  • Area, …

Embedded Processor Design

  • extens. Instruction
  • Parameterization

Integration Hardware Design

  • synthesis

Middleware, RTOS

  • Scheduling

System specification Design space exploration

  • low power
  • Performance
  • Area

System partitioning

  • models of computation
  • Spec languages

Estimation&Simulation

  • low power
  • performance
  • Area, …

Tape out Prototyping

embedded IP:

  • PEs
  • Memories
  • Communication
  • Peripherals

IC technology

Optimization

  • low power
  • performance
  • Area, …

refine

slide-3
SLIDE 3
(Src: F. Pollack, Intel)

slide-4
SLIDE 4
Low Power/Energy: Reason

  • Portable Systems
      • Notebooks, palm-tops, PDAs, cellular phones, pagers, etc.
      • 32% of PC market, and growing
      • Battery-driven: long battery life crucial
      • System cost, weight limited by batteries
          • 40 W, 10 hrs @ 20-35 W-hr/pound = 7-20 pounds
      • Slow growth in battery technology
      • Must reduce energy drain from batteries
  • Thermal Considerations
      • 10 °C increase in operating temperature => component failure rate doubles
      • Packaging: ceramic vs. plastic
      • Cooling requirements
      • Increasing levels of integration / clock frequencies make the problem worse
          • 10 cm², 500 MHz => 315 W
  • Reliability Issues
      • Electro-migration
      • IR drops on supply lines
      • Inductive effects
      • Tied to peak/average power consumption
  • Environmental Concerns
      • EPA estimate: 80% of office equipment electricity is used in computers
      • "Energy Star" program to recognize power efficient PCs
      • Power management standard for desktops and laptops
      • Drive towards "Green PC"

(Src: A. Raghunathan, NEC)

slide-5
SLIDE 5
Power Consumption and flexibility

[Figure: energy efficiency in operations per watt (MOPS/mW, axis from 0.01 to 10) versus technology generation (1.0µ, 0.5µ, 0.25µ, 0.13µ, 0.07µ) for hardwired ASICs, reconfigurable computing, DSP-ASIPs, and µPs; annotations: "Ambient Intelligence", "poor design techniques"]

(Src: [Marw03])

slide-6
SLIDE 6
Power Sources

Energy Source                                Power/Energy Density
Batteries (Zinc-Air, primary)                1050-1560 mWh/cm3
Batteries (Li, rechargeable)                 300 mWh/cm3
Solar (outdoors)                             15 mW/cm2 (direct sun), 1 mW/cm2 (24 hour avg)
Non-inertial Human Power (shoe inserts)      1.8 mW
Solar (indoors)                              0.006 mW/cm2 (office desk), 0.57 mW/cm2 (<60W desk lamp)
Vibrations                                   0.01-0.1 mW/cm3
Acoustic (noise)                             3e-6 mW/cm2 @ 75dB, 9.6e-4 mW/cm2 @ 100dB
Nuclear Reaction                             80 mW/cm3, 1e+6 mWh/cm3

(Src: A. Raghunathan, NEC)

slide-7
SLIDE 7
Relationship between Power and Energy

[Figure: power P plotted over time t; the energy E is the area under the curve]

$E = \int P \, dt$
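The integral can be approximated from a sampled power trace. The following is a minimal sketch using the trapezoidal rule; the function name and the example trace (0.4 W for 2 s, then 0.05 W for 8 s) are hypothetical and not taken from the lecture.

```python
def energy_joules(times_s, power_w):
    """Trapezoidal approximation of E = integral of P dt over a sampled trace."""
    if len(times_s) != len(power_w):
        raise ValueError("need one power sample per time stamp")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical trace: a step from 0.4 W down to 0.05 W at t = 2 s.
times = [0.0, 2.0, 2.0, 10.0]
power = [0.4, 0.4, 0.05, 0.05]

trace_energy = energy_joules(times, power)                # joules
average_power = trace_energy / (times[-1] - times[0])     # watts
```

The same energy can result from very different power profiles, which is why peak or average power and energy are treated as separate design targets.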

slide-8
SLIDE 8
Power vs. Energy

Minimizing the power consumption is important for

  • the design of the power supply
  • the design of voltage regulators
  • the dimensioning of interconnect
  • short term cooling

Minimizing the energy consumption is important because of

  • limited availability of energy (mobile systems: try to maximize the amount of computation that can be accomplished with a given amount of energy)
      • limited battery capacities (only slowly improving)
      • very high costs of energy (solar panels, in space)
  • cooling
      • high costs
      • limited space
  • dependability
  • long lifetimes, low temperatures

(Src: [Marw03])

slide-9
SLIDE 9
Power by Processor Type

[Figure: side-by-side comparison of a Pentium 4 and a Crusoe processor]

(source: www.transmeta.com)

slide-10
SLIDE 10
Outline

  • System-level Design Methodologies
  • System-level power estimation
  • Low-power Embedded Software
  • System-level Tradeoffs and Power Management

(src: A. Raghunathan, NEC)

slide-11
SLIDE 11

Hardware-Software System Design and Validation Flow

[Figure: design and validation flow starting from the SYSTEM SPEC. Blocks shown: ALGORITHMIC CO-VALIDATION; MAPPING / PARTITIONING into S/W tasks (CPU) and application-specific H/W; ARCHITECTURAL CO-VALIDATION (C-models, HDL models; estimators for timing and power); S/W IMPLEMENTATION and H/W SYNTHESIS (hard cores, soft cores, UDL); ASIC INTEGRATION (bus architecture); DFT & TEST GENERATION; PROTOTYPE VERIFICATION (full timing models, full timing verification; cycle-based ISS and HDL); ASIC PROTOTYPE. Cores: MPU, MCU, DSP, CPU core, ROM/RAM, peripherals, PCI/MPEG, interface, multimedia, telecom/networking, co-processors.]

(src: A. Raghunathan, NEC)

slide-12
SLIDE 12

Embedded Software Implementation and Validation

[Figure: software tasks go through software implementation (mapping tasks to CPUs; multitask scheduling with priority selection; multiprocessor integration with protocols and shared memory) using compiler, assembler, linker, and RTOS; validation uses debugger, emulator, instruction set simulator, and co-simulator together with the H/W; estimators report performance and power]

(src: A. Raghunathan, NEC)

slide-13
SLIDE 13
Outline

  • System-level Design Methodologies
  • System-level power estimation
  • Low-power Embedded Software
  • System-level Tradeoffs and Power Management

(src: A. Raghunathan, NEC)

slide-14
SLIDE 14
System-level power estimation: summary

  • Background
      • Power estimation in HW design
      • Embedded SW power estimation
  • System-level estimation approaches
      • The spreadsheet approach
      • Power state machines
      • HW/SW co-simulation
      • Battery modeling

slide-15
SLIDE 15
HW Power Estimation: Overview

Power = Capacitive Switching Power + Short-Circuit Power + Leakage/Static Power

Capacitive switching power: $\frac{1}{2} \, C_L \, V_{dd}^2 \, A \, f$

Design levels, from highest to lowest abstraction: Behavior level, Register-transfer level, Logic level, Transistor level.

Step between levels                        Power analysis                       Analysis iteration time
High-level synthesis, RTL optimizations    Architecture-level power analysis    seconds - minutes
Logic synthesis                            Logic-level power analysis           minutes - hours
Transistor-level/Layout synthesis          Transistor-level power analysis      hours - days

  • Design iteration times decrease toward the higher levels
  • Power models for macroblocks, control logic (architecture level)
  • Power models for gates, cells, nets (logic level)

(src: A. Raghunathan, NEC)
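The capacitive switching term can be evaluated directly. This is a minimal sketch, assuming A denotes the switching activity factor and f the clock frequency; the numeric values (1 nF switched capacitance, 0.2 activity, 100 MHz) are hypothetical.

```python
def switching_power_w(c_load_f, vdd_v, activity, freq_hz):
    """Capacitive switching power: 1/2 * C_L * Vdd^2 * A * f, in watts."""
    return 0.5 * c_load_f * vdd_v ** 2 * activity * freq_hz


# Hypothetical operating point.
p_nominal = switching_power_w(1e-9, 3.3, 0.2, 100e6)

# Halving Vdd at the same activity and frequency quarters the switching power,
# because the term is quadratic in the supply voltage.
p_scaled = switching_power_w(1e-9, 1.65, 0.2, 100e6)
```

The quadratic dependence on the supply voltage is what makes the variable voltage systems listed later in the outline attractive.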

slide-16
SLIDE 16
Instruction-level SW power modeling

  • Energy consumed = f(Instruction sequence)
  • Model using per-instruction costs, circuit state overhead costs, and penalties for pipeline stalls and cache misses
  • Program energy cost:

    $\text{Program energy cost} = \sum_{I} (Base_I \times N_I) + \sum_{I,J} (Ovhd_{I,J} \times N_{I,J}) + N_{CM} \times Penalty_{CM} + N_{Stall} \times Penalty_{Stall}$

    N_I: Number of times instruction I is executed
    Base_I: Base energy cost of I (ignores stalls, cache misses)
    Ovhd_{I,J}: Circuit state overhead when I, J are adjacent
    Penalty_CM: Cache Miss Penalty
    Penalty_Stall: Pipeline Stall Penalty

  • Circuit state overhead: depends on processor architecture
      • Constant value for 486DX2, Fujitsu SPARClite
      • Table for Fujitsu DSP due to greater variation

[Tiwari94c,Tiwari96]

(src: A. Raghunathan, NEC)
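The cost formula can be sketched as a function over an executed instruction trace. The function name, the dictionary-based cost tables, and all numeric costs below are hypothetical; N_{I,J} is taken here as the number of times J directly follows I in the trace, which is an assumption about how adjacency is counted.

```python
from collections import Counter


def program_energy(trace, base, ovhd, n_cm, penalty_cm, n_stall, penalty_stall,
                   default_ovhd=0.0):
    """Sum of base costs, pairwise circuit-state overheads, and miss/stall penalties."""
    n_i = Counter(trace)                      # N_I: executions per instruction
    n_ij = Counter(zip(trace, trace[1:]))     # N_I,J: adjacent pairs in the trace
    base_term = sum(base[i] * n for i, n in n_i.items())
    ovhd_term = sum(ovhd.get(pair, default_ovhd) * n for pair, n in n_ij.items())
    return base_term + ovhd_term + n_cm * penalty_cm + n_stall * penalty_stall


# Hypothetical cost tables in arbitrary energy units.
trace = ["mov", "add", "mov", "add"]
base = {"mov": 1.0, "add": 2.0}
ovhd = {("mov", "add"): 0.5, ("add", "mov"): 0.25}

energy = program_energy(trace, base, ovhd,
                        n_cm=1, penalty_cm=10.0,
                        n_stall=2, penalty_stall=3.0)
```

For processors where the overhead is modeled as a constant, `ovhd` can stay empty and `default_ovhd` carries the constant.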

slide-17
SLIDE 17
Building instruction-level power models

  • Characterize current drawn by CPU for given instruction sequence
  • Simulation based methods
      • Simulate program execution on HW models of the CPU
  • Physical measurement
      • Digital ammeter
      • Put programs in loops
      • Get stable visual reading
  • Processors: Intel 486DX, Fujitsu SPARClite, Fujitsu DSP

[Figure: current measurement setup: the power supply feeds the CPU through an ammeter (A), separately from the rest of the system; an inset shows the current and clock waveforms over the integration period of the ammeter]

(src: A. Raghunathan, NEC)

[Tiwari94c,Tiwari96]

slide-18
SLIDE 18
Estimation Example: 486DX2

Program                             Cycles   Base Cost (mA)    Block
main: mov bp, sp                    1        285.0             B1
      sub sp, 4                     1        309.0             B1
      mov dx, 0                     1        309.8             B1
      mov word ptr -4[bp], 0        2        404.8             B1
L2:   mov si, word ptr -4[bp]       1        433.4             B2
      add si, si                    1        309.0             B2
      add si, si                    1        309.0             B2
      mov bx, dx                    1        285.0             B2
      mov cx, word ptr _a[si]       1        433.4             B2
      add bx, cx                    1        309.0             B2
      mov si, word ptr _b[si]       1        433.4             B2
      add bx, si                    1        309.0             B2
      mov dx, bx                    1        285.0             B2
      mov di, word ptr -4[bp]       1        433.4             B2
      inc di                        1        297.0             B2
      mov word ptr -4[bp], di       1        560.1             B2
      cmp di, 4                     1        313.1             B2
      jl L2                         3(1)     405.7(356.9)      B2
L1:   mov word ptr _sum, dx         1        521.7             B3
      mov sp, bp                    1        285.0             B3
      jmp main                      3        403.8             B3

Block     Instances
B1        1
B2        4
B3        1
jl L2     3 (taken), 1 (not taken)

Base Cost_PROGRAM = Σ_i Base Cost_BLOCKi * Instances_BLOCKi
Estimated base current = Base Cost_PROGRAM / 72 = 369.0mA
Final estimated current = 369.0 + 15.0 = 384.0mA
Measured current = 385.0mA

  • Similar experiments in 486DX2 and SPARClite accurate to within 3%

[Tiwari94c,Tiwari96]

(src: A. Raghunathan, NEC)
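The estimate can be reproduced from the per-instruction data. The sketch below assumes that each base cost is a per-cycle current, so that an instruction contributes cycles times current, and that the value for `jl L2` in parentheses applies when the branch is not taken. The grouping into B1, B2, and B3 follows the labels main, L2, and L1. The 15.0mA added at the end is the correction used on the slide; it is presumably the circuit state overhead term.

```python
# (cycles, base current in mA) per instruction, in listing order.
B1 = [(1, 285.0), (1, 309.0), (1, 309.8), (2, 404.8)]
B2_BODY = [(1, 433.4), (1, 309.0), (1, 309.0), (1, 285.0), (1, 433.4),
           (1, 309.0), (1, 433.4), (1, 309.0), (1, 285.0), (1, 433.4),
           (1, 297.0), (1, 560.1), (1, 313.1)]
JL_TAKEN = [(3, 405.7)]
JL_NOT_TAKEN = [(1, 356.9)]
B3 = [(1, 521.7), (1, 285.0), (3, 403.8)]

# (instruction group, number of executions)
SCHEDULE = [(B1, 1), (B2_BODY, 4), (JL_TAKEN, 3), (JL_NOT_TAKEN, 1), (B3, 1)]


def cycles(group):
    return sum(c for c, _ in group)


def weighted_cost(group):
    """Cycle-weighted current of a group, in mA * cycles."""
    return sum(c * i for c, i in group)


total_cycles = sum(cycles(g) * n for g, n in SCHEDULE)
program_cost = sum(weighted_cost(g) * n for g, n in SCHEDULE)

base_current_ma = program_cost / total_cycles     # average over all cycles
final_current_ma = base_current_ma + 15.0         # correction taken from the slide
```

Under these assumptions the cycle count comes out at the 72 used as divisor on the slide, and the averaged current lands within rounding distance of the quoted 369.0mA.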

slide-19
SLIDE 19
System-level Power Estimation: The Spreadsheet Approach

  • Easiest to use for off-the-shelf component based designs
  • Problem: Guesswork, does not account for inter-component interactions

Component          Model                                         Parameters                                                   Power
Processor          Pon * fon * %on + Pidle * %idle               Pon: 12.5mW/MHz, fon: 40MHz, Pidle: 10mW, %on: 0.7           303mW
DRAM               Vdd * (Ion * %on + Iidle * %idle)             Vdd: 3.3V, Ion: 24mA, Iidle: 1mA, %on: 0.4                   33.66mW
Flash              Vdd * (Ion * %on + Iidle * %idle)             Vdd: 3.3V, Ion: 9mA, Iidle: 0mA, %on: 0.6                    17.82mW
Audio Amp.         Vsupply * [(Ibias+Idyn)*%on + (Ioff*%off)]    Vsupply: 3.3V, Ibias+Idyn <= 80mA, Ioff: 0.7uA, %on: 0.2     52.8mW
Radio Subsystem    Hierarchical                                                                                               287mW
Clock Gen          From datasheet                                P: 25mW                                                      25mW
DC-DC Converter    Pload * (1-Eff)/Eff                           Eff: 0.94, Pload: 719mW                                      45.91mW
TOTAL                                                                                                                         765.19mW

(src: A. Raghunathan, NEC)

[Lidsky96]
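The spreadsheet can be sketched in a few lines. The helper names are hypothetical. The processor, radio, and clock rows are entered as the values listed on the slide rather than recomputed (evaluating the processor model with the listed parameters gives a different figure, so the listed 303mW is kept), and the audio amplifier is evaluated at its 80mA upper bound.

```python
def duty_cycle_power_mw(vdd_v, i_on_ma, i_idle_ma, frac_on):
    """Vdd * (Ion * %on + Iidle * %idle), with %idle = 1 - %on; V * mA = mW."""
    return vdd_v * (i_on_ma * frac_on + i_idle_ma * (1.0 - frac_on))


def converter_loss_mw(p_load_mw, efficiency):
    """Power dissipated in the DC-DC converter: Pload * (1 - Eff) / Eff."""
    return p_load_mw * (1.0 - efficiency) / efficiency


rows_mw = {
    "Processor": 303.0,                                          # as listed
    "DRAM": duty_cycle_power_mw(3.3, 24.0, 1.0, 0.4),
    "Flash": duty_cycle_power_mw(3.3, 9.0, 0.0, 0.6),
    "Audio Amp.": duty_cycle_power_mw(3.3, 80.0, 0.0007, 0.2),   # Ioff = 0.7uA
    "Radio Subsystem": 287.0,                                    # as listed
    "Clock Gen": 25.0,                                           # as listed
}

# The converter supplies every other component, so its load is their sum.
p_load_mw = sum(rows_mw.values())
rows_mw["DC-DC Converter"] = converter_loss_mw(p_load_mw, 0.94)
total_mw = sum(rows_mw.values())
```

Each row uses a fixed duty cycle, which is the guesswork the slide warns about: the model cannot express that, for example, DRAM activity depends on what the processor is doing.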

slide-20
SLIDE 20
System-level Power Estimation: Power State Machines

[Figure: power state machine of the StrongARM SA-1100]

State    Power      Note
Run      400mW
Idle     50mW       Wait for interrupt
Sleep    0.16mW     Wait for wake-up

Transition       Time
Run -> Idle      10uS
Idle -> Run      10uS
Run -> Sleep     90uS
Idle -> Sleep    90uS
Sleep -> Run     160mS

Applicable to processors, memories, network interfaces, displays, disk drives, …

  • States represent modes of operation, state transitions represent change of mode
  • Annotate states and transitions with power, performance, and event information
  • Use simulation, probabilistic analysis, or statistical techniques for system power estimation

[DeMicheli00]

(src: A. Raghunathan, NEC)
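A power state machine can be evaluated by simulation over a schedule of states. The sketch below uses the SA-1100 state powers from the slide; the assignment of the five transition times to specific state pairs is a reading of the figure, and charging each transition at the higher of the two state powers is an assumption of this sketch, not something the slide states. The schedules are hypothetical.

```python
POWER_MW = {"run": 400.0, "idle": 50.0, "sleep": 0.16}

# Transition latencies in seconds (assignment to state pairs is an assumption).
TRANSITION_S = {
    ("run", "idle"): 10e-6,
    ("idle", "run"): 10e-6,
    ("run", "sleep"): 90e-6,
    ("idle", "sleep"): 90e-6,
    ("sleep", "run"): 160e-3,
}


def schedule_energy_mj(schedule):
    """Energy in mJ for a list of (state, dwell time in seconds) entries."""
    energy = 0.0
    previous = None
    for state, dwell_s in schedule:
        if previous is not None and previous != state:
            latency = TRANSITION_S[(previous, state)]
            # Assumption: a transition draws the higher of the two state powers.
            energy += max(POWER_MW[previous], POWER_MW[state]) * latency
        energy += POWER_MW[state] * dwell_s    # mW * s = mJ
        previous = state
    return energy


# Hypothetical workload: 1 s of work, a 2 s gap, 1 s of work.
via_idle = schedule_energy_mj([("run", 1.0), ("idle", 2.0), ("run", 1.0)])
via_sleep = schedule_energy_mj([("run", 1.0), ("sleep", 2.0), ("run", 1.0)])
```

For this gap length sleeping saves energy despite the long wake-up, but the wake-up adds 160 ms of latency; for short gaps the wake-up cost outweighs the saving, which is the tradeoff a power manager has to decide.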

slide-21
SLIDE 21
Power Estimation through HW/SW Co-Simulation

[Figure: the application program (assembly) runs on an ISS processor model with host memory; a bus functional model connects it to a memory model and an HDL model, with memory & signal synchronization between the simulators]

  • Co-simulation is necessary in order to capture interactions between system components
  • Heterogeneous power estimation techniques for various system components

[Lajolo99,Simunic99a,Lajolo00]

(src: A. Raghunathan, NEC)

slide-22
SLIDE 22

Power Estimation through HW/SW Co-Simulation

[Figure: co-simulation framework based on POLIS/PTOLEMY. Elements shown: SYSTEM SPEC; HW/SW partition; C spec for ISS; SW compiler producing an obj file; ISS; synthesis producing a HW netlist; gate-level and fast RTL power estimators; pre-designed IP libraries and uP/uC cores with delay and energy characteristics. The simulators exchange state, input values (input vectors), breakpoints, and commands, and return cycles and power; sampling and caching are used to speed up estimation. A VISUAL DISPLAY shows BUS energy, SW energy, and HW energy.]

[Lajolo99,Lajolo00]

(src: A. Raghunathan, NEC)

slide-23
SLIDE 23

Power Estimation through HW/SW Co-Simulation

[Figure: tool screenshots showing the SYSTEM SPEC., SW and HW power/energy plots, and BUS ACCESS/CONFLICT WAVEFORMS & BUS POWER]

[Lajolo99,Lajolo00]

(src: A. Raghunathan, NEC)

slide-24
SLIDE 24
Battery life estimation

  • Battery life cannot be determined from total energy (or average power) alone
      • Discharge rate effect
      • Recovery effect
  • Need energy consumption waveforms
  • Co-simulate battery model together with system

[Figure: battery efficiency (axis from 20 to 120) versus normalized discharge current (0.5, 1, 2, 3)]

(src: A. Raghunathan, NEC)

[Pedram99,Chiasserini99,Simunic99b,Martin99,Benini00,Panigrahi01]

slide-25
SLIDE 25
Outline

  • System-level Design Methodologies
  • System-level power estimation
  • Low-power Embedded Software
  • System-level Tradeoffs and Power Management

slide-26
SLIDE 26

Low-Power Software: Summary

[Figure: software tool flow from Source to Target Memory Image: target-independent optimizations, code generation (with a target architecture model), assembler/linker, libraries, and system software (RTOS, device drivers, …); validation through instruction-level power model, ISS, debugger, co-simulator, and HW]

Power efficient Source Code

Low-power compilers:

  • transformations
  • code generation
  • memory layout
  • code compression

Low-power OS, middleware

  • power management
  • voltage/clock speed scheduling

(src: A. Raghunathan, NEC)

slide-27
SLIDE 27
Low-power compilers

  • Use instruction-level energy costs to guide code generation
  • Minimize memory accesses
      • Utilize registers effectively
      • Reduce context saving
  • Processor-specific optimizations
      • Dual memory loads, instruction packing
  • Optimize instruction scheduling to reduce activity in specific parts of the system
      • Internal instruction-bus, processor-memory bus, instruction register and register decoder

(src: A. Raghunathan, NEC)

[Tiwari94b,Tiwari96,Su94,Tomiyama98,Mehta97,Kandemir00]

slide-28
SLIDE 28
Software power optimization: Example

  • Aggressive register optimizations
  • Original code: lcc
  • Optimized code: hand-generated
  • 9% current reduction
  • 24% running time reduction
  • 40.6% energy reduction
  • 33% for circle

Compiler Generated Code (heapsort example)

        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,24
        mov edi,dword ptr 014H[ebp]
        mov esi,1
        mov ecx,esi
        mov esi,edi
        sar esi,cl
        lea esi,1[esi]
        mov dword ptr -20[ebp],esi
        mov dword ptr -8[ebp],edi
    L3: mov edi,dword ptr -20[ebp]
        cmp edi,1
        jle L7
        mov edi,dword ptr -20[ebp]
        sub edi,1
        mov dword ptr -20[ebp],edi
        lea edi,[edi*4]
        mov esi,dword ptr 018H[ebp]
        add edi,esi
        mov edi,dword ptr [edi]
        mov dword ptr -12[ebp],edi
        jmp L8
    L7: mov edi,dword ptr 018H[ebp]
        mov esi,dword ptr -8[ebp]
        lea esi,[esi*4]
        add esi,edi
        mov ebx,dword ptr [esi]
        mov dword ptr -12[ebp],ebx
        mov edi,dword ptr 4[edi]
        mov dword ptr [esi],edi
        mov edi,dword ptr -8[ebp]
        sub edi,1
        mov dword ptr -8[ebp],edi
        cmp edi,1
        jne L8
        mov edi,dword ptr 018H[ebp]
        mov esi,dword ptr -12[ebp]
        mov dword ptr 4[edi],esi
        jmp L2

Energy Efficient Code (heapsort example)

        push ebp
        mov edi,dword ptr 08H[esp]
        mov esi,edi
        sar esi,1
        inc esi
        mov ebp,esi
        mov ecx,edi
    L3: cmp ebp,1
        jle L7
        dec ebp
        mov esi,dword ptr 0cH[esp]
        mov edi,dword ptr[edi*4][esi]
        mov ebx,edi
        jmp L8
    L7: mov edi,dword ptr 0cH[esp]
        mov esi,dword ptr 4[edi]
        mov ebx,dword ptr [ecx*4][edi]
        mov dword ptr [ecx*4][edi],esi
        dec ecx
        cmp ecx,1
        jne L8
        mov dword ptr 4[edi],ebx
        jmp L2

Program             sort                    circle
Version             Original    Final       Original    Final
Current (mA)        525.7       486.6       530.2       514.8
Ex. Time (ms)       11.02       7.07        7.18        4.93
Energy (10^-6 J)    19.12       11.35       12.56       8.37
Saving                          40.60%                  33.40%

[Tiwari94b,Tiwari96]

(src: A. Raghunathan, NEC)
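The energy column follows from current and execution time via E = Vdd * I * t. The sketch below checks the table under two assumptions that are not stated on the slide: a supply voltage of 3.3 V, and execution times read in microseconds (with the "ms" label of the table the result would be a factor of 1000 larger than the listed 10^-6 J values). The helper names are hypothetical.

```python
VDD_V = 3.3   # assumption: supply voltage, not given on the slide

# (current in mA, execution time, listed energy in 1e-6 J)
ROWS = {
    ("sort", "original"): (525.7, 11.02, 19.12),
    ("sort", "final"): (486.6, 7.07, 11.35),
    ("circle", "original"): (530.2, 7.18, 12.56),
    ("circle", "final"): (514.8, 4.93, 8.37),
}


def energy_uj(current_ma, time_us, vdd_v=VDD_V):
    """E = Vdd * I * t; with I in A and t in microseconds the result is in uJ."""
    return vdd_v * (current_ma * 1e-3) * time_us


def saving_percent(before, after):
    return 100.0 * (before - after) / before


computed = {key: energy_uj(i_ma, t) for key, (i_ma, t, _) in ROWS.items()}
sort_saving = saving_percent(computed[("sort", "original")],
                             computed[("sort", "final")])
circle_saving = saving_percent(computed[("circle", "original")],
                               computed[("circle", "final")])
```

Because energy is the product of current and time, a shorter running time contributes to the energy saving even when the average current changes only slightly.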

slide-29
SLIDE 29
Writing Energy-Efficient Source Code

  • Power-conscious source coding leads to efficient implementations
  • Guidelines vary depending on processor architecture, compiler
  • Example: ARM family
      • Conditional execution (pipeline-friendly)
      • Switch vs. table lookup
      • Make copies of variables whose pointer is passed as a function argument (enables better register assignment)
      • Variable types (int more efficient than char or short)
      • Function call overhead (register-passed arguments, tail recursion, recursive vs. iterative)
  • 32.3% energy savings for MPEG encoder

(src: A. Raghunathan, NEC)

[Simunic99b]

slide-30
SLIDE 30

Code Compression for Low Power

Decompress

Program Memory (Compressed Software)

CPU

Add hardware

App. Specific HW

Bus Contr.

Data Memory

Peripherals

Reduce program memory

  • Compress code, and decompress it on-the-fly.
  • Fewer memory & bus transactions
  • CPU stalls reduced
  • Encoding can be optimized for reduced switching activity

[Lekatsas00a,Lekatsas00b,Benini99a,Yoshida97]

slide-31
SLIDE 31
Code Compression: Architectures

[Figure: tool flow: Source Code -> Compilation / Code Optimization -> Object Code -> Compression; Decompression Hardware in the target system]

Pre-Cache Architecture

[Figure: CPU, I-cache, D-cache, decompression engine, and main memory connected by Data Bus 1, Data Bus 2, and a 32-bit address bus; the decompression engine sits between main memory and the I-cache]

  • Savings in main mem., DataBus2; decompress only on cache miss

Post-Cache Architecture

[Figure: same components; the decompression engine sits between the I-cache and the CPU]

  • Savings: main mem., DataBus1, DataBus2; decompression more critical; I-cache magnified

slide-32
SLIDE 32
Code Compression: Results

  • Fujitsu SPARClite MB86934 processor + Memory hierarchy + decompression hardware
  • Power, performance estimation using cycle-accurate simulation [Li98]

Experimental methodology:

  • Select minimum energy cache configuration
  • Apply compression
  • Resize cache to maintain performance

[Figure: bar chart of improvements (%, axis from 10 to 70) in area, ex. time, and energy for the benchmarks i3d, mpeg, smo, and trick]

[Lekatsas00a,Lekatsas00b]

slide-33
SLIDE 33
Low-Power System Software

  • Task management, scheduling, inter-process communication, memory management, timer services, interrupt handlers, device drivers, network protocol stacks, …
  • Need awareness of power consumption in various system software primitives, effects on application power
      • Typically used as "black-box" by application software designers
  • Customize RTOS to application
      • Scheduling priority selection, granularity
  • Optimize application considering RTOS effects
      • When to use "light-weight" context switch
      • Function call vs. process
      • Shared memory vs. message passing

[Dick00]

slide-34
SLIDE 34
Low-Power System Software

[Figure: an application program (email, ..., MPEG enc. task) runs on an RTOS (IPC, scheduler, context switch, interrupt & exception handlers, task manager, timer, memory manager) with a protocol stack (Ethernet, RPC, UDP, TCP, IP, EARP) on system-on-chip hardware components (DSP core, µP core, program RAM, data RAM, high-speed HW accelerators, glue logic, A/D & D/A, host interface, SOC bus)]

Protocol steps shown for the application: Send connect + SYN packet; Await seq. no., ACK; Send SYN + ACK/ACK + data; Await end of packets; Release, Send ACK/ACK; Send disconnect REQ; Await disconnect ACK

Call sequences over time, with the RTOS services they trigger:

Send SYN packet

    TCPcompose_pkt();
    alloc_send_buf();
    OS: VMM
    TCP_send_pkt();
    OS: IPC
    OS: c-switch
    IP_send_pkt();
    OS: scheduler
    OS: c-switch
    Ether_dev_send();
    OS: c-switch

Await ACK

    init_timer();
    OS: timer
    alloc_rcv_buf();
    OS: VMM
    TCP_await_ack();
    OS: scheduler
    OS: c-switch

Send data packet

    TCPcompose_pkt();
    alloc_send_buf();
    OS: VMM
    TCP_send_pkt();
    OS: IPC
    OS: c-switch
    IP_send_pkt();
    OS: scheduler
    OS: c-switch
    Ether_dev_send();
    OS: c-switch

[Dick00]

slide-35
SLIDE 35
Power Analysis of an Embedded RTOS

[Figure: target system: Fujitsu SPARClite 86832 with on-chip cache, two IBM 0118160PT3-60 DRAMs, timer, EPROM, UART, interrupts, LED cntr., other peripherals]

[Figure: stacked bar chart of energy (% of total, 0% to 100%) for Mailbox, Semaphore, TCP-1, TCP-2, ABS-1, ABS-2; legend: Other, Application, Timer, I/O Primitives, Interrupt/Exception Handling, Mem. Management, Scheduling/Context Sw., Inter Process Commn., Task Management]

  • RTOS: µC/OS (micro-C)
  • Directly accounts for 5% - 50% of system energy for various applications
  • Optimizing application considering RTOS can save 20% - 60%

[Dick00]

slide-36
SLIDE 36
Outline

  • System-level Design Methodologies
  • System-level power estimation
  • Low-power Embedded Software
  • System-level Tradeoffs and Power Management

slide-37
SLIDE 37
System-level Power Tradeoffs and Management

  • Energy vs. Programmability
  • System-level Power Management
  • Variable voltage systems
  • Power-efficient system integration/communication architectures

slide-38
SLIDE 38
Energy vs. Programmability

  • Large (100X - 1000X) gap in energy efficiency between fully programmable and fully custom implementations
  • Ample scope for tradeoffs

Source: Rabaey et al., IEEE Computer, July 2000

slide-39
SLIDE 39
Hardware/Software Partitioning for Low Power

  • Partition system-level description into HW and SW
  • Compact application-specific hardware generally more efficient
  • Example:
      • Power to perform addition = 330mW using SPARClite processor core
      • Custom adder in same technology consumes 2mW
      • However, need to consider communication overhead!
  • Partitioning at different levels of granularity possible
      • Operator level
      • Basic block level
      • Function/procedure level
      • Task level

[Dave97,Kirovski97,Henkel98]

slide-40
SLIDE 40
HW/SW Partitioning: Example

  • Example:
      • HDTV chromakey algorithm
      • 22,000 lines of C code
  • Partitioning:
      • Approx. 15 lines of code (critical loops) synthesized as ASIC
  • Result: 77% energy savings
  • Procedure at a glance:
      • form instruction cluster from behavioral description
      • schedule
      • estimate utilization rate of resources within each cluster
      • estimate energy dissipation and performance

Code excerpt; the two loops are implemented as ASIC (21687, 21698, and 21715 are line numbers within the C source):

    ...
    if (cb>vtab[cr]) {
    ...
    for (I=cr1;I<=cr2;I++) {
      ihilf=abs(cr-I)+abs(cb-vtab[I]);
      if (ihilf<iabsv) { iabsv=ihilf; }
    }
    iabsv=512;
    for (I=cr1;I<=cr2;I++) {
      ihilf=abs(cr-I)+abs(cb-htab[I]);
      if (ihilf<iabsh) { iabsh=ihilf; }
    }
    ...

[Henkel98]

slide-41
SLIDE 41

“Avalanche” System-level low power estimation and optimization

(Li/Henkel) (src: A. Raghunathan, NEC)

slide-42
SLIDE 42
Goals and Assumptions

[Figure: HW/SW embedded system consisting of CPU (software), ASIC (hardware), memory hierarchy, and I/O]

  • Need for fast power estimation because of size of SOCs
  • Enable comprehensive design space explorations
  • Estimation as a design aid rather than absolute estimation
  • Optimize using parameterized design paradigm
  • Key: interdependencies of system parts (cores) in terms of power/performance

(Li/Henkel)

slide-43
SLIDE 43
System-Level Low-Power Design: Tradeoffs

Key is to evaluate the tradeoffs:

  • Power/Energy vs. performance tradeoff
  • Energy interdependencies: software, hardware, cache/memory

Example:

[Figure: effect of cache size on system performance and on CPU energy, cache energy, and memory energy; the resulting effect on system energy is marked "?"]

(Li/Henkel)

SLIDE 44

Main Memory Energy Model

[Figure: memory array with m columns and n rows, column decoder, row decoder, periphery]

$I_{mem} = I_{active} + I_{retention} + I_{col-dec} + I_{row-dec} + I_{peri}$

$I_{active} = m \cdot i_{act}$
$I_{retention} = m (n-1) \cdot i_{ren}$
$I_{col-dec} = m \cdot i_{dec}$
$I_{row-dec} = n \cdot i_{dec}$

$i_{act}$, $i_{ren}$, $i_{dec}$: constants

  • $I_{active}$: dominating source [Itoh et al]
  • $I_{retention}$: negligible at high frequencies
  • $I_{mem}$: most sensitive to m

(Li/Henkel)
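The current model above can be written down directly. The following is a minimal sketch, not the Avalanche implementation; the function name, the argument names and the numeric values in the usage note are assumptions chosen for illustration only.

```python
def main_memory_current(m, n, i_act, i_ren, i_dec, i_peri=0.0):
    """Total memory current for an array with m columns and n rows.

    i_act, i_ren and i_dec are the per-cell / per-line constants of the
    model; i_peri is the periphery current (assumed to be given directly).
    """
    i_active = m * i_act                # one selected row of m cells
    i_retention = m * (n - 1) * i_ren   # all cells in the unselected rows
    i_col_dec = m * i_dec               # column decoder
    i_row_dec = n * i_dec               # row decoder
    return i_active + i_retention + i_col_dec + i_row_dec + i_peri
```

Calling `main_memory_current(4, 8, 1.0, 0.0, 0.5)` with these made-up constants evaluates each term separately, which makes it easy to see that the column count m enters three of the five terms.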

SLIDE 45

Cache Energy Model

[Figure: cache array with N-row rows and N-col columns of tag/data entries, comparators / mux; address input from CPU; data / address outputs to CPU / memory]

Cache parameters

  • Size: cache size
  • Asso: associativity
  • Line: block size
  • Tag: # tag bits
  • Status: # status bits
  • write policy

$N_{col} = Asso \cdot (8 \cdot Line + Tag + Status)$
$N_{row} = Size / (Line \cdot Asso)$
$E_{cache} = E_{bitlines} + E_{word-lines} + E_{output} + E_{input}$

(Li/Henkel)

  • $E_{output}$: different values for on-chip or off-chip main memory
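The geometry part of the cache model (rows and columns of the cell array) can be sketched as follows. The function name and the example parameters are assumptions for illustration; sizes are taken to be in bytes and the tag/status widths in bits.

```python
def cache_geometry(size_bytes, line_bytes, asso, tag_bits, status_bits):
    """Return (n_row, n_col) of the cache cell array.

    n_col counts bits per row: each of the `asso` ways stores one line
    (8 bits per byte) plus its tag and status bits.
    n_row is the number of sets.
    """
    n_col = asso * (8 * line_bytes + tag_bits + status_bits)
    n_row = size_bytes // (line_bytes * asso)
    return n_row, n_col
```

For a hypothetical 8 KB, 2-way cache with 16-byte lines, 20 tag bits and 2 status bits this gives the array dimensions that the bit-line and word-line energy terms would then be scaled by.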

SLIDE 46

System Energy Estimation Flow of Avalanche

[Flow diagram]

  • Input: application program
  • Behavior simulator (Sparcsim) and trace generator (QPT)
  • Software energy model
  • Cache simulator (Dinero)
  • Main memory energy model, cache energy model, bus energy model, performance model
  • Outputs: system energy (sum of the energy models) and system performance

(Li/Henkel)

SLIDE 47

Design Space Exploration

  • MPEG-2 video encoder
  • Source: MPEG Software Simulation Group
  • 200KB source code in C
  • Simulation trace: ~100M instructions
  • SPARC RISC: 71% instruction fetch, 29% data access

[Figure: energy and execution time, each plotted over D-cache and I-cache size]

(Li/Henkel)

SLIDE 48

Power/Energy Sources

[Figure: MPEG-2 video encoder; energy percentage (%) of software, I-cache, D-cache and memory over instruction cache size (2^x)]

(Li/Henkel)

SLIDE 49

Source-level Transformations

    test1()
    {
      …
      test2();
      …
    }

    test2()
    { body2; }

    main()
    {
      int j, k;
      …
      for (j=0; j<100; j++)
      {
        test1();
        for (k=0; k<100; k++)
          test1();
      }
      …
    }

Procedure call graph (nodes main, test1, test2), edge annotations: (100*EES(test1), size(test1)), (10,000*EES(test1), size(test1)), (100*EES(test2), size(test2))

Loop nesting graph (EES, CSI): Loop j with (EES_loop, size(loop j)), Loop k with (EES_loop, size(loop k))

(Li/Henkel)

SLIDE 50

Source-level Transf. (cont’d)

Procedure calling graph

  • New edges after loop unrolling
  • (EES, CSI) changes

[Figure: procedure call graph (main, test1, test2) before and after unrolling]

After unrolling loop j two-fold:

    test1()
    {
      …
      test2();
      …
    }

    test2()
    { body2; }

    main()
    {
      int j, k;
      …
      for (j=0; j<100; j++)
      {
        test1();
        for (k=0; k<100; k++)
          test1();
        j++;
        test1();
        for (k=0; k<100; k++)
          test1();
      }
      …
    }

(Li/Henkel)
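As a sanity check on the unrolling transformation, the following sketch counts calls to test1() and outer-loop iterations before and after unrolling loop j two-fold. It only mirrors the loop structure of the slide; the function names are made up, and the unrolled version assumes an even trip count.

```python
def count_original(n=100):
    """Return (calls to test1, outer-loop iterations) for the original loop nest."""
    calls = 0
    iterations = 0
    for j in range(n):
        iterations += 1
        calls += 1              # test1() in the body of loop j
        for k in range(n):
            calls += 1          # test1() in the body of loop k
    return calls, iterations


def count_unrolled(n=100):
    """Same counts after unrolling loop j two-fold (n assumed even)."""
    calls = 0
    iterations = 0
    j = 0
    while j < n:
        iterations += 1
        calls += 1              # first copy of the loop body
        for k in range(n):
            calls += 1
        j += 1                  # the extra j++ inside the unrolled body
        calls += 1              # second copy of the loop body
        for k in range(n):
            calls += 1
        j += 1                  # the loop increment
    return calls, iterations
```

The total number of calls is unchanged while the number of outer-loop iterations halves and the loop body doubles in size, which is the kind of change the (EES, CSI) annotations on the graphs are meant to capture.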

SLIDE 51

Source-level Transf. (cont’d)

Procedure call graph

  • edge removed after procedure in-lining

[Figure: procedure call graph (main, test1, test2) before in-lining; after in-lining procedure test2, the graph contains main and test1 with no edge to test2]

After in-lining procedure test2:

    test1()
    {
      …
      body2;
      …
    }

    test2()
    { body2; }

    main()
    {
      int j, k;
      …
      for (j=0; j<100; j++) {
        test1();
        for (k=0; k<100; k++)
          test1();
      }
      …
    }

(Li/Henkel)

SLIDE 52

Optimization Flow in Avalanche

[Flow diagram]

  • Input: program
  • Construct procedure call graph & loop nesting graph
  • Obtain feasible memory/cache sizes
  • I-cache / D-cache selection
  • Main memory selection
  • Software transformation
  • Design evaluation: system energy & performance estimation
  • Optimized solution

(Li/Henkel)

SLIDE 53

Energy Optimization Results

[Figure: energy (Joule), axis 0.2 to 2, for the applications sort, eg2, ismooth and itimp; series: no cache, fixed cache, software, Goal I]

(Li/Henkel)

SLIDE 54

Execution Time

[Figure: execution time (10^6 cycles), axis 20 to 140, for the applications sort, eg2, ismooth and itimp; series: no cache, fixed cache, software, Goal I]

(Li/Henkel)

SLIDE 55

Avalanche Flow

[Flow diagram]

  • Energy models: Processor Core Energy Model, Main Memory Core Energy Model, Cache Core Energy Models (instruction and data cache)
  • LP Partitioning: ISS; divide application into clusters ((nested) loops, functions, ...); list schedule; application cache profiler (Dinero III); tracing (QPT); compute utilization rates UcoreuP and URcore for the selected cluster; select cluster with highest URcore; schedule, timing, evaluate
  • Constraints: #clusters, effort related, HW resources, available resources, #resources
  • HW Synthesis (NEC in-house flow): behavioral synthesis (CYBER), RTL simulation (VSIM), RTL + logic synthesis (VARCHSYN), PWC netlist, RTL VHDL, estimating switching energy (CSIM), CMOS6 lib.
  • Core Energy Estimation: $E_{total} = \sum_{i=1}^{N_{core}} E_{core_i}$
  • Selection condition: if $\sum_{i=1}^{N_{core}} (E_{used\_act\_core_i} + E_{used\_act\_non\_core_i}) < E_{initial}$
  • Code Compression
  • Bus Power

Sources: LP-Partitioning (DAC’99), Avalanche System (DAC’98), NEC in-house flow, CC (DAC’00, DCC’00), Bus Power (DAC’01 accepted)

SLIDE 56

System-level Power Management

  • Each component has multiple power modes
    • Power modes present power vs. performance tradeoff
  • System-level power manager
    • Observes events from various parts of system
    • Issues power management commands to each component
  • Component-level design issue: power modes
  • System-level design issue: power management policy

[Figure: power manager and components C1 to C4; observations from the system: workload, activity; commands: change power mode]

[Figure: power state diagram with states Run (200 mW) and Idle (15 mW); transition times 160 ms and 90 µs] [DeMicheli00]

(src: A. Raghunathan, NEC)

SLIDE 57

OS-driven power management

  • Advanced Configuration and Power Interface (ACPI)
  • Standards for definition of hardware resources to enable OS-driven power management
  • Intel, Microsoft, Toshiba

(src: A. Raghunathan, NEC)

SLIDE 58

System-level Power Management Policies

  • Taxonomy of power management policies
  • Deterministic (know periods of inactivity, for sure)
  • Predictive (guess inactive periods)
  • Timeout-based
  • Determined offline (by analyzing system execution profiles)
  • Adaptive
  • Break-even time ($T_{BE}$): minimum inactive period required to compensate the cost of shut-down and wake-up

[Figure: power state diagram with states Run (200 mW) and Idle (15 mW); transition times 160 ms and 90 µs]

[Figure: energy (mJ), axis 10 to 80, over idle period (ms), axis 40 to 360; curves: with P.M., no P.M.]

$T_{BE} \cdot P_{RUN} = T_{RUN-IDLE} \cdot P_{RUN-IDLE} + T_{IDLE-RUN} \cdot P_{IDLE-RUN} + (T_{IDLE} - T_{RUN-IDLE} - T_{IDLE-RUN}) \cdot P_{IDLE}$

$T_{BE} = 160.09$ ms

[DeMicheli00]

(src: A. Raghunathan, NEC)
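The break-even condition can be solved for T_BE by setting the idle period equal to T_BE. The sketch below does this under two assumptions that are not stated on the slide: the power drawn during both transitions equals the run power unless given otherwise, and the 90 µs figure belongs to the Run-to-Idle transition while the 160 ms figure belongs to Idle-to-Run. All names are illustrative.

```python
def break_even_time(p_run, p_idle, t_down, t_up, p_down=None, p_up=None):
    """Solve  T_BE*P_RUN = t_down*p_down + t_up*p_up
                           + (T_BE - t_down - t_up)*P_IDLE   for T_BE.

    Powers in mW, times in ms. Transition powers default to the run
    power (an assumption made for this sketch).
    """
    if p_down is None:
        p_down = p_run
    if p_up is None:
        p_up = p_run
    transition_energy = t_down * p_down + t_up * p_up
    idle_credit = (t_down + t_up) * p_idle
    return (transition_energy - idle_credit) / (p_run - p_idle)


# Run = 200 mW, Idle = 15 mW, transitions of 0.09 ms and 160 ms
t_be = break_even_time(200.0, 15.0, 0.09, 160.0)
```

With these assumptions the break-even time collapses to the sum of the two transition times, which is consistent with the 160.09 ms quoted above.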

SLIDE 59

System-level Power Management Policies

  • Timeout-based
    • Trigger power mode transition if observed idle time > $T_{TO}$
    • Actual idle period should be > $T_{TO} + T_{BE}$ for power savings
  • Predictive Shut-Down
    • Predicted idle time as a function of history: $T_{PRED} = f(T_{ACTIVE}(n), T_{IDLE}(n-1), \dots, T_{ACTIVE}(n-k), T_{IDLE}(n-k-1))$
    • Collect system execution traces and use regression to compute f()
    • Shut down if $T_{PRED} > T_{BE}$
    • Optional: wake up after $T_{PRED}$ even if no activity (tradeoff performance penalty for reduced power savings)

[Srivastava96,Hwang97,DeMicheli00]

(src: A. Raghunathan, NEC)

SLIDE 60

System-level Power Management Policies

  • Adaptive techniques
    • Workload unknown or highly time-varying
    • E.g.: $T_{PRED}(n) = \alpha \cdot T_{IDLE}(n-1) + (1-\alpha) \cdot T_{PRED}(n-1)$
      Saturation condition: $T_{PRED}(n) < C_{MAX} \cdot T_{PRED}(n-1)$
      Timeout and re-evaluate $T_{PRED}$ to avoid under-prediction
  • Special techniques for disk drives
    • Sessions (clusters) of high activity
    • Use adaptation to predict session length

(src: A. Raghunathan, NEC)

[Douglis95,Krishnan95,Helmhold96,Hwang97,Lu99]
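The exponential-average predictor with its saturation condition can be sketched in a few lines. The function name, the default values for alpha and C_MAX, and the choice to clamp the prediction at the saturation bound are assumptions of this sketch, not details taken from the cited papers.

```python
def predict_idle(t_idle_prev, t_pred_prev, alpha=0.5, c_max=2.0):
    """Exponential-average prediction of the next idle period.

    T_PRED(n) = alpha * T_IDLE(n-1) + (1 - alpha) * T_PRED(n-1),
    limited so that it does not grow beyond c_max * T_PRED(n-1).
    """
    t_pred = alpha * t_idle_prev + (1.0 - alpha) * t_pred_prev
    return min(t_pred, c_max * t_pred_prev)


def should_shut_down(t_pred, t_be):
    """Shut down only if the predicted idle period exceeds the break-even time."""
    return t_pred > t_be
```

A single very long idle period therefore cannot inflate the prediction by more than the factor C_MAX in one step.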

SLIDE 61

System-level Power Management Policies

[Figure: Markov chain of the workload with transition probabilities 0.05, 0.95, 0.15 and 0.85; Markov chain of the power managed component with states on and off, whose transition probabilities depend on the commands s_on and s_off (s_on: 0.1 / s_off: 0.0; s_on: 0.0 / s_off: 0.2; s_on: 1.0 / s_off: 0.8; s_on: 0.9 / s_off: 1.0)]

Policy: {(0,on) → s_off, (1,on) → s_on, (0,off) → s_off, (1,off) → s_on}

  • Stochastic policies
    • System behavior and workload are not deterministic
    • Model using discrete-time Markov chains
    • Compose Markov chains representing system and workload
    • Solve linear program to determine optimal policy
    • Extensions for continuous-time (asynchronous) policies, non-stationary workloads

(src: A. Raghunathan, NEC)

[Paleoglo98,Qiu99a,Chung99,Simunic99s,Lu00]
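As a small illustration of the Markov-chain workload model (not of the linear program itself), the sketch below computes the stationary distribution of a two-state workload chain by repeated multiplication with the transition matrix. Which of the four probabilities from the figure belongs to which transition is an assumption here: state 0 is taken to stay put with probability 0.95 and state 1 with probability 0.85.

```python
def stationary_distribution(p, iterations=1000):
    """Stationary distribution of a discrete-time Markov chain.

    p[i][j] is the probability of moving from state i to state j.
    Uses plain power iteration starting from the uniform distribution.
    """
    n = len(p)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
    return pi


# Assumed assignment of the figure's probabilities:
# state 0 (no request) -> stays with 0.95, leaves with 0.05
# state 1 (request)    -> leaves with 0.15, stays with 0.85
workload = [[0.95, 0.05],
            [0.15, 0.85]]
pi = stationary_distribution(workload)
```

Under this assignment the workload issues requests about a quarter of the time in the long run; the composed system/workload chain would be built and analysed in the same way before the policy optimization.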

SLIDE 62

Variable voltage processing

  • Stream-driven computations
  • Adaptively vary supply voltage to keep processor "just" fast enough

[Figure: data in, FIFO, data processing unit with voltage-dependent throughput, FIFO, data out; load sensor and DC voltage converter]

Conventional power management:

Eactive for time Tactive, Eidle for time Tidle; total energy = Eactive + Eidle

Adaptive voltage scaling:

Eactive for time Tactive, Eidle for time Tidle; total energy = Eactive * (Tactive)/(Tactive + Tidle)

Shut down = Wasted power!!

[Nielsen94, Chandrakasan96]

(src: A. Raghunathan, NEC)

SLIDE 63

Dynamic Voltage Scaling

  • Better power/performance tradeoff
  • Fills idle time to allow operation at minimum required voltage
  • Run-time scheduler determines processor speed, voltage
  • E.g.: Interval scheduling
    • Adjust processor speed at fixed time intervals
    • Base decision on global processor utilization

[Figure: power over speed (MHz) for fixed-V and DVS; processor utilization and processor speed over time]

(src: A. Raghunathan, NEC)

[Yao95,Govil95,Hong98,Pering98,Ishihara98,Lee00,Shin00,Kumar00]
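One simple way to realise interval scheduling is to rescale the clock at the end of every interval so that the observed utilization moves towards a target value. The sketch below is a hypothetical policy in that spirit, not one of the cited algorithms; the target utilization of 0.8 and all names are assumptions.

```python
def next_speed(current_speed, utilization, f_min, f_max, target=0.8):
    """Pick the processor speed for the next interval.

    current_speed: speed used in the interval just finished (e.g. in MHz)
    utilization:   fraction of that interval the processor was busy (0..1)
    The work done was current_speed * utilization; the new speed is chosen
    so that the same work would load the processor to `target`, then
    clamped to the available range [f_min, f_max].
    """
    required = current_speed * utilization / target
    return max(f_min, min(f_max, required))
```

A lightly loaded interval lowers the speed (and with it the minimum required voltage), while a saturated interval pushes the speed back to the maximum.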

SLIDE 64

Scheduling for Variable Voltage Processors

  • Statically scheduled systems
  • Voltage schedule can also be determined statically

    task   arrival   deadline   execution time at 3.3V
    A                6          5
    B      3         20         5

Real-time tasks: average power consumption (APC) = 1 watt

  • System shutdown technique (Task A, Task B at 3.3V; time axis 5, 10, 20): APC = 0.5 watts
  • Variable voltage hardware (Task A, Task B at 2.97V and 1.95V; time axis 6, 12, 20): APC = 0.28 watts
  • Supply voltage scaling and system shutdown (Task A, Task B at 2.97V; time axis 6, 12, 20): APC = 0.4 watts

[Hong98]

(src: A. Raghunathan, NEC)
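Two of the APC figures can be reproduced with a back-of-the-envelope model. The sketch assumes (none of this is stated on the slide) that active power is 1 W at 3.3 V and full speed, that dynamic power scales with V² times clock frequency, that frequency scales with how far a task is stretched in time, and that the processor draws nothing while shut down. Task durations are read off the time axes: 5 units each at 3.3 V, 6 units each at 2.97 V, over a period of 20.

```python
def average_power(segments, period, v_ref=3.3, p_ref=1.0):
    """Average power over `period` for a list of task segments.

    Each segment is (voltage, duration, work), where `work` is the task's
    execution time at v_ref. The relative clock frequency of a segment is
    work / duration, and its power is p_ref * (V / v_ref)**2 * frequency.
    Time outside the segments is assumed to cost no energy (shutdown).
    """
    energy = 0.0
    for voltage, duration, work in segments:
        rel_freq = work / duration
        power = p_ref * (voltage / v_ref) ** 2 * rel_freq
        energy += power * duration
    return energy / period


# System shutdown at 3.3 V: tasks A and B run 5 units each, then shut down.
apc_shutdown = average_power([(3.3, 5, 5), (3.3, 5, 5)], period=20)

# Supply voltage scaling to 2.97 V plus shutdown: each task takes 6 units.
apc_scaled = average_power([(2.97, 6, 5), (2.97, 6, 5)], period=20)
```

Under these assumptions the shutdown case comes out at 0.5 W and the scaled case at about 0.4 W, in line with the values quoted for the two techniques.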

SLIDE 65

System-level Communication Architecture

  • Topology
    • Dedicated vs. shared interconnect
    • Partitioned / hierarchical vs. flat buses
    • Map communication between components to topology
  • Protocols
    • Round-robin vs. static priority vs. TDMA
    • Select priorities, DMA modes, split bus transactions, …
  • Circuits
    • Low-swing signaling
    • Partial charging / dis-charging of bus lines
  • Activity reduction
    • Bus encoding

(src: A. Raghunathan, NEC)

[Stan95,Benini97,Stan97,Zhang98,Givargis98,Ramprasad98,Mussoll98]

SLIDE 66

AMBA Bus: High Speed, Low Power Tradeoff

[Figure: high-speed ASB and low-power APB connected by a bridge; blocks shown: CPU, ROM, RAM, DMA, external bus interface, ext. access (test), peripherals, IP]

Source: ARM Ltd.

SLIDE 67

Customization of bus protocol: Example

[Figure: network interface card between network and host; protocol stack PHY, AAL, ATM, IP, TCP; blocks: CREATE PACK, IP CHK, CHKSUM, QUEUE, ARB, MEM, SPARC processor, HW, bus]

[Lajolo00]

(src: A. Raghunathan, NEC)

SLIDE 68

Customization of bus protocol: Example

  • Design of bus architecture significantly impacts total system power (> 2X variation)
  • Bus power consumption significant in sub-micron technologies
  • Impact on power consumption in components (handshaking, idle time caused by bus conflicts)

[Lajolo00]

(src: A. Raghunathan, NEC)

SLIDE 69

Bus Encoding

SLIDE 70

Simplified Capacitance Bus Model

[Figure: cross-section of adjacent bus lines i and i+1 in a metal layer above the substrate; geometry parameters w, h, H, D; capacitances $C_B$ and $C_{i,i+1}$; resistance R]

$C'_i = C'_{Bi} + C'_{Ci,1} + C'_{Ci,2} + \dots + C'_{Ci,N-1}$

  • $C'_{Bi}$: base capacitance ("intrinsic")
  • $C'_{Ci,j}$: coupling capacitance

(Henkel/Lekatsas)

SLIDE 71

General Formulation of Wire Cap

$C'_i = C'_{Bi} + \sum_{j=1, j \neq i}^{N-1} C'_{Ci,j} \times s\_fct \times x(i,j)$

$s\_fct$, $x(i,j) \in \{0, 1\}$: 'shield-factor', depends on other switching activity (see later)

(Henkel/Lekatsas)

  • Practically, only closest (wire-)neighbors matter

SLIDE 72

Determining x(i,j)

  • Inter-wire capacitances may or may not cause energy consumption
  • It depends on what the state/action of an adjacent line is
  • Some of the cases:

(Henkel/Lekatsas)

SLIDE 73

Possible Form-factors and their Characteristics (Approx.)

$C'_i = C'_{Bi} + C'_{Ci,i+1}$

[Equation: approximation of $C'_i$ in terms of the geometry parameters w, h, H, D and the permittivity $\varepsilon \varepsilon_r$, with coefficients a and b and arcosh and ln terms]

[Figure: form-factors A, B and C with different ratios of wire width w to height h]

(Henkel/Lekatsas)

SLIDE 74

Capacitances and Tech Params

(Henkel/Lekatsas)

SLIDE 75

Switching Distribution

  • Shown is activity on a non-decoded address bus
  • Unveils activity plus program/task size (NOT address space)
  • Question: can activity distribution be exploited?

[Figure: number of transitions per bit line, logarithmic axis from 1 to 100,000,000]

(Henkel/Lekatsas)

SLIDE 76

Relative max. Bus Capacitance Distribution

[Figure: rel. increase of max. possible capacitance [%] per bus line; bus line axis 5 to 30]

(Henkel/Lekatsas)

  • Assumptions: a) no adjacent effects, b)
  • Shown: results of fully correlated model; in reality: only closest neighbors
  • Observation: unequal distribution may be exploited by bus encoding

SLIDE 77

ACCS Schemes

  • Redirect lines according to activity
  • Distribute for max decoupling
  • 2 schemes: one for small, one for large address space
  • Switch in between when switching task/program
  • Supported by OS
  • Switching energy negligible

[Figure: Scheme 1 and Scheme 2, each mapping source bus lines to target bus lines (lines up to 31; wires w0, w1, w2, w7, w15)]

(Henkel/Lekatsas)

SLIDE 78

ETAM: Extended Transition Activity Measure

[Equation: ETAM(w), a sum over all bits $b_i$ in the window w that combines XOR terms of the bit values $B_i$ with product terms of XORs over the other bits $b_j \in w$, $j \neq i$]

[Figure: two example windows w over the bits l = a to h = b = a+3 at times t-1 and t, for i = a+1 and i = a+2; resulting values ETAM = 2 and ETAM = 0]

(Henkel/Lekatsas)

SLIDE 79

Results: Energy Savings [%]

  • Up to 50+% power savings on address bus
  • 4-bit window delivers better results
  • Applicable to SOCs with long buses
  • No a priori knowledge of application necessary

(Henkel/Lekatsas)

SLIDE 80

Bus Encoding: others

  • Encode data transmitted over bus to minimize switching activity
  • Power saved in bus interconnect and drivers should outweigh power consumed in encoding/decoding circuitry
  • Encoding scheme taxonomy
    • Address vs. data (address streams typically more correlated)
    • Add additional lines
    • Static vs. adaptive

(src: A. Raghunathan, NEC)

[Figure: components C1 and C2 connected through Enc/Dec blocks over the bus]

Examples:

  • Gray Coding
  • Bus-Invert Coding (add invert line): … 01000001 11111111 11101101 … is sent as … 0 01000001, 1 00010000, 1 00010010 …
  • T0 Coding (auto-increment line): … 01000001 01000010 01000011 01001001 … is sent as … 0 01000001, 1 01000001, 1 01000001, 0 01001001 …

[Stan95,Benini97,Stan97,Ramprasad98,Mussoll98,Chang00,Kim00,Sotiriadis00]
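The three example encodings can be sketched compactly. These are textbook-style formulations written for illustration, with assumed details: the bus is taken to start at all zeros, bus-invert flips the word when more than half of the lines would toggle, and T0 uses a stride of 1. Function names are made up.

```python
def gray_encode(value):
    """Binary-reflected Gray code: consecutive values differ in one bit."""
    return value ^ (value >> 1)


def bus_invert_encode(words, width=8, initial_bus=0):
    """Bus-invert coding: returns a list of (invert_line, bus_value).

    A word is sent inverted when sending it as-is would toggle more than
    half of the bus lines relative to the current bus state.
    """
    mask = (1 << width) - 1
    bus = initial_bus
    encoded = []
    for word in words:
        toggles = bin((bus ^ word) & mask).count("1")
        if toggles > width // 2:
            bus = ~word & mask
            encoded.append((1, bus))
        else:
            bus = word & mask
            encoded.append((0, bus))
    return encoded


def t0_encode(addresses, stride=1):
    """T0 coding: returns a list of (increment_line, bus_value).

    For an address equal to the previous one plus `stride`, the increment
    line is raised and the bus keeps its old value (no toggling).
    """
    encoded = []
    previous = None
    bus = None
    for addr in addresses:
        if previous is not None and addr == previous + stride:
            encoded.append((1, bus))
        else:
            bus = addr
            encoded.append((0, bus))
        previous = addr
    return encoded
```

Feeding `t0_encode` the address stream 01000001, 01000010, 01000011, 01001001 freezes the bus during the two sequential accesses, as in the T0 example on the slide.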

SLIDE 81

References: System-level power estimation

[Lidsky96] D. Lidsky and J. M. Rabaey, "Early Power Exploration – A World Wide Web Application", in Proc. ACM/IEEE Design Automation Conf., pp. 27-32, June 1996.
[Li98] Y. Li and J. Henkel, "A framework for estimating and minimizing energy dissipation of embedded HW/SW systems", in Proc. Design Automation Conf., pp. 188-193, June 1998.
[Lajolo99] M. Lajolo, A. Raghunathan, S. Dey, L. Lavagno, and A. Sangiovanni-Vincentelli, "Efficient power estimation techniques for HW/SW systems", in Proc. 1999 Alessandro Volta Memorial International Workshop on Low Power Design, March 1999.
[Pedram99] M. Pedram and Q. Wu, "Design considerations for battery powered electronics", in Proc. ACM/IEEE Design Automation Conf., pp. 861-866, June 1999.
[Simunic99a] T. Simunic, L. Benini, and G. De Micheli, "Cycle-Accurate Simulation of Energy Consumption in Embedded Systems", in Proc. ACM/IEEE Design Automation Conf., pp. 867-872, June 1999.
[Chiasserini99] C. Chiasserini and R. R. Rao, "Pulsed battery discharge in communication devices", in Proc. MobiCom, August 1999.
[Martin99] T. Martin and D. P. Siewiorek, "The impact of battery capacity and memory bandwidth on CPU speed setting: A case study", in Proc. Int. Symp. Low Power Electronics & Design, pp. 200-205, August 1999.
[Benini00] L. Benini, G. Castelli, A. Macii, E. Macii, M. Poncino and R. Scarsi, "A Discrete-Time Battery Model for High-Level Power Estimation", in Proc. Design Automation and Test Europe, pp. 35-39, March 2000.
[Lajolo00] M. Lajolo, A. Raghunathan, S. Dey, and L. Lavagno, "Efficient power co-estimation for System-on-Chip Design", in Proc. Design Automation & Test Europe, March 2000.
[DeMicheli00] G. De Micheli and L. Benini, "System-level power optimization: Techniques and tools", in ACM Trans. Design Automation of Electronic Systems, Vol. 5, No. 2, April 2000.
[Panigrahi01] D. Panigrahi, C. Chiasserini, S. Dey, R. Rao, A. Raghunathan, and K. Lahiri, "Battery life estimation of mobile embedded systems", in Proc. Int. Conf. On VLSI Design, January 2001.

(src: A. Raghunathan, NEC)

SLIDE 82

References: Low-power Software

[Su94] C. L. Su, C. Y. Tsui, and A. M. Despain, "Low power architecture design and compilation techniques for high-performance processors", in Proc. IEEE COMPCON, February 1994.
[Tiwari94a] V. Tiwari, S. Malik, and A. Wolfe, "Compilation techniques for low energy: An overview", in Proc. Symp. Low Power Electronics, pp. 38-39, October 1994.
[Tiwari94b] V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded software: A first step towards software power minimization", in Proc. Int. Conf. Computer-Aided Design, November 1994.
[Tiwari94c] V. Tiwari, S. Malik, and A. Wolfe, "Power analysis of embedded software: A first step towards software power minimization", IEEE Trans. VLSI Systems, vol. 2, no. 4, pp. 437-445, December 1994.
[Tiwari96] V. Tiwari, S. Malik, A. Wolfe, and T.-C. Lee, "Instruction level power analysis and optimization of software", in Proc. Int. Conf. VLSI Design, pp. 326-328, January 1996.
[Mehta96] H. Mehta, R. M. Owens, and M. J. Irwin, "Some issues in gray code addressing", in Proc. Great Lakes Symp. On VLSI, pp. 178-180, March 1996.
[Hsieh97] C.-T. Hsieh, M. Pedram, G. Mehta, and F. Rastgar, "Profile-driven program synthesis for evaluation of system power dissipation", in Proc. Design Automation Conf., pp. 576-581, June 1997.
[Mehta97] H. Mehta, R. Owens, M. Irwin, R. Chen, and D. Ghosh, "Techniques for Low Energy Software", in Proc. Int. Symp. Low Power Electronics & Design, pp. 72-75, August 1997.
[Yoshida97] Y. Yoshida, B.-Y. Song, H. Okuhata, and T. Onoye, "An Object Code Compression Approach to Embedded Processors", in Proc. Int. Symp. Low Power Electronics & Design, pp. 265-268, August 1997.
[Tomiyama98] H. Tomiyama, T. Ishihara, A. Inoue, and H. Yasuura, "Instruction scheduling for power reduction in processor-based system design", in Proc. Design Automation and Test Europe, pp. 855-860, February 1998.
[Wan98] M. Wan, Y. Ichikawa, D. Lidsky, and J. Rabaey, "An energy conscious methodology for early design exploration of heterogeneous DSPs", in Proc. Custom Integrated Circuits Conf., pp. 111-117, May 1998.
[Benini99a] L. Benini, A. Macii, E. Macii, and M. Poncino, "Selective Instruction Compression for Memory Energy Reduction in Embedded Systems", in Proc. Int. Symp. Low Power Electronics & Design, pp. 206-211, August 1999.

(src: A. Raghunathan, NEC)

slide-83
SLIDE 83
  • J. Henkel, Univ. of Karlsruhe, WS04/05, 2005

http://ces.univ-karlsruhe.de

References: Low-power Software

[Lekatsas00a] H. Lekatsas, J. Henkel, and W. Wolf, "Arithmetic Coding for Low-Power Embedded System", in Proc. 2000 IEEE Data Compression Conf., pp. 430-439, March 2000.

[Brandolese00] C. Brandolese, W. Fornaciari, F. Salice, and D. Sciuto, "An Instruction-level Functionality-Based Energy Estimation Model for 32-bit Microprocessors", in Proc. 2000 ACM/IEEE Design Automation Conf., pp. 346-351, June 2000.

[Kandemir00] M. Kandemir, N. Vijaykrishnan, M. J. Irwin, and W. Ye, "Influence of Compiler Optimizations on Power", in Proc. 2000 Automation Conf., pp. 304-307, June 2000.

[Lekatsas00b] H. Lekatsas, J. Henkel, and W. Wolf, "Code Compression for Low Power Embedded System Design", in Proc. 2000 ACM/IEEE Design Automation Conf., June 2000.

[Qu00] G. Qu, N. Kawabe, K. Usami, and M. Potkonjak, "Function-level Power Estimation Methodology for Microprocessors", in Proc. 2000 ACM/IEEE Design Automation Conf., pp. 810-813, June 2000.

[Ye00] W. Ye, N. Vijaykrishnan, M. Kandemir, and M. J. Irwin, "The Design and Use of SimplePower: A Cycle-Accurate Energy Estimation Tool", in Proc. 2000 Automation Conf., pp. 304-307, June 2000.

[Koushanfar00] F. Koushanfar, V. Prabhu, M. Potkonjak, and J. Rabaey, "Processors for Mobile Applications", in Proc. IEEE Int. Conf. On Computer Design, September 2000.

[Sami00] M. Sami, D. Sciuto, C. Silvano, and V. Zaccaria, "Power Exploration for Embedded VLIW Architectures", in Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 498-503, November 2000.

(src: A. Raghunathan, NEC)

slide-84
SLIDE 84

References: Low-power Memory and I/O

[Stan95] M. Stan and W. P. Burleson, "Bus-invert coding for low power I/O", IEEE Trans. On VLSI Systems, Vol. 3, No. 1, pp. 49-58, January 1995.

[Stan97] M. Stan and W. P. Burleson, "Low-Power Encodings for Global Communication in CMOS VLSI", IEEE Trans. On VLSI Systems, Vol. 5, No. 4, pp. 444-455, December 1997.

[Benini97] L. Benini, G. De Micheli, E. Macii, D. Sciuto, and C. Silvano, "Asymptotic zero-transition activity encoding for address buses in low-power microprocessor-based systems", in Proc. Great Lakes Symp. On VLSI, pp. 77-82, March 1997.

[Bahar98] R. I. Bahar, G. Albera, and S. Manne, "Power and performance tradeoffs using various caching strategies", in Proc. Int. Symp. Low Power Electronics & Design, pp. 64-69, August 1998.

[Bellas98] N. Bellas, I. Hajj, C. Polychronopoulos, and G. Stamoulis, "Architectural and compiler support for energy reduction in the memory hierarchy of high performance microprocessors", in Proc. Int. Symp. Low Power Electronics & Design, pp. 70-75, August 1998.

[Ohsawa98] T. Ohsawa, K. Kai, and K. Murakami, "Optimizing the DRAM refresh count for merged DRAM/Logic LSIs", in Proc. Int. Symp. Low Power Electronics & Design, pp. 82-87, August 1998.

[Shin98] Y. Shin, S.-I. Chae, and K. Choi, "Partial bus-invert coding for power optimization of system level bus", in Proc. Int. Symp. Low Power Electronics & Design, pp. 127-129, August 1998.

[Silva98] J. L. da Silva Jr., F. Catthoor, D. Verkest, and H. De Man, "Power exploration for dynamic data types through virtual memory management refinement", in Proc. Int. Symp. Low Power Electronics & Design, pp. 311-316, August 1998.

[Zhang98] H. Zhang and J. Rabaey, "Low-swing Interconnect Interface Circuits", in Proc. Int. Symp. Low Power Electronics & Design, pp. 161-166, August 1998.

[Givargis98] T. Givargis and F. Vahid, "Interface exploration for reduced power in core-based systems", in Proc. Int. Symp. On System Synthesis, pp. 117-122, December 1998.

[Benini98c] L. Benini, G. De Micheli, E. Macii, M. Poncino, and S. Quer, "Reducing power consumption of core-based systems by address bus encoding", IEEE Trans. On VLSI Systems, Vol. 6, No. 4, pp. 554-562, December 1998.

[Benini99b] L. Benini, A. Macii, E. Macii, M. Poncino, and R. Scarsi, "Synthesis of low-overhead interfaces for power-efficient communication over wide buses", in Proc. ACM/IEEE Design Automation Conf., pp. 128-133, June 1999.

(src: A. Raghunathan, NEC)

slide-85
SLIDE 85

References: Low-power Memory and I/O

[Shiue99] W. Shiue and C. Chakrabarti, "Memory Exploration for Low Power Embedded systems", in Proc. ACM/IEEE Design Automation Conf., pp. 140-145, June 1999.

[Bellas99] N. Bellas, I. Hajj, and C. Polychronopoulos, "Using dynamic cache management techniques to reduce energy in a high-performance processor", in Proc. Int. Symp. Low Power Electronics & Design, pp. 64-69, August 1999.

[Lee99] L. H. Lee, B. Moyer, and J. Arends, "Instruction fetch energy reduction using loop caches for embedded applications with small tight loops", in Proc. Int. Symp. Low Power Electronics & Design, pp. 267-269, August 1999.

[Schurgers99] C. Schurgers, F. Catthoor, and M. Engels, "Energy efficient data transfer and storage optimization for a MAP turbo decoder module", in Proc. Int. Symp. Low Power Electronics & Design, pp. 76-81, August 1999.

[Chang00] N. Chang, K. Kim, and J. Cho, "Bus Encoding for Low-Power High-Performance Memory Systems", in Proc. 2000 ACM/IEEE Design Automation Conf., pp. 800-805, June 2000.

[Kim00] K.-W. Kim, K.-H. Baek, N. Shanbag, C. L. Liu, and S.-M. Kang, "Coupling-driven Signal Encoding Scheme for Low-Power Interface Design", in Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 318-321, November 2000.

[Sotiriadis00] P. Sotiriadis and A. Chandrakasan, "Bus Energy Minimization by Transition Pattern Coding (TPC) in Deep Sub-Micron Technologies", in Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 322-327, November 2000.

(src: A. Raghunathan, NEC)

slide-86
SLIDE 86

References: System-level Tradeoffs and Power Management

[Weiser94] M. Weiser, B. Welch, A. Demers, and S. Shenker, "Scheduling for reduced CPU energy", in Proc. 1st Symp. on Operating Systems Design and Implementation, pp. 13-23, November 1994.

[Nielsen94] L. S. Nielsen, C. Niessen, J. Spars, and K. Van Beerel, "Low-power operation using self-timed circuits and adaptive scaling of the supply voltage", IEEE Trans. VLSI Systems, vol. 2, no. 4, pp. 391-397, December 1994.

[Yao95] F. Yao, A. Demers, and S. Shenker, "A scheduling model for reduced CPU energy", in Proc. IEEE Annual Foundations of Computer Science, pp. 374-382, 1995.

[Douglis95] F. Douglis, P. Krishnan, and B. Bershad, "Adaptive disk spin-down policies for mobile computers", in Proc. Second USENIX Symp. On Mobile and Location-Independent Computing, pp. 121-137, April 1995.

[Krishnan95] P. Krishnan, P. Long, and J. Vitter, "Adaptive disk spin-down via optimal rent-to-buy in probabilistic environments", in Proc. Int. Conf. On Machine Learning, pp. 322-330, July 1995.

[Govil95] K. Govil, E. Chan, and H. Wasserman, "Comparing algorithms for dynamic speed-setting of a low-power CPU", in Proc. 1st Int. Conf. on Mobile Computing and Networking, November 1995.

[Srivastava96] M. Srivastava, A. Chandrakasan, and R. Brodersen, "Predictive system shutdown and other architectural techniques for energy efficient programmable computation", IEEE Transactions on VLSI Systems, Vol. 4, No. 1, pp. 42-55, March 1996.

[Chandrakasan96] A. Chandrakasan, V. Gutnik, and T. Xanthopoulos, "Data driven signal processing: An approach for energy efficient computing", in Proc. Int. Symp. Low Power Electronics & Design, August 1996.

[Smith96] W. Mangione-Smith, P.S. Ghang, S. Nazareth, P. Lettieri, et al., "A Low Power Architecture for Wireless Multimedia Systems: Lessons Learned from Building a Power Hog", in Proc. Int. Symp. Low Power Electronics & Design, Monterey, pp. 23-28, August 1996.

[Helmhold96] D. Helmhold, D. Long, and E. Sherrod, "Dynamic disk spin-down technique for mobile computing", in Proc. IEEE Conf. On Mobile Computing, pp. 130-142, November 1996.

[Dave97] B. Dave, G. Lakshminarayana, and N. K. Jha, "COSYN: Hardware-software co-synthesis of embedded systems", in Proc. Design Automation Conf., pp. 703-708, June 1997.

[Kirkovski97] D. Kirkovski and M. Potkonjak, "System-level synthesis of low-power hard real-time systems", in Proc. Design Automation Conf., pp. 697-702, June 1997.

[Hwang97] C.-H. Hwang and A. Wu, "A predictive system shutdown method for energy saving of event-driven computation", in Proc. Int. Conf. Computer-Aided Design, pp. 28-32, November 1997.

(src: A. Raghunathan, NEC)

slide-87
SLIDE 87

References: System-level Tradeoffs and Power Management

[Henkel98] J. Henkel and Y. Li, "Energy conscious hardware-software partitioning of embedded systems: A case study on an MPEG-2 encoder", in Proc. Int. Wkshp. Hardware-Software Codesign, pp. 23-27, March 1998.

[Hong98] I. Hong, D. Kirkowski, G. Qu, M. Potkonjak, and M. Srivastava, "Power optimization of variable voltage core-based systems", in Proc. Design Automation Conf., pp. 176-181, June 1998.

[Paleologo98] G. Paleologo, L. Benini, A. Bogliolo, and G. De Micheli, "Policy optimization for dynamic power management", in Proc. Design Automation Conf., pp. 182-187, June 1998.

[Benini98a] L. Benini, R. Hodgson, and P. Siegel, "System-level power estimation and optimization", in Proc. Int. Symp. Low Power Electronics & Design, pp. 173-178, August 1998.

[Benini98b] L. Benini, A. Bogliolo, S. Cavallucci, and B. Ricco, "Monitoring system activity for OS-directed dynamic power management", in Proc. Int. Symp. Low Power Electronics & Design, pp. 185-190, August 1998.

[Ishihara98] T. Ishihara and H. Yasuura, "Voltage scheduling problem for dynamically variable voltage processors", in Proc. Int. Symp. Low Power Electronics & Design, pp. 197-202, August 1998.

[Pering98] T. Pering, T. Burd, and R. Brodersen, "The simulation and evaluation of dynamic voltage scaling algorithms", in Proc. Int. Symp. Low Power Electronics & Design, pp. 76-81, August 1998.

[Lettieri99] P. Lettieri and M.B. Srivastava, "Advances in Wireless Terminals", IEEE Personal Communications Magazine, vol. 6, no. 1, pp. 6-19, February 1999.

[Chung99] E. Chung, L. Benini, A. Bogliolo, and G. De Micheli, "Dynamic power management for non-stationary service requests", in Proc. Design and Test Europe, pp. 77-81, March 1999.

[Qiu99a] Q. Qiu and M. Pedram, "Dynamic power management based on continuous-time Markov decision processes", in Proc. ACM/IEEE Design Automation Conf., pp. 555-561, June 1999.

[Qiu99b] Q. Qiu, Q. Wu, and M. Pedram, "Stochastic modeling of a power-managed system: Construction and optimization", in Proc. Int. Symp. Low Power Electronics & Design, pp. 194-199, August 1999.

[Simunic99c] T. Simunic, L. Benini, and G. De Micheli, "Event-driven power management of portable systems", in Proc. Int. Symp. System Synthesis, pp. 18-23, November 1999.

[Lu00] Y. Lu, E. Y. Chung, T. Simunic, L. Benini, and G. De Micheli, "Quantitative comparison of power management algorithms", in Proc. Design Automation & Test Europe, March 2000.

(src: A. Raghunathan, NEC)

slide-88
SLIDE 88

References: System-level Tradeoffs and Power Management

[DeMicheli00] G. De Micheli and L. Benini, "System-level power optimization: Techniques and tools", in ACM Trans. Design Automation of Electronic Systems, Vol. 5, No. 2, April 2000.

[Lee00] S. Lee and T. Sakurai, "Run-time Voltage Hopping for Low-Power Real-Time Systems", in Proc. ACM/IEEE Design Automation Conf., pp. 806-809, June 2000.

[Kumar00] P. Kumar and M. Srivastava, "Predictive Strategies for Low-Power RTOS Scheduling", in Proc. IEEE Int. Conf. On Computer Design, September 2000.

[Shin00] Y. Shin, K. Choi, and T. Sakurai, "Power Optimization of Real-Time Embedded Systems on Variable-Speed Processors", in Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 365-368, November 2000.

[Nachtergaele00] L. Nachtergaele, V. Tiwari, and N. Dutt, "System and Architecture-Level Power Reduction for Microprocessor-Based Communication and Multi-Media Applications", in Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 569-573, November 2000.

[Rabaey00] J. M. Rabaey, M. Potkonjak, F. Koushanfar, S.-F. Li, and T. Tuan, "Challenges and Opportunities in Broadband and Wireless Communication Design", Proc. IEEE/ACM Int. Conf. On Computer-Aided Design, pp. 76-82, November 2000.

(src: A. Raghunathan, NEC)