SLIDE 1

29.08.17

Powernightmares: The Challenge of Efficiently Using Sleep States on Multi-Core Systems

Thomas Ilsche, Marcus Hähnel, Robert Schöne, Mario Bielert, and Daniel Hackenberg Technische Universität Dresden

5th Workshop on Runtime and Operating Systems for the Many-core Era

SLIDE 2

Collaborative Research Center 912: HAEC − Highly Adaptive Energy-Efficient Computing

Observation

• Systems with continuous energy measurement
• Tuned for low idle power consumption
• Prolonged phases of excessive power consumption during idle phases

“Powernightmare”

SLIDE 3

Background – Processor

• Each processor is a package
• A package comprises multiple cores
• Each core has two hardware threads
• A hardware thread is called a CPU

Image source: http://download.intel.com/pressroom/kits/45nm/penryn_dualcore_txt.jpg

SLIDE 4

Background – C-states

• Idle power conservation
• Increasing latency with deeper states
• Controllable per CPU, but applied per core
• Package C-state determined by the lowest (shallowest) core C-state
• Effective use is essential for low idle power

Figure: Package C-state TDP of Intel Xeon E5-2690 v3 (package TDP in watts). C0: 135 W, C1E: 38 W, C3: 30 W, C6: 13 W. A lower C-state means shallower sleep; a higher C-state means deeper sleep.
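The rule that the shallowest core determines the package C-state can be sketched in a few lines of C. This is an illustration only: the enum, the function name package_cstate and the lookup table are hypothetical names introduced here, and the wattages are simply the TDP values from the chart, not measured power.

```c
#include <assert.h>

/* C-states ordered from shallowest (0) to deepest (3), as on the chart. */
enum cstate { C0 = 0, C1E = 1, C3 = 2, C6 = 3 };

/* Package TDP in watts per package C-state, copied from the chart
 * (Intel Xeon E5-2690 v3). Index with an enum cstate value. */
static const int package_tdp_watts[] = { 135, 38, 30, 13 };

/* The package can only sleep as deeply as its shallowest core,
 * so the package C-state is the minimum over all core C-states. */
static enum cstate package_cstate(const enum cstate *cores, int n_cores)
{
    assert(n_cores > 0);
    enum cstate shallowest = cores[0];
    for (int i = 1; i < n_cores; i++) {
        if (cores[i] < shallowest)
            shallowest = cores[i];
    }
    return shallowest;
}
```

With eleven cores in C6 and a single core left in C1E, package_cstate returns C1E, so the package budget in this model is 38 W instead of 13 W. One core in a shallow state is enough to keep the whole package from its deepest state.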

SLIDE 5

Background – Linux idle governor

• Selects C-state for CPU
• ladder_governor gradually changes C-state
• menu_governor is based on a heuristic
• Heuristic used to predict idle time
  ◦ Next timer event with correction factor
  ◦ Repeatable interval detector (up to 8 data points)
  ◦ Latency requirement
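To make the interplay of the next timer event and the repeatable interval detector concrete, here is a minimal sketch in C. It is not the kernel's implementation: the function names, the 25 % tolerance and the sample values are assumptions for illustration, and the correction factor and latency requirement are left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INTERVALS 8 /* the detector uses up to 8 data points */

/* Simplified stand-in for a repeatable interval detector.
 * Returns the mean of the recent idle intervals if every sample lies
 * within 25 % of that mean (an assumed criterion), otherwise -1. */
static int64_t typical_interval_us(const int64_t *recent_us, int n)
{
    assert(n > 0 && n <= MAX_INTERVALS);
    int64_t sum = 0;
    for (int i = 0; i < n; i++)
        sum += recent_us[i];
    int64_t mean = sum / n;
    for (int i = 0; i < n; i++) {
        int64_t diff = recent_us[i] > mean ? recent_us[i] - mean
                                           : mean - recent_us[i];
        if (diff * 4 > mean)
            return -1; /* no repeatable pattern found */
    }
    return mean;
}

/* Predicted idle time: the next timer event, unless the detector
 * found a shorter repeating interval. */
static int64_t predict_idle_us(int64_t next_timer_us,
                               const int64_t *recent_us, int n)
{
    int64_t typical = typical_interval_us(recent_us, n);
    if (typical >= 0 && typical < next_timer_us)
        return typical;
    return next_timer_us;
}
```

In this model, eight recent idle intervals of about 60 µs lead to a predicted idle time of 60 µs even when the next timer is 10 s away, so a shallow C-state would be selected for what turns out to be a long idle phase.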

SLIDE 6

Investigation – lo2s

• Uses Linux’ perf infrastructure
• Creates a trace combining
  ◦ Active processes using the sched_switch trace point
  ◦ Selected C-state using the cpu_idle trace point
  ◦ External power measurements
  ◦ C-state residency using x86_adapt
• Available at https://github.com/tud-zih-energy/lo2s

SLIDE 7

Investigation – lo2s

Figure: Vampir showing a lo2s trace of a parallel build using make. Annotated areas: C-state of the cores, power measurement, scheduled processes.

SLIDE 8

Investigation – Powernightmare

• Up to 3 wakeups needed for correction after a misprediction by the heuristic

Figure: Full duration of a Powernightmare: C-states, system and socket power
Figure: Zoomed begin of Powernightmare: scheduled tasks, C-states and socket power

SLIDE 9

Triggering the issue

• Code to reliably trigger a Powernightmare

    #include <unistd.h> /* usleep, sleep */

    /* Build with OpenMP enabled, e.g. gcc -fopenmp. */
    int main(void)
    {
        #pragma omp parallel
        {
            #pragma omp barrier
            while (1) {
                /* Eight short, synchronized sleeps on every thread ... */
                for (int i = 0; i < 8; i++) {
                    #pragma omp barrier
                    usleep(10);
                }
                /* ... followed by one long sleep of 10 seconds. */
                sleep(10);
            }
        }
    }

SLIDE 10

Approaching the problem

• Changing task behavior
• Improving the idle time prediction
• Biasing the prediction error
• C-state selection by hardware
• Mitigating the impact

SLIDE 11

Impact mitigation approach

• Set a wakeup timer if there is a huge difference between the next known timer and the predicted idle time

Prediction correct
• Wakeup event in predicted time interval
• Cancel timer

Prediction incorrect
• Timer triggers wakeup
• Ignore recent residency
• Enter high C-state
✔ Misprediction corrected
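The two branches of this approach can be sketched as a small decision routine in C. This is a sketch under stated assumptions, not the proposed kernel patch: all names are hypothetical, the 10 ms timeout is borrowed from the "10 ms of shallow sleep" mentioned on the following slide, and the factor that defines a "huge difference" is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FALLBACK_TIMEOUT_US 10000 /* assumed: 10 ms fallback timer */
#define HUGE_FACTOR 100           /* assumed meaning of "huge difference" */

/* Returns the fallback timeout to arm before entering a shallow state,
 * or 0 if the prediction and the next known timer are close enough. */
static int64_t fallback_timeout_us(int64_t next_timer_us, int64_t predicted_us)
{
    assert(next_timer_us >= 0 && predicted_us >= 0);
    if (predicted_us < FALLBACK_TIMEOUT_US &&
        next_timer_us > FALLBACK_TIMEOUT_US &&
        next_timer_us / HUGE_FACTOR > predicted_us)
        return FALLBACK_TIMEOUT_US;
    return 0;
}

enum wake_reason { WAKE_EVENT, WAKE_FALLBACK_TIMER };

struct idle_action {
    bool cancel_timer;            /* prediction was correct */
    bool ignore_recent_residency; /* do not feed this interval to the heuristic */
    bool enter_deep_state;        /* go to a high (deep) C-state next */
};

/* Reaction to a wakeup while a fallback timer is armed. */
static struct idle_action on_wakeup(enum wake_reason reason)
{
    struct idle_action action = { false, false, false };
    if (reason == WAKE_EVENT) {
        /* A wakeup event arrived within the predicted interval. */
        action.cancel_timer = true;
    } else {
        /* The fallback timer fired: the prediction was wrong. */
        action.ignore_recent_residency = true;
        action.enter_deep_state = true;
    }
    return action;
}
```

With a predicted idle time of 60 µs and the next known timer 10 s away, fallback_timeout_us arms the 10 ms timer. If that timer is what wakes the CPU, on_wakeup asks for a deep C-state and keeps the mispredicted interval out of the history.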

SLIDE 12

Powernightmare with timer

• Fallback timer corrects wrong C-state selection
• Only 10 ms of shallow sleep

Figure: Reduced impact of Powernightmare with active fallback timer

SLIDE 13

Verification

• Measurements taken over 20 minutes
• Trigger workload every 10 seconds

Figure: Power consumption during idle and trigger workload

SLIDE 14

Production servers?

• Found on a node of the production HPC system “taurus”
• Lustre-related pattern every 25 seconds
• Triggers a one-second Powernightmare

Figure: Scheduling of Lustre-related kernel tasks
Figure: After the Lustre ping, several cores remain in C1

SLIDE 15

Summary

• Analyzed pattern of inefficient use of sleep states
• Developed a methodology and tools to observe it
• Investigation shows misprediction in idle governor
• Proposed solution to mitigate effect
• Discussion with Linux community initiated
• Increasing probability with rising number of cores
• Effect not limited to HPC systems

SLIDE 16

Any questions?