Outin, Edouard, et al. "Enhancing cloud energy models for optimizing datacenters efficiency." Cloud and Autonomic Computing (ICCAC), 2015 International Conference on. IEEE, 2015. Reviewed by Cristopher Flagg, December 6, 2017


slide-1
SLIDE 1

Outin, Edouard, et al. "Enhancing cloud energy models for optimizing datacenters efficiency." Cloud and Autonomic Computing (ICCAC), 2015 International Conference on. IEEE, 2015.

Reviewed by Cristopher Flagg December 6, 2017

slide-2
SLIDE 2

Objective

  • Minimize Energy Consumption
  • Maintain SLA requirements
  • Nontrivial Multi-Objective Optimization Problem

      ○ Genetic algorithm to optimize Cloud energy consumption
      ○ Machine learning to improve fitness function

slide-3
SLIDE 3

Fitness Function - Research Questions

  • Depends on the underlying model
  • RQ1. Do differences exist between the energy simulation based on hardware specifications and the real data that can be observed?
  • RQ2. Could we use machine learning techniques at runtime to improve the simulation accuracy?

slide-4
SLIDE 4

Problem Statement

  • Simulation used to model datacenter consumption
  • Accuracy of simulation drives accuracy of modeling
  • Models used in "Analysis" step of MAPE-K
  • Based on Standard Performance Evaluation Corporation (SPEC) benchmarks of power consumption

slide-5
SLIDE 5

Problem Statement - CloudSim

  • “to provide a generalized and extensible simulation framework that enables modeling, simulation, and experimentation of emerging Cloud computing infrastructures and application services”

  • Energy model is based on the host CPU utilization
slide-6
SLIDE 6

Problem Statement - GreenCloud

  • Packet-level simulator with a strong emphasis on networking and energy awareness.
  • Independent energy models for each type of resource (e.g. CPU, RAM, disk, network).
  • Determining coefficients for the models is complex, and they cannot be approximated.

slide-7
SLIDE 7

Problem Statement - SimGrid

  • Study the behavior of large-scale distributed systems such as Grids, Clouds, HPC or P2P systems
  • SURF Energy Plugin enables accounting for computation time and dissipated energy
  • Assumes energy consumption is linear with the CPU utilization
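The linear assumption in the last bullet can be illustrated with a minimal sketch. The function names and the idle and peak wattages used in the usage note are hypothetical placeholders, not SimGrid API calls or spec.org figures.

```python
def linear_power(utilization: float, p_idle: float, p_max: float) -> float:
    """Power draw in watts, assumed linear in CPU utilization (0.0 to 1.0)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    # Idle power plus a share of the idle-to-peak range proportional to load.
    return p_idle + (p_max - p_idle) * utilization


def energy_joules(utilization: float, seconds: float,
                  p_idle: float, p_max: float) -> float:
    """Energy dissipated at a constant utilization over a time span."""
    return linear_power(utilization, p_idle, p_max) * seconds
```

With illustrative values `p_idle=100` and `p_max=200`, `linear_power(0.5, 100, 200)` gives 150 W. A model of this shape depends on CPU utilization only, which is the property the review goes on to question.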

slide-8
SLIDE 8

Problem Statement - iCanCloud

  • Predict the trade-offs between cost and performance of a given set of applications executed on specific hardware
  • Supports modeling the energy consumption of system hardware such as CPUs, memory, disks, and PSUs.

  • Based on predefined collections of applications
slide-9
SLIDE 9

Problem Statement - Summary

  • Simulators are used in the classical analysis step of a MAPE-K loop
  • Analysis step uses hard-coded "static" rules, also called Event-Condition-Action (ECA) engines
  • This paper uses and manipulates simulators instead of the ECA engine.

slide-10
SLIDE 10

Problem Statement - Experimental Protocol

  • Google Scholar to identify most cited simulator (CloudSim)
  • Simulators based on the spec.org values for the DELL PowerEdge R620

  • Request the PDU metrics for this server through SNMP
  • Stress tools to mimic variable server utilization (stress-ng)
  • Two experiments on fresh Ubuntu Server 14.04.2 LTS
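A stress run like the one described above could be scripted roughly as follows. This is a sketch under assumptions: the slides do not give the actual stress-ng invocations, so the worker count, load levels, and the 120-second duration (borrowed from the averaging interval on the bare-metal slide) are illustrative. The code only builds the command lines and does not execute them.

```python
from typing import List


def stress_ng_command(cpu_workers: int, cpu_load_percent: int,
                      timeout_seconds: int) -> List[str]:
    """Build a stress-ng command that holds the CPU at a target load."""
    if not 0 <= cpu_load_percent <= 100:
        raise ValueError("cpu_load_percent must be between 0 and 100")
    return [
        "stress-ng",
        "--cpu", str(cpu_workers),            # number of CPU stressor workers
        "--cpu-load", str(cpu_load_percent),  # target load per worker, in percent
        "--timeout", f"{timeout_seconds}s",   # stop after this many seconds
    ]


def load_sweep(cpu_workers: int, levels: List[int],
               timeout_seconds: int) -> List[List[str]]:
    """One command per utilization level, to be run one after another."""
    return [stress_ng_command(cpu_workers, level, timeout_seconds)
            for level in levels]
```

Each command list can be passed to `subprocess.run` on the test server while the PDU is polled in parallel.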
slide-11
SLIDE 11

Problem Statement - Bare Metal

  • No hypervisor: directly stressing the host operating system
  • Average energy consumption over a 120-second interval

slide-12
SLIDE 12

Problem Statement - Hypervisor and VM

  • KVM hypervisor with single large Ubuntu VM
  • When idle, non-negligible gap between spec.org and measured value

slide-13
SLIDE 13

Problem Statement - RQ1 Revisited

  • CloudSim simulation values not very accurate (based on the spec.org data)
  • Cannot rely on the CPU metric to predict the watts consumed.

slide-14
SLIDE 14

Approach

  • Monitor managed elements of Cloud infrastructure.
  • Analysis determines changes needed to bring the system into the ideal state
      ○ More energy-efficient
      ○ No SLA violations
      ○ High performance

slide-15
SLIDE 15

Approach

slide-16
SLIDE 16

Approach

  • Genetic algorithm manipulates a Cloud configuration instanced as a model
  • Fitness function designed to evaluate the energy consumption (goal of paper)

  • Plan and execute changes from best instance
slide-17
SLIDE 17

Approach - Cloud Model

  • Model inspired by previous experiments
  • Model is a mapping of
      ○ virtual machine placement
      ○ SLA constraints
      ○ different hosts load

  • Allows mutations, crossovers and validity checks
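The operations listed above (mutation, crossover, validity check, fitness evaluation) can be sketched as a small genetic algorithm over VM placements. This is not the paper's implementation: the list-based placement encoding, the capacity check standing in for SLA constraints, and every function name here are assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence


def host_loads(placement: Sequence[int], vm_load: Sequence[float],
               n_hosts: int) -> List[float]:
    """placement[vm] is the index of the host running that VM."""
    loads = [0.0] * n_hosts
    for vm, host in enumerate(placement):
        loads[host] += vm_load[vm]
    return loads


def is_valid(placement, vm_load, n_hosts, capacity: float = 1.0) -> bool:
    """Validity check: no host is loaded beyond its capacity."""
    return all(load <= capacity + 1e-9
               for load in host_loads(placement, vm_load, n_hosts))


def fitness(placement, vm_load, n_hosts,
            host_power: Callable[[float], float]) -> float:
    """Total power of hosts that run at least one VM (lower is better).
    Assumption: hosts with no VM are switched off and draw nothing."""
    return sum(host_power(load)
               for load in host_loads(placement, vm_load, n_hosts) if load > 0)


def mutate(placement, n_hosts, rng) -> List[int]:
    """Move one randomly chosen VM to a randomly chosen host."""
    child = list(placement)
    child[rng.randrange(len(child))] = rng.randrange(n_hosts)
    return child


def crossover(a, b, rng) -> List[int]:
    """One-point crossover of two placements."""
    cut = rng.randrange(1, len(a))
    return list(a[:cut]) + list(b[cut:])


def optimize(initial, vm_load, n_hosts, host_power,
             generations: int = 50, pop_size: int = 20, seed: int = 0):
    if len(initial) < 2 or not is_valid(initial, vm_load, n_hosts):
        raise ValueError("need a valid initial placement of at least two VMs")
    rng = random.Random(seed)
    population = [list(initial) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for _ in range(pop_size):
            a, b = rng.choice(population), rng.choice(population)
            child = mutate(crossover(a, b, rng), n_hosts, rng)
            if is_valid(child, vm_load, n_hosts):  # discard invalid offspring
                children.append(child)
        # Elitist selection: keep the lowest-power placements.
        population = sorted(
            population + children,
            key=lambda p: fitness(p, vm_load, n_hosts, host_power))[:pop_size]
    return population[0]
```

Because selection is elitist, the returned placement never consumes more than the initial one under the supplied `host_power` function. In the paper's setting that function would be the learned energy model rather than a fixed formula.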
slide-18
SLIDE 18

Approach - Cloud Model

  • Uses KMF modeling framework (modeling.kevoree.org)
  • Utilizes model generators
  • Stores time series of models
slide-19
SLIDE 19

Approach - Energy Consumption Model

  • OpenStack Ceilometer compute agent on each node
  • Forwards all the metrics to central agent for aggregation
  • Uses machine learning mechanisms to design a new energy model for the Cloud datacenter

  • Train our model beforehand
slide-20
SLIDE 20

Approach - Energy Consumption Model

Detailed sequence of actions performed by every compute node agent:

  • On the server, monitor CPU utilization, RAM usage, volume of disk reads and writes, and volume of network data received and sent.

  • With the PDU we get the corresponding energy consumed by the server
  • Every second we retrieve the metrics from the server and the PDU
  • Metrics collector stores tuple (%cpu, %ram, read, writes, recv, sent, Watts)
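The collection loop above can be sketched as follows. The `Sample` fields mirror the tuple on the slide. The two reader callables are hypothetical stand-ins for the server monitoring agent and the SNMP query to the PDU, neither of which the slides specify in detail.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Sample:
    """One row of training data: (%cpu, %ram, read, writes, recv, sent, Watts)."""
    cpu_percent: float
    ram_percent: float
    disk_read: float
    disk_write: float
    net_recv: float
    net_sent: float
    watts: float


# Hypothetical reader signatures; real ones would wrap the agent and the PDU.
ServerReader = Callable[[], Tuple[float, float, float, float, float, float]]
PduReader = Callable[[], float]


def collect(read_server: ServerReader, read_pdu: PduReader, n_samples: int,
            period_seconds: float = 1.0,
            sleep: Callable[[float], None] = time.sleep) -> List[Sample]:
    """Poll the server and the PDU once per period and store joined tuples."""
    samples: List[Sample] = []
    for i in range(n_samples):
        cpu, ram, read, write, recv, sent = read_server()
        watts = read_pdu()
        samples.append(Sample(cpu, ram, read, write, recv, sent, watts))
        if i < n_samples - 1:
            sleep(period_seconds)  # one-second period, as on the slide
    return samples
```

Passing `sleep` as a parameter keeps the loop testable without waiting in real time.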
slide-21
SLIDE 21

Approach - Energy Consumption Model

slide-22
SLIDE 22

Approach - Energy Consumption Model

  • Multivariate Adaptive Regression Spline (MARS)
  • Predict the values of a continuous dependent variable from a set of independent variables
  • Does not assume any particular type or class of relationship (e.g., linear, logistic, etc.) between the predictor variables and the dependent variable
  • $E_{total} = \sum_{host} \mathrm{predict}(host) + E_{network}$
  • Network energy usage does not change in proportion to traffic load; it is related to topology. The model assumes this is a static value
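The total-energy formula can be sketched in a few lines. MARS models are built from hinge functions of the form max(0, x - knot) and max(0, knot - x); the sketch below uses only additive hinge terms (no interaction products), and the coefficients in the usage note are hypothetical, not the ones fitted in the paper.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# One additive term: (coefficient, feature name, knot, direction).
# direction is +1 for max(0, x - knot) and -1 for max(0, knot - x).
Term = Tuple[float, str, float, int]


def hinge(x: float, knot: float, direction: int = 1) -> float:
    """MARS hinge basis function."""
    return max(0.0, direction * (x - knot))


def mars_predict(intercept: float, terms: List[Term],
                 features: Dict[str, float]) -> float:
    """Predicted host power as an intercept plus weighted hinge terms."""
    return intercept + sum(coef * hinge(features[name], knot, direction)
                           for coef, name, knot, direction in terms)


def total_energy(hosts: Iterable[Dict[str, float]],
                 predict: Callable[[Dict[str, float]], float],
                 e_network: float) -> float:
    """E_total = sum over hosts of predict(host) + E_network (static)."""
    return sum(predict(host) for host in hosts) + e_network
```

For example, with an intercept of 100 and a single term `(2.0, "cpu", 10.0, 1)`, a host at 30% CPU is predicted at 140, and a host at 5% CPU falls below the knot and is predicted at 100.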
slide-23
SLIDE 23

Experimental Protocol - Validation

  • Gather sparse data for predictions, representing different utilization levels of the server’s hardware (i.e. CPU, RAM, disk, network)

  • Cloud infrastructure mimics random / variable workloads
  • Stress-ng used to consume server resources
slide-24
SLIDE 24

Experimental Protocol - Sample Data

  • Training data gathered for a given host node
slide-25
SLIDE 25

Experimental Protocol - Energy model results

Ehost is the total energy consumption of a given host

  • cpu refers to the current host CPU utilization
  • ram refers to the current host RAM usage
  • sent denotes the volume of data sent over the network (in Kb)
slide-26
SLIDE 26

Conclusion - Analysis of Results

?

slide-27
SLIDE 27

Conclusion - Analysis of Results

The results look promising: we get an average error of 3.8% between the measured values and the predicted ones, which improves the accuracy compared to CloudSim. This result allows us to answer RQ2 positively.
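The slides do not say how the average error is computed. One plausible reading, assumed here, is a mean absolute percentage error between measured and predicted watts:

```python
from typing import Sequence


def mean_absolute_percentage_error(measured: Sequence[float],
                                   predicted: Sequence[float]) -> float:
    """Average of |measured - predicted| / |measured|, in percent."""
    if len(measured) != len(predicted) or len(measured) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(m - p) / abs(m) for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)
```

Applying the same function to the CloudSim estimates and to the learned model's predictions, against the same PDU readings, would give a like-for-like comparison.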
slide-28
SLIDE 28

Conclusion - Threats to Validity

  • Disk I/O is NOT a dominant feature in the prediction equation computed by the MARS algorithm
      ○ Volume of disk operations was quite constant
      ○ Pure sequential disk access is not realistic

  • NO live migration energy overhead considered
slide-29
SLIDE 29

Conclusion - Questions

  • CloudSim is CPU only. GreenCloud takes disks and RAM into account as well, but was not reviewed

  • No results, no analysis of missing results