CloudMirror: Tenant Network Abstraction that Reflects Applications’ Needs (PowerPoint PPT Presentation)


SLIDE 1

CloudMirror: Tenant Network Abstraction that Reflects Applications’ Needs

Myungjin Lee University of Edinburgh

In collaboration with: Jeongkeun “JK” Lee, Lucian Popa, Bryan Stephenson, Yoshio Turner, Sujata Banerjee, Puneet Sharma

SLIDE 2

Need Bandwidth Guarantees for Predictable Performance

[Chart: Web response time (msec, 500–2500) vs. bandwidth provisioning (100%, 92%, 83%, 79%)]

  • Big-data applications require high bandwidth
    • Hadoop Sort needs ~500 Mbps
  • Web services have stringent latency requirements
    • Amazon: “Every 100ms latency costs 1% in sales”
    • Insufficient bandwidth leads to a sharp increase in response time

(Wikipedia benchmark: ~250 ms response time when bottleneck-free; 2 s browser timeout)

SLIDE 3

No Bandwidth Guarantees

  • Weak or no network SLAs in public clouds
    • HP Cloud, Amazon EC2, Rackspace, Azure…

[Screenshots: HP Cloud and Amazon EC2 instance-type listings]

SLIDE 4

Goal: Network Abstraction for Expressing Bandwidth Demands

  • Challenge: applications’ complex communication patterns

[Figure: communication graph of the MS Bing.com datacenter; source: Bodik et al., SIGCOMM’12]

SLIDE 5

Solution: CloudMirror

  • 1. A new abstraction for bandwidth guarantees: the Tenant Application Graph (TAG)
  • 2. A VM placement algorithm that efficiently utilizes network & compute resources

                                 Pipe       Virtual Cluster (VC)   VOC (2-level VC)   TAG
  Ease of use                    ✗          ✓                      ✗                  ✓
  Flexibility                    ✗          ✓                      ✓                  ✓
  Efficiency                     ✗          ✗                      ✗                  ✓
  Algorithm run time for 1K VMs  > 10 mins  < 1 sec                < 1 sec            < 1 sec

TAG achieves ~2X bandwidth efficiency.

SLIDE 6

Pipe Model

  • Specifies every VM-to-VM communication
    • O(n²) pipes, where n is the number of VMs
  • Slow: O(n⁴) algorithm run time
  • Lacks statistical multiplexing
  • Inflexible and inefficient

[Diagram: two DB VMs each reserve a pipe of bandwidth B to a web VM, so 2·B is reserved in total, while the actual aggregate demand is only B]
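The quadratic blow-up described above can be tallied in a few lines. This is a rough sketch with assumed numbers and a hypothetical helper name, not code from the talk:

```python
# Sketch: total bandwidth reserved by the pipe model for n VMs,
# assuming every ordered VM pair communicates and each pipe
# reserves `bw_per_pipe` (hypothetical uniform demand).
def pipe_model_reservation(n, bw_per_pipe):
    # One directed pipe per ordered VM pair -> O(n^2) pipes.
    num_pipes = n * (n - 1)
    return num_pipes * bw_per_pipe

# With 100 VMs and 10 Mbps per pipe, 9900 pipes are reserved.
print(pipe_model_reservation(100, 10))  # 99000
```

Because each pipe is reserved independently, none of this capacity can be shared across flows, which is the lack of statistical multiplexing the slide points out.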

SLIDE 7

Virtual Cluster Model

  • Hose model [Duffield, SIGCOMM’99]: all VMs connected to a single virtual switch
  • Pros
    • Per-VM bandwidth: statistical multiplexing
    • Easy to map onto a physical topology
  • Cons
    • Doesn’t capture communication patterns accurately
    • Leads to inefficient bandwidth reservation

[Diagram: VMs X, Y, Z of one tenant attached to a virtual switch with bandwidth guarantees BX, BY, BZ]
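The hose model's reservation rule can be sketched as follows; the helper name and the example numbers are assumptions for illustration, not from the talk:

```python
# Sketch of the hose-model rule: on any physical link, the bandwidth to
# reserve is the smaller of the total per-VM guarantees on either side,
# since traffic crossing the link cannot exceed either side's aggregate.
def hose_link_reservation(bws_left, bws_right):
    return min(sum(bws_left), sum(bws_right))

# Tenant VMs X, Y, Z with guarantees 100, 200, 300 Mbps; a link that
# separates {X, Y} from {Z} needs min(100 + 200, 300) = 300 Mbps.
print(hose_link_reservation([100, 200], [300]))  # 300
```

This per-side aggregation is what gives the hose model its statistical multiplexing, and also why it is easy to map onto a tree topology: each link cut defines the two sides directly.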

SLIDE 8

Virtual Cluster Example

3-tier web example. B: per-VM per-edge bandwidth; N: number of VMs in each tier.

[Diagram: Virtual Cluster modeling of Web(N), App(N), and DB(N) tiers; physical deployment places Web + App under one subtree and DB under another, joined by link L2]

Because the Virtual Cluster assigns one uniform per-VM bandwidth, every VM must be given 2B (the App tier’s demand: B toward Web plus B toward DB). The reservation at L2 is therefore 2B·N, while the actual App–DB demand crossing L2 is only B·N: 2X bandwidth usage by Virtual Cluster.

SLIDE 9

Virtual Oversubscribed Cluster (VOC) [Ballani, SIGCOMM’11]

  • 2-level hierarchical virtual cluster
  • Also inefficient; doesn’t accurately capture general application structure

[Diagram: groups of NX, NY, NZ VMs with per-VM bandwidths Bx, By, Bz, each group an Oversubscribed Virtual Cluster attached to a Root Virtual Switch]

SLIDE 10

Intuition: Model the Application, Not the Network

Prior work models virtual networks; our work, TAG, models the application itself.

SLIDE 11

Tenant Application Graph (TAG)

  • TAG is a directed graph
  • Each vertex represents an application component
    • Component: a set of VMs (or JVMs) performing the same function
  • Each directed edge carries the per-VM sending and receiving bandwidth demands
  • Example: each web VM is guaranteed bandwidth B1 for sending traffic to any VMs in the DB tier

[Diagram: vertices web(N1) and DB(N2) joined by an edge labeled B1 (send) and B2^in (receive)]

SLIDE 12

Bandwidth Models in TAG

  • A directed edge between two vertices → a Virtual Trunk
  • A self-edge → a Virtual Cluster

Total guarantee of trunk T1→2 = min(B1·N1, B2·N2)

[Diagram: Web(N1) sends at B1 per VM into virtual trunk T1→2 via a virtual switch; DB(N2) receives at B2^in per VM]

SLIDE 13

TAG is Intuitive

  • TAG is easy to use because it directly mirrors application structure
  • Users don’t need to be concerned with the network topology
  • VOC requires the user to specify an oversubscription ratio, which is unclear for the 3-tier example

[Diagram: 3-tier example Web(N), App(N), DB(N) with per-edge bandwidth B, and its TAG modeling; the VOC oversubscription ratio is left as a question mark]

SLIDE 14

TAG is Efficient

  • Accurately captures communication patterns
  • TAG requires less than or equal bandwidth compared to VOC

[Diagram: TAG modeling of the 3-tier example Web(N), App(N), DB(N) with per-edge bandwidth B; in the physical deployment, Web + App sit in one subtree and DB in another, with only B·N reserved on the cross-subtree link]
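The efficiency gain over the Virtual Cluster model for this 3-tier example can be checked with a small sketch; the helper names and the uniform-bandwidth reasoning for VC are assumptions based on the slides' labels:

```python
# Sketch: bandwidth reserved on the link between the {Web, App} subtree
# and the {DB} subtree; N VMs per tier, per-edge per-VM bandwidth B.
def tag_cross_link(n, b):
    # TAG reserves only the App->DB virtual trunk across the link.
    return b * n

def vc_cross_link(n, b):
    # Virtual Cluster assigns one uniform per-VM bandwidth; the App tier
    # needs 2B (B toward Web plus B toward DB), so every VM reserves 2B.
    # Hose rule: 2B * min(2N, N) = 2B * N.
    return (2 * b) * min(2 * n, n)

N, B = 100, 1.0
print(vc_cross_link(N, B) / tag_cross_link(N, B))  # 2.0 -> VC uses 2X the BW
```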

SLIDE 15

CloudMirror Operation

TAG input → VM placement → bandwidth reservation, using the network topology, the current bandwidth reservation state, and the available VM slots per host (e.g., host1: 10, host2: 50, host3: 25).

[Diagram: a Web–DB TAG being placed onto the datacenter]

SLIDE 16

VM Placement

  • Goal
    • Map a graph-based TAG onto a tree-shaped topology
    • Deploy as many TAGs as possible while guaranteeing SLAs
  • Principle: maximize consolidation
    1) Localize traffic and save core bandwidth
      • Place a tenant under the smallest feasible subtree [Ballani, SIGCOMM’11]
      • Pack tiers with high inter-tier bandwidth: sized min-cut problem
    2) Fully utilize network & compute resources
      • Place high-BW and low-BW VMs together: knapsack problem

[Diagram: example TAG with Web(1), App(1), DB(1), and Cache(1) components and inter-component bandwidths; VMs W, A, C, D consolidated onto hosts]
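The first principle above can be illustrated with a toy sketch. The data model and helper are hypothetical; the real CloudMirror algorithm also checks link bandwidth and solves the min-cut and knapsack subproblems mentioned on this slide:

```python
# Toy sketch of "place the tenant under the smallest feasible subtree":
# recurse to the deepest subtree with enough free VM slots, which
# localizes traffic and saves core bandwidth.
class Subtree:
    def __init__(self, slots, children=()):
        self.children = list(children)
        # Total free VM slots in this subtree (own slots + descendants').
        self.slots = slots + sum(c.slots for c in self.children)

def smallest_feasible(node, vms_needed):
    # Prefer a feasible child (smaller subtree) over the node itself.
    for child in node.children:
        found = smallest_feasible(child, vms_needed)
        if found is not None:
            return found
    return node if node.slots >= vms_needed else None

# Slot counts taken from the earlier slide: host1: 10, host2: 50, host3: 25.
hosts = [Subtree(10), Subtree(50), Subtree(25)]
root = Subtree(0, hosts)
# A tenant needing 30 VMs fits entirely on the 50-slot host.
print(smallest_feasible(root, 30) is hosts[1])  # True
```

If no single host fits, the search falls back to the enclosing subtree, which is when inter-tier bandwidth must be reserved on upper-level links.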

SLIDE 17

Evaluations

  • Methodology
    • Simulating bandwidth reservations and VM placement given a stream of tenant arrivals
  • Microsoft Bing.com data
    • Various communication patterns
    • Component size: 1–300 VMs
    • Tenant: a set of connected components
  • 3-level tree topology
    • Modeled after a real HP datacenter
    • 2048 hosts, 50 VM slots per host

Source: [P. Bodik et al., SIGCOMM’12]

SLIDE 18

Results

  • Bandwidth usage
    • Assuming no network bottleneck, Virtual Cluster consumes 76% more bandwidth than TAG
  • VM slot utilization vs. network capacity
    • Deploy tenants one by one until the first tenant rejection

[Chart: VM slot utilization vs. network capacity for TAG and Virtual Cluster]

SLIDE 19

Conclusion

  • TAG models application structure, not physical topology
    • Graph-based
    • Easy to use, efficient, and flexible
  • The placement algorithm efficiently maps TAGs onto tree-shaped topologies

  • Blurb: SICSA Software Defined Networking Workshop
    • Tentative date: mid/late Sept.
    • A half-day event with invited talks, panel discussion, etc.
    • More details will be announced via the NGN mailing list

E-mail: myungjin.lee@ed.ac.uk