Leveraging Heterogeneity to Reduce the Cost of Data Center Upgrades
Andy Curtis
Joint work with: S. Keshav, Alejandro López-Ortiz, Tommy Carpenter, Mustafa Elsheikh
University of Waterloo
Motivation
Data centers are a critical part of IT infrastructure
Data centers constantly evolve
midst of data center expansion projects or have just completed a new facility
house
http://www.datacenterknowledge.com/archives/2010/08/16/data-center-industry-expansion-in-full-swing/
Network upgrade motivation
Al-Fares et al., MDCube
Existing topologies are not flexible enough
Goal
It should be easy and cost-effective to add capacity to a data center network
Challenging problem
Problem 1
heterogeneous topologies
Problem 2
Problem 1
based on rigid constructions
Solutions:
Two solutions:
LEGUP: output is a heterogeneous Clos network
[Curtis, Keshav, López-Ortiz; CoNEXT 2010]
REWIRE: designs unstructured DCN topologies
[Curtis et al.; INFOCOM 2012]
LEGUP in brief:
LEGUP designs upgraded/expanded networks for legacy data center networks
[Figure: LEGUP input network and output network]
Difficult optimization problem
First pass: limit solution space by finding
Clos networks
This is a physical realization of a Clos network
[Figure: Clos network with ToR, aggregation, and core layers, connected to the Internet]
Clos networks
We can find a logical topology for this network
[Figure: logical topology with labels 4 4 4 4 4 4 4 4 and 16 16]
Heterogeneous Clos networks
Logical topology is a forest
[Figure: logical forest with labels 2 8 8 8 8 2]
Theoretical contributions
Lemma 1: How to construct all optimal logical forests for a set of switches
Lemma 2: How to build a physical realization from a logical forest
Theorem: A characterization of heterogeneous Clos networks
This is the first optimal heterogeneous topology
*optimal = uses the same link capacity as an equivalent stage Clos network
Problem 1
heterogeneous topologies
Problem 2
more later...
Problem 1
heterogeneous topologies
Problem 2
heterogeneous Clos
Problem 2
Upgraded network should:
makes sense
Approach: use optimization
LEGUP algorithm
number ToRs
LEGUP summary
spend less than half as much money as a fat-tree for same performance
Can we do better with unstructured networks?
Problem
Approach
solution
REWIRE
Uses simulated annealing to find a network that:
Subject to:
(thermal, power, space)
Bisection bandwidth - Diameter
Costs = new cables + moved cables + new switches
Simulated annealing algorithm
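The annealing loop can be illustrated with a minimal sketch. This is not the REWIRE implementation: it assumes a toy objective that uses only the diameter term of "Bisection bandwidth - Diameter", and a hypothetical `max_edges` limit standing in for the budget constraint. The names `diameter`, `anneal`, `t0`, and `cooling` are illustrative choices, not taken from the talk.

```python
import math
import random
from collections import deque
from itertools import combinations


def diameter(n, edges):
    """Longest shortest-path hop count over nodes 0..n-1 (inf if disconnected)."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    worst = 0
    for src in range(n):
        dist = {src: 0}
        queue = deque([src])
        while queue:  # breadth-first search from src
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            return math.inf
        worst = max(worst, max(dist.values()))
    return worst


def anneal(n, start_edges, max_edges, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Toggle one candidate link per step; accept worse moves with a
    probability that shrinks as the temperature cools."""
    rng = random.Random(seed)
    candidates = list(combinations(range(n), 2))  # all (u, v) with u < v

    def score(edges):
        return -diameter(n, edges)  # ASSUMPTION: toy objective, diameter only

    current = set(start_edges)
    cur_score = score(current)
    best, best_score = set(current), cur_score
    temp = t0
    for _ in range(steps):
        proposal = current ^ {rng.choice(candidates)}  # add or remove one link
        if len(proposal) <= max_edges:  # ASSUMPTION: stand-in for the budget
            new_score = score(proposal)
            delta = new_score - cur_score
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, cur_score = proposal, new_score
                if cur_score > best_score:
                    best, best_score = set(current), cur_score
        temp *= cooling
    return best, best_score


# Start from a 6-switch path (diameter 5) and allow at most 7 links.
path = [(i, i + 1) for i in range(5)]
best, best_score = anneal(6, path, max_edges=7)
print(sorted(best), -best_score)
```

Tracking the best network seen so far means the result is never worse than the starting network, even though the walk itself may accept worse moves along the way.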
No known algorithm to find the bisection bandwidth of an arbitrary network!
Bisection bandwidth computation
Easy for a single cut
bw(S,S') = link cap(S,S') / min { server rates(S), server rates(S') }
bw(S,S') = 4 / min { 2, 6 } = 2
Then bisection bandwidth is the min over all cuts
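The definition can be checked by brute force on a small network. The sketch below enumerates every cut, so it is exponential in the number of nodes and only suitable for tiny examples. The four-node topology, the node names, and the functions `cut_bandwidth` and `bisection_bandwidth` are hypothetical, chosen so that the cut S = {a, b} has link capacity 4 and server rates 2 and 6.

```python
from itertools import combinations


def cut_bandwidth(side, links, rates):
    """bw(S,S') = link capacity crossing the cut / min(server rates on each side)."""
    side = set(side)
    other = set(rates) - side
    crossing = sum(cap for (u, v), cap in links.items()
                   if (u in side) != (v in side))
    demand = min(sum(rates[v] for v in side), sum(rates[v] for v in other))
    # ASSUMPTION: a side with no server traffic places no constraint on the cut.
    return crossing / demand if demand > 0 else float("inf")


def bisection_bandwidth(links, rates):
    """Minimum cut bandwidth over all cuts (exhaustive enumeration)."""
    nodes = sorted(rates)
    first, rest = nodes[0], nodes[1:]
    best = float("inf")
    # Fix `first` on side S so each cut is counted once; k < len(rest)
    # guarantees that S' is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            best = min(best, cut_bandwidth((first, *extra), links, rates))
    return best


# Hypothetical example: link capacities keyed by (u, v), server rate per node.
links = {("a", "b"): 10, ("c", "d"): 10, ("a", "c"): 2, ("b", "d"): 2}
rates = {"a": 1, "b": 1, "c": 3, "d": 3}

print(cut_bandwidth({"a", "b"}, links, rates))  # 4 / min(2, 6) -> 2.0
print(bisection_bandwidth(links, rates))        # -> 2.0
```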
Bisection bandwidth computation
are O(n) cuts
Exponentially many cuts on arbitrary topologies
Need: a min-cut, max-flow type theorem for multi-commodity flow
[Figure: flow examples with endpoints s, t and s1, t1, s2, t2, s3]
Theorem [Curtis and López-Ortiz, INFOCOM 2009]:
A network can feasibly route all traffic matrices feasible under the server NIC rates using multipath routing iff all its cuts have bandwidth ≥ a sum dependent
We can compute the αi values using linear programming
[Kodialam et al. INFOCOM 2006]
These two theoretical results give us a polynomial-time algorithm to find the bisection bandwidth of an arbitrary network
Evaluation
How much performance do we gain with heterogeneous network equipment?
Evaluation
data center as input
Evaluation: input
So, we add thermal constraints modeling this
[Figure: cold/hot aisle airflow, showing the chiller, hot aisle, and cold aisle]
Evaluation: cost model
Rate          Short ($)  Medium ($)  Long ($)
1 Gb          5          10          20
10 Gb         50         100         200
Install cost  10         20          50

1 Gb ports  10 Gb ports  Watts  Cost ($)
24          -            100    250
48          -            150    1,500
48          4            235    5,000
-           24           300    6,000
-           48           600    10,000
-           144          5000   75,000
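As a rough illustration of how these prices combine into "Costs = new cables + moved cables + new switches", here is a small sketch. It assumes the first table prices cables by rate and length, that a new cable costs its purchase price plus the install cost, and that a moved cable costs the install cost only. The talk does not spell out these rules, and the function name `upgrade_cost` is hypothetical.

```python
# Cable prices in dollars, from the cost model, keyed by rate then length.
CABLE_COST = {
    "1Gb": {"short": 5, "medium": 10, "long": 20},
    "10Gb": {"short": 50, "medium": 100, "long": 200},
}
INSTALL_COST = {"short": 10, "medium": 20, "long": 50}


def upgrade_cost(new_cables, moved_cables, new_switch_prices):
    """Total upgrade cost in dollars.

    new_cables: list of (rate, length) pairs.
    moved_cables: list of lengths.
    new_switch_prices: list of switch prices.
    ASSUMPTION: new cable = purchase + install; moved cable = install only.
    """
    new = sum(CABLE_COST[rate][length] + INSTALL_COST[length]
              for rate, length in new_cables)
    moved = sum(INSTALL_COST[length] for length in moved_cables)
    return new + moved + sum(new_switch_prices)


# One short 1 Gb cable, one long 10 Gb cable, one moved medium cable,
# and one 24-port 1 Gb switch priced at $250.
total = upgrade_cost([("1Gb", "short"), ("10Gb", "long")], ["medium"], [250])
print(total)  # (5 + 10) + (200 + 50) + 20 + 250 -> 535
```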
Evaluation: comparison methods
most, adds it, and repeats
center network topology
Expanding the Waterloo SCS data center
[Figure: bar chart with axes labeled Bisection bandwidth and Oversubscription ratio (ticks 0.01 to 0.05) against cumulative number of servers added (0, 160, 320, 480, 640), comparing Original, Fat-tree 1Gb, GREEDY, LEGUP, and REWIRE; starting servers = 760]
Diameter: 4 4 3 4 3 4 3 4 3 4 2 4 3 4 2 4 2
Greenfield network design
[Figure: bar chart of oversubscription ratio (ticks 0.1 to 0.4) comparing Fat-tree, Random, LEGUP, and REWIRE at budgets of $125/rack, $250/rack, $500/rack, and $1000/rack]
Diameter: 4 4 0 4 4 3 4 3 4 3 4 3 4 2
Expanding a greenfield network
[Figure: bar chart with axes labeled Bisection bandwidth and Oversubscription ratio (ticks 0.05 to 0.40) against total servers in data center (1600, 2000, 2400, 2800, 3200), comparing Fat-tree 1Gb, Fat-tree 10Gb, LEGUP, and REWIRE]
Diameter: 4 4 4 3 4 4 4 3 4 4 4 3 4 4 4 3 4 4 4 2
Are unstructured topologies worth it?
Clos for same cost
(can get 2 hops between racks instead of 4)
[Mudigonda et al., NSDI 2010] to effectively use available bandwidth
REWIRE future work
Mudigonda et al., USENIX ATC 2011
algorithm numerically unstable
networks
bisection bandwidth?
Conclusions
heterogeneous networks
algorithms to design heterogeneous DCNs