SLIDE 1 On low-latency-capable topologies, and their impact on the design of intra-domain routing
Nikola Gvozdiev, Stefano Vissicchio, Brad Karp, Mark Handley University College London (UCL)
SLIDE 2
We want low latency!
SLIDE 3
We want low latency! In the datacenter
SLIDE 4 We want low latency! In the datacenter In the enterprise WAN [B4, BwE, SWAN …]
- Operator controls both WAN and sources
- … so demands are predictable
SLIDE 5 We want low latency! In the datacenter In the enterprise WAN [B4, BwE, SWAN …]
- Operator controls both WAN and sources
- … so demands are predictable
In the ISP (this talk)
- ISP operator does not control sources
SLIDE 6
How do we get low latency in a loaded ISP network?
SLIDE 7
How do we get low latency in a loaded ISP network?
The topology?
Topology must offer diverse low-latency paths…
SLIDE 8
How do we get low latency in a loaded ISP network?
The topology? The routing?
…and routing system must make good use of those low-latency paths. Topology must offer diverse low-latency paths…
SLIDE 9
How do we get low latency in a loaded ISP network?
The topology? The routing?
…and routing system must make good use of those low-latency paths. Topology must offer diverse low-latency paths…
SLIDE 10
All possible topologies connecting a set of PoPs
[Venn diagram]
SLIDE 11
Other topologies SP routing gives lowest possible latency
Shortest-path routing doesn’t yield lowest latency on all topologies
SLIDE 12
Other topologies
SP routing gives lowest possible latency
Shortest-path routing doesn’t yield lowest latency on all topologies
SLIDE 13 Other topologies
SP routing gives lowest possible latency
Shortest-path routing doesn’t yield lowest latency on all topologies
capacity: 1.5
SLIDE 14 Other topologies
SP routing gives lowest possible latency
demands: 1 each
Shortest-path routing doesn’t yield lowest latency on all topologies
capacity: 1.5
SLIDE 15 Other topologies
SP routing gives lowest possible latency
demands: 1 each
Shortest-path routing doesn’t yield lowest latency on all topologies
capacity: 1.5
SLIDE 16 Other topologies
SP routing gives lowest possible latency
Can’t achieve lower latency on a tree!
demands: 1 each
Shortest-path routing doesn’t yield lowest latency on all topologies
capacity: 1.5
SLIDE 17 Let’s improve the topology: add redundancy!
demands: 1 each
Better Topologies May Need Better Routing
capacity: 1.5
SLIDE 18 Congestion inflates latency if both aggregates don’t fit.
demands: 1 each
Better Topologies May Need Better Routing
capacity: 1.5
SLIDE 19 Congestion control makes aggregates fit, but hurts throughput
demands: 1 each
Better Topologies May Need Better Routing
capacity: 1.5
SLIDE 20 Other topologies
SP routing gives lowest possible latency
demands: 1 each
Better Topologies May Need Better Routing
capacity: 1.5
SLIDE 21
Other topologies
SP routing gives lowest possible latency
Better Topologies May Need Better Routing
SLIDE 22
Other topologies
Better Topologies May Need Better Routing
Modern TE (B4, MPLS-TE) gives lowest possible latency
SLIDE 23 Other topologies
Modern TE (B4, MPLS-TE) gives lowest possible latency
demands: 1 each
Better Topologies May Need Better Routing
capacity: 1.5
SLIDE 24 Other topologies Route as much as possible on shortest paths
demands: 1 each
Better Topologies May Need Better Routing
Modern TE (B4, MPLS-TE) gives lowest possible latency
capacity: 1.5
SLIDE 25 Other topologies Traffic is split, low-latency top path fully loaded
demands: 1 each
Better Topologies May Need Better Routing
Modern TE (B4, MPLS-TE) gives lowest possible latency
capacity: 1.5
SLIDE 26 Other topologies
- Do any topologies fall in this region?
Do Even Better Topologies Need Even Better Routing?
Modern TE (B4, MPLS-TE) gives lowest possible latency
SLIDE 27 Other topologies
- Do any topologies fall in this region?
- If so, do any of them have a greater potential to provide low latency?
Modern TE (B4, MPLS-TE) gives lowest possible latency
Do Even Better Topologies Need Even Better Routing?
SLIDE 28 Other topologies
- Do any topologies fall in this region?
- If so, do any of them have a greater potential to provide low latency?
- Does today’s routing do poorly on those topologies?
Modern TE (B4, MPLS-TE) gives lowest possible latency
Do Even Better Topologies Need Even Better Routing?
SLIDE 29
Limitations of Today’s Routing
Central European network of GTS, 2010
Proof by example. Consider this real-world ISP topology…
SLIDE 30 [Plot: CDF of the fraction of flows that encounter congestion; curve: Shortest path]
CDF over 100 runs of SP routing on synthetic traffic matrices on GTS’s topology
SLIDE 31 [Plot: CDF of the fraction of flows that encounter congestion; curve: Shortest path]
Fraction of all flows in each traffic matrix that cross at least one congested link
SLIDE 32 SP does poorly, as expected
[Plot: CDF of the fraction of flows that encounter congestion; curve: Shortest path]
Each point is a run of the routing system on a different traffic matrix
SLIDE 33 [Plot: CDF of the fraction of flows that encounter congestion; curves: Shortest path, B4]
B4 for the win … sort of
Jain, Sushant, et al. "B4: Experience with a globally-deployed software defined WAN." ACM SIGCOMM 2013
SLIDE 34
Where does greedy routing such as B4 go wrong?
You are here!
SLIDE 35
Let’s focus on a small part of the network
Where does greedy routing such as B4 go wrong?
SLIDE 36
Limitations of greedy routing
SLIDE 37
Limitations of greedy routing
1. Allocate as much as possible on shortest path
SLIDE 38
Limitations of greedy routing
V G
1. Allocate as much as possible on shortest path
Local flows between Veszprem (V) and Gyor (G)
SLIDE 39
Limitations of greedy routing
Local flows between Veszprem (V) and Gyor (G)
V G
Through flows
1. Allocate as much as possible on shortest path
SLIDE 40
Limitations of greedy routing
First link on V->G aggregate’s shortest path fills eastbound
V G
1. Allocate as much as possible on shortest path
SLIDE 41
Limitations of greedy routing
First link on V->G aggregate’s shortest path fills eastbound
V G
1. Allocate as much as possible on shortest path
2. Allocate to longer paths
SLIDE 42
Limitations of greedy routing
First link on V->G aggregate’s shortest path fills eastbound
V G
?
1. Allocate as much as possible on shortest path
2. Allocate to longer paths
SLIDE 43
Limitations of greedy routing
First link on V->G aggregate’s shortest path fills eastbound
V G
V->G’s second-best path is already full; using it results in congestion!
?
1. Allocate as much as possible on shortest path
2. Allocate to longer paths
SLIDE 44
Limitations of greedy routing
SLIDE 45
Limitations of greedy routing
Rings embedded in a topology can trigger this problem with greedy routing
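The ring pathology above can be reproduced in a few lines. This is a hedged toy sketch, not the talk’s actual algorithm or data: the 4-node ring, link capacities, delays, and aggregate names are all invented, and the allocator is a minimal stand-in for B4-style greedy placement (fill the lowest-delay path first, then spill to longer ones).

```python
# Toy reproduction of the greedy-routing pathology (hypothetical ring).
# Each directed link has a capacity and a propagation delay; aggregates
# are placed greedily, lowest-delay path first, B4-style.

CAP = {("A", "V"): 2, ("V", "G"): 2, ("G", "B"): 2, ("A", "B"): 1}
DELAY = {("A", "V"): 1, ("V", "G"): 1, ("G", "B"): 1, ("A", "B"): 5}

# Candidate paths per aggregate, pre-sorted by total propagation delay.
PATHS = {
    "S: A->B": [[("A", "V"), ("V", "G"), ("G", "B")], [("A", "B")]],
    "T: A->B": [[("A", "V"), ("V", "G"), ("G", "B")], [("A", "B")]],
    "L: V->G": [[("V", "G")]],  # direct; ignore the long way round for brevity
}
DEMAND = {"S: A->B": 1.0, "T: A->B": 1.0, "L: V->G": 1.0}

def greedy_place(order):
    """Fill each aggregate's lowest-delay path as far as capacity allows."""
    residual = dict(CAP)
    placement, unplaced = {}, {}
    for agg in order:
        left = DEMAND[agg]
        placement[agg] = []
        for path in PATHS[agg]:
            room = min(residual[l] for l in path)
            take = min(left, room)
            if take > 0:
                for l in path:
                    residual[l] -= take
                placement[agg].append((path, take))
                left -= take
        unplaced[agg] = left  # > 0 means demand that does not fit: congestion
    return placement, unplaced

# Greedy: the two through aggregates fill V->G, stranding the local one.
placement, unplaced = greedy_place(["S: A->B", "T: A->B", "L: V->G"])
print("greedy unplaced:", unplaced)

# A coordinated placement fits everyone: sending T over the direct A->B
# link costs T some delay but frees V->G for the local aggregate.
PATHS["T: A->B"] = [[("A", "B")]]
_, unplaced2 = greedy_place(["S: A->B", "T: A->B", "L: V->G"])
print("coordinated unplaced:", unplaced2)
```

Greedy leaves the local V->G aggregate entirely unplaced, while the coordinated placement accommodates all three aggregates, matching the slides’ point that greedy order on rings strands local flows.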
SLIDE 46 [Plot: CDF of the fraction of flows that encounter congestion; curves: Shortest path, B4]
B4 for the win … sort of
SLIDE 47 Can we do better?
[Plot: CDF of the fraction of flows that encounter congestion; curves: Shortest path, B4, Latency-optimal]
Yes, a placement which both avoids congestion and minimizes propagation delay does exist!
SLIDE 48 Can we do better?
[Plot: CDF of the fraction of flows that encounter congestion; curves: Shortest path, B4, Latency-optimal]
Yes, a placement which both avoids congestion and minimizes propagation delay does exist! So GTS is amenable to low latency. Are other topologies?
SLIDE 49 How might we quantify a topology’s potential for low latency under load?
- Want a metric to capture a topology’s inherent potential for low latency
- Should be:
- traffic matrix-agnostic
- routing algorithm-agnostic
SLIDE 50 How might we quantify a topology’s potential for low latency under load?
- Want a metric to capture a topology’s inherent potential for low latency
- Should be:
- traffic matrix-agnostic
- routing algorithm-agnostic
- Want to capture two things:
- topology’s potential for routing around congestion hot spots
- …without incurring long propagation delay
SLIDE 51 How might we quantify a topology’s potential for low latency under load?
- Want a metric to capture a topology’s inherent potential for low latency
- Should be:
- traffic matrix-agnostic
- routing algorithm-agnostic
- Want to capture two things:
- topology’s potential for routing around congestion hot spots
- …without incurring long propagation delay
We want a metric that rewards alternate paths with short propagation delay
SLIDE 52
Alternate Path Availability (APA)
Y Gbps
SLIDE 53
Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay Y Gbps SP capacity
SLIDE 54 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
SLIDE 55 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
SLIDE 56 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
Legend: links with viable alternative paths; links on shortest path
1/5
SLIDE 57 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
1/5
SLIDE 58 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
2/5
SLIDE 59 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
3/5
SLIDE 60 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
3/5
Path too long or not enough capacity
SLIDE 61 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
4/5
SLIDE 62 Alternate Path Availability (APA)
Y Gbps
Shortest path: T ms total propagation delay, Y Gbps SP capacity. Exclude each link on the shortest path; can we route Y Gbps over one or more alternative paths with delay < 1.4T?
4/5 = 0.8
For this PoP pair, 80% of the links on the SP have an alternate path with acceptably low latency
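To make the metric concrete, here is a hedged Python sketch of APA for a single PoP pair. One simplification versus the slides: it checks only the single lowest-delay alternate path after excluding each SP link, whereas the real metric may split the Y Gbps across several alternates. The toy triangle topology and its numbers are invented.

```python
# Simplified APA: fraction of shortest-path links that, when excluded,
# leave an alternate path with delay < 1.4 * T and capacity >= Y.
import heapq

def dijkstra(links, src, dst, excluded=frozenset()):
    """Lowest-delay path; links: {(u, v): (delay, capacity)} (directed)."""
    adj = {}
    for (u, v), (d, c) in links.items():
        if (u, v) not in excluded:
            adj.setdefault(u, []).append((v, d))
    dist, prev = {src: 0.0}, {}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(pq, (d + w, v))
    if dst not in dist:
        return None, float("inf")
    path, node = [], dst
    while node != src:
        path.append((prev[node], node))
        node = prev[node]
    return list(reversed(path)), dist[dst]

def apa(links, src, dst, stretch=1.4):
    sp, T = dijkstra(links, src, dst)
    assert sp is not None
    Y = min(links[l][1] for l in sp)          # SP capacity
    viable = 0
    for link in sp:                           # exclude each SP link in turn
        alt, alt_delay = dijkstra(links, src, dst, excluded={link})
        if alt is not None and alt_delay < stretch * T and \
           min(links[l][1] for l in alt) >= Y:
            viable += 1
    return viable / len(sp)

def both(u, v, d, c):
    return {(u, v): (d, c), (v, u): (d, c)}

# Toy triangle: SP A->B is the direct link (delay 1); the detour via C
# has delay 1.2 < 1.4 and the same capacity, so APA = 1.0.
links = {**both("A", "B", 1.0, 10), **both("A", "C", 0.6, 10),
         **both("C", "B", 0.6, 10)}
print(apa(links, "A", "B"))
```

In the triangle every SP link has a viable alternate, so APA is 1.0; in the slides’ 5-link example only 4 of 5 links do, giving 0.8.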
SLIDE 63 Low-latency path diversity (LLPD)
1. Compute APA for all PoP pairs
SLIDE 64 Low-latency path diversity (LLPD)
1. Compute APA for all PoP pairs
2. Compute LLPD = fraction of PoP pairs with “good” path availability
SLIDE 65 Low-latency path diversity (LLPD)
1. Compute APA for all PoP pairs
2. Compute LLPD = fraction of PoP pairs with “good” path availability = (number of PoP pairs with APA ≥ 0.7) / (total number of PoP pairs)
SLIDE 66 Low-latency path diversity (LLPD)
1. Compute APA for all PoP pairs
2. Compute LLPD = fraction of PoP pairs with “good” path availability = (number of PoP pairs with APA ≥ 0.7) / (total number of PoP pairs)
The 0.7 threshold is empirically derived; the metric is not sensitive to picking different values
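Given per-pair APA values, LLPD is a one-liner: the fraction of PoP pairs whose APA clears the (empirically chosen) 0.7 threshold. The APA values below are invented for illustration.

```python
# LLPD: fraction of PoP pairs with "good" alternate path availability.
def llpd(apa_by_pair, threshold=0.7):
    good = sum(1 for v in apa_by_pair.values() if v >= threshold)
    return good / len(apa_by_pair)

# Hypothetical per-pair APA values: 2 of 4 pairs clear 0.7 -> LLPD = 0.5.
apa_by_pair = {("V", "G"): 0.8, ("A", "B"): 1.0,
               ("A", "C"): 0.5, ("B", "C"): 0.6}
print(llpd(apa_by_pair))
```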
SLIDE 67 [Plot: fraction of pairs congested vs. LLPD; GTS marked; 90th percentile and median per topology]
100+ real-world ISP topologies, ranked by low-latency path diversity (LLPD)
SLIDE 68 [Plot: fraction of pairs congested vs. LLPD; GTS marked]
Generate TMs for each topology; plot the fraction of (src, dst) PoP pairs in each TM that cross at least one congested link
SLIDE 69 [Plot: fraction of pairs congested vs. LLPD; GTS marked]
Shortest path routing congests links
Two points per topology: median TM and 90th-percentile TM; line shows spread of distribution
SLIDE 70 [Plot: fraction of pairs congested vs. LLPD; GTS marked]
Networks with high LLPD offer lots of alternative paths → shortest path routing experiences congestion
Shortest path routing congests links
SLIDE 71 [Plot: fraction of pairs congested vs. LLPD; GTS marked]
Networks with high LLPD offer lots of alternative paths → shortest path routing experiences congestion
Shortest path routing congests links
No surprises here. What about B4?
SLIDE 72 [Plot: fraction of pairs congested vs. LLPD; GTS marked]
B4 congests networks with high potential for low latency
SLIDE 73 [Plots: fraction of pairs congested vs. LLPD and latency stretch vs. LLPD; GTS marked]
latency stretch = (total prop delay of all flows) / (total prop delay if all flows routed on SP)
B4 congests networks with high potential for low latency
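The latency-stretch metric above can be sketched directly. This is a hedged illustration: the flow tuples below are invented numbers, not the talk’s measurements, and a real system would weight by traffic volume exactly as its TM defines.

```python
# Latency stretch: total propagation delay actually experienced by all
# flows, divided by the total they would see if every flow were routed
# on its shortest path. Stretch 1.0 means no detours at all.
def latency_stretch(flows):
    """flows: list of (actual_delay_ms, sp_delay_ms, volume)."""
    actual = sum(d * v for d, _, v in flows)
    on_sp = sum(s * v for _, s, v in flows)
    return actual / on_sp

# Hypothetical: one flow on its SP, one detoured (18 ms instead of 12 ms).
flows = [(10.0, 10.0, 1.0), (18.0, 12.0, 0.5)]
print(latency_stretch(flows))  # (10 + 9) / (10 + 6) = 1.1875
```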
SLIDE 74 [Plots: congestion panel and propagation-delay panel vs. LLPD; “better” = lower in both]
B4 congests networks with high potential for low latency
SLIDE 75 [Plots: fraction of pairs congested and latency stretch vs. LLPD]
- Some flows routed on non-shortest paths
B4 congests networks with high potential for low latency
SLIDE 76 [Plots: fraction of pairs congested and latency stretch vs. LLPD]
- Some flows routed on non-shortest paths
- Still incurs congestion, and precisely on high-LLPD networks!
B4 congests networks with high potential for low latency
SLIDE 77 [Plots: fraction of pairs congested and latency stretch vs. LLPD]
- Some flows routed on non-shortest paths
- Still incurs congestion, and precisely on high-LLPD networks!
B4 congests networks with high potential for low latency
Need a different routing scheme. How about one that prioritizes avoiding congestion above all else?
SLIDE 78
Minimizing utilization avoids congestion
SLIDE 79 Minimizing utilization avoids congestion
33%
- Spread traffic out to leave spare capacity in case traffic levels increase (called MinMax)
- Takes no account of propagation delay
SLIDE 80 Minimizing utilization avoids congestion
33%
- Spread traffic out to leave spare capacity in case traffic levels increase (called MinMax)
- Takes no account of propagation delay
How does MinMax do?
SLIDE 81 MinMax inflates propagation delay
[Plots: fraction of pairs congested and latency stretch vs. LLPD; 90th percentile and median]
Designed to avoid congestion
SLIDE 82 [Plots: fraction of pairs congested and latency stretch vs. LLPD]
Designed to avoid congestion, but uses paths with high propagation delay
MinMax inflates propagation delay
SLIDE 83 [Plots: fraction of pairs congested and latency stretch vs. LLPD]
Designed to avoid congestion, but uses paths with high propagation delay
One extreme of the design space
MinMax inflates propagation delay
SLIDE 84 Latency-optimal placement
[Plots: fraction of pairs congested and latency stretch vs. LLPD; GTS marked]
- Maximizes utilization of links on low-delay paths, and avoids congestion
SLIDE 85 Latency-optimal placement
[Plots: fraction of pairs congested and latency stretch vs. LLPD; GTS marked]
- Maximizes utilization of links on low-delay paths, and avoids congestion
Assume it is possible to compute this at scale; more about that later…
SLIDE 86 [Plots: fraction of pairs congested and latency stretch vs. LLPD, for each scheme]
Two extremes of congestion-free routing
Minimize utilization (MinMax) vs. minimize propagation delay and avoid congestion
SLIDE 87 [Plots: fraction of pairs congested and latency stretch vs. LLPD, for each scheme]
Two extremes of congestion-free routing
Minimize utilization (MinMax) vs. minimize propagation delay and avoid congestion
SLIDE 88 [Plots: fraction of pairs congested and latency stretch vs. LLPD, for each scheme; GTS highlighted]
Two extremes of congestion-free routing
Minimize utilization (MinMax) vs. minimize propagation delay and avoid congestion
SLIDE 89 [Plots: fraction of pairs congested and latency stretch vs. LLPD, for each scheme; GTS highlighted]
Two extremes of congestion-free routing
Minimize utilization (MinMax) vs. minimize propagation delay and avoid congestion
GTS sees 5x difference in propagation delay!
SLIDE 90 [Plots: fraction of pairs congested and latency stretch vs. LLPD, for each scheme; GTS highlighted]
Two extremes of congestion-free routing
Minimize utilization (MinMax) vs. minimize propagation delay and avoid congestion
GTS sees 5x difference in propagation delay!
Let’s focus on a single traffic matrix
SLIDE 91 [Plot: CDF of link utilization; curves: Latency-optimal (mean 0.32), MinMax (mean 0.30)]
Two extremes of congestion-free routing
Significant delta in prop delay, but mean utilization roughly the same
SLIDE 92 [Plot: CDF of link utilization; curves: Latency-optimal (mean 0.32), MinMax (mean 0.30)]
Links on short prop-delay paths “in demand” in latency-optimal placement
Two extremes of congestion-free routing
SLIDE 93 [Plot: CDF of link utilization, zoomed to the high-utilization tail; curves: Latency-optimal (mean 0.32), MinMax (mean 0.30)]
Two extremes of congestion-free routing
SLIDE 94 [Plot: CDF of link utilization, zoomed to the high-utilization tail]
Two extremes of congestion-free routing
All possible congestion-free routing solutions lie in this range
SLIDE 95 [Plot: CDF of link utilization, zoomed to the high-utilization tail]
Two extremes of congestion-free routing
100% utilization? On an ISP?
SLIDE 96 [Plot: traffic rate (Gbps) vs. time (s) over one minute]
A minute from a core link
Source: CAIDA
SLIDE 97 [Plot: traffic rate (Gbps) vs. time (s) over one minute]
Mean rate: we could run the traffic through a path with this capacity, but at the cost of long queuing delay.
A minute from a core link
SLIDE 98 [Plot: traffic rate (Gbps) vs. time (s) over one minute]
Need to allocate headroom to allow for variability
A minute from a core link
SLIDE 99 [Plot: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30)]
The headroom dial
Not feasible because of variability
SLIDE 100 [Plot: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30)]
The headroom dial
MinMax is one extreme of the headroom dial
SLIDE 101 [Plots: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30); headroom annotations: 7% and 27%]
The headroom dial
SLIDE 102 [Plots: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30); headroom annotations: 7% and 22%]
The headroom dial
SLIDE 103 [Plots: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30); headroom annotations: 7% and 15%]
The headroom dial
SLIDE 104 [Plots: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30); headroom annotations: 7% and 7%]
The headroom dial
SLIDE 105 [Plot: CDF of link utilization; Latency-optimal (mean 0.32) vs. MinMax (mean 0.30); headroom annotation: 2%]
The headroom dial
Need to allow the minimal amount of headroom to cope with variability
SLIDE 106
Towards a low-latency routing system
SLIDE 107 Towards a low-latency routing system
Compute latency-optimal routing solution, sans headroom
- expressed the problem as one big linear program (largely straightforward)
- efficient iterative solution: add paths, solve, repeat …
- 400+ nodes, less than one second (vs. tens of minutes…)
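For intuition, here is a hedged stand-in for the LP: rather than a real solver, it brute-forces the latency-optimal split for the earlier two-aggregate toy example (demands of 1 each, a shared short path of capacity 1.5, and a longer path with ample capacity). The real system solves one big LP with iterative path generation; this sketch only illustrates the objective (minimize total propagation delay) and the capacity constraint.

```python
# Brute-force stand-in for the latency-optimal LP on the slides' toy
# example: two unit aggregates share a short path (delay 1, capacity 1.5)
# with a longer alternative (delay 3, ample capacity).
SHORT_DELAY, LONG_DELAY, SHORT_CAP = 1.0, 3.0, 1.5

def total_delay(xa, xb):
    """xa, xb: fraction of each unit aggregate placed on the short path."""
    per = lambda x: x * SHORT_DELAY + (1 - x) * LONG_DELAY
    return per(xa) + per(xb)

best = None
steps = 101
for i in range(steps):
    for j in range(steps):
        xa, xb = i / (steps - 1), j / (steps - 1)
        if xa + xb <= SHORT_CAP:            # capacity constraint
            d = total_delay(xa, xb)
            if best is None or d < best[0]:
                best = (d, xa, xb)

# The optimum fills the short path exactly: xa + xb == 1.5, total delay 3.0.
print(best)
```

Any split with xa + xb = 1.5 is optimal (the LP has a face of optima); a real LP solver would find one such vertex directly, and the talk’s iterative variant adds candidate paths only as needed.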
SLIDE 108 Towards a low-latency routing system
Compute latency-optimal routing solution, sans headroom
- expressed the problem as one big linear program (largely straightforward)
- efficient iterative solution: add paths, solve, repeat …
- 400+ nodes, less than one second (vs. tens of minutes…)
Tune headroom dial to drive routing as close as possible to optimal solution while avoiding congestion
- predict how aggregates will statistically multiplex on a path by convolving their past demands
SLIDE 109 Towards a low-latency routing system
Compute latency-optimal routing solution, sans headroom
- expressed the problem as one big linear program (largely straightforward)
- efficient iterative solution: add paths, solve, repeat …
- 400+ nodes, less than one second (vs. tens of minutes…)
Tune headroom dial to drive routing as close as possible to optimal solution while avoiding congestion
- predict how aggregates will statistically multiplex on a path by convolving their past demands
More details in the paper!
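The convolution step above can be sketched as follows. This is a hedged illustration of the idea only: each aggregate’s short-term demand is modeled as a discrete (1 Gbps-binned) distribution, the distributions of aggregates sharing a path are convolved to get the combined demand, and a high percentile of the result sizes the headroom. The distributions and the 99th-percentile choice are invented, not the paper’s exact procedure.

```python
# Predict statistical multiplexing of two aggregates on a shared path by
# convolving their demand distributions (assumed independent).
def convolve(p, q):
    """PMF of the sum of two independent integer-binned demands."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def percentile_bin(pmf, frac):
    """Smallest bin whose CDF reaches `frac`."""
    acc = 0.0
    for k, pk in enumerate(pmf):
        acc += pk
        if acc >= frac:
            return k
    return len(pmf) - 1

# Two aggregates, demand in Gbps bins: mostly 2 Gbps, occasionally 3.
a = [0.0, 0.0, 0.8, 0.2]   # P(demand = 0, 1, 2, 3 Gbps)
b = [0.0, 0.0, 0.8, 0.2]
total = convolve(a, b)
mean = sum(k * pk for k, pk in enumerate(total))
p99 = percentile_bin(total, 0.99)
print(mean, p99)   # mean ~4.4 Gbps; 99th-percentile bin = 6 Gbps
# Headroom to provision on the path ~ p99 - mean.
```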
SLIDE 110 Are high-LLPD networks viable?
- A routing system may not be able to unlock the low-latency potential of a topology
- LLPD indicates that a topology has good potential for low latency
SLIDE 111 Are high-LLPD networks viable?
- A routing system may not be able to unlock the low-latency potential of a topology
- LLPD indicates that a topology has good potential for low latency
- But will anyone ever really build a modern WAN with high LLPD?
SLIDE 112 Are high-LLPD networks viable?
[Plot: fraction of pairs congested vs. LLPD; Google’s network marked; 90th percentile and median]
- Repeated SP experiment, but added Google’s network
SLIDE 113 Are high-LLPD networks viable?
[Plot: fraction of pairs congested vs. LLPD; Google’s network marked]
- Repeated SP experiment, but added Google’s network
Modern high-LLPD network!
SLIDE 114 Are high-LLPD networks viable?
[Plot: fraction of pairs congested vs. LLPD; Google’s network marked]
- Repeated SP experiment, but added Google’s network
Modern high-LLPD network! B4, however, does great on that network! Could it be because the routing and the topology co-evolved?
SLIDE 115
What topologies would people build if they knew the routing system would always extract the best from it?
SLIDE 116 Conclusions
- To achieve low latency:
- topology must provide low-latency paths
- the routing system must use them effectively
- State-of-the-art routing falters on high-LLPD topologies: precisely those with the best potential for low latency
- Practical routing approach for high-LLPD topologies:
- Efficient LP solution for optimal traffic placement
- Tune headroom dial to avoid congestion (but as little toward MinMax as possible)
SLIDE 117 System Design
- Simple, centralized design
SLIDE 118 System Design
- Simple, centralized design
But measure what?
SLIDE 119 Measurements
- Only need measurements per aggregate, not per flow!
- Need to know enough to figure out both long- and short-term variability for each aggregate
- Sampling traffic level 10 times per second is enough to capture short-term variability due to TCP’s congestion control…
- …since RTTs in the ISP are long (order of 100 ms)
- Sampling 10 times per second is well within reach of recent hardware [DevoFlow SIGCOMM 2011]
SLIDE 120 What about prioritization?
- If you can, you should definitely prioritize delay-sensitive traffic
- but identifying this traffic in the ISP setting may not be trivial, since no single operator controls all sources
- also, what about bandwidth-hungry low-latency traffic (e.g., VR)?