Peering Planning Cooperation without Revealing Confidential Information



SLIDE 1

APRICOT 2006

Failover Matrices – Cariden

Peering Planning Cooperation without Revealing Confidential Information

Arman Maghbouleh
am at cariden dot com

www.cariden.com

(c) cariden technologies

Apricot 2006 Perth, Australia

SLIDE 2

The Issue

  • Multi-homed neighbor, 2 or more links > 50%
  • Example
    – 1000 Mbps connections to Peer X in 3 locations
    – SJC-to-Peer = 600 Mbps, NYC = 100, WDC = 600
    – SJC-to-Peer link fails
  • Are we in trouble?

[Figure: two copies of a network map (SEA, CHI, BOS, NYC, WDC, MIA, ATL, HST, LAX, SJC, KCY, Peer X) with the failed link marked X. One outcome is labeled 700 and 600; the other is labeled 100 Mbps and 1300 (congested).]

SLIDE 3

Capacity Planning Utopia

  • Uniform capacity links
  • Diverse connections (unlikely double failures at Layer 3)
  • Upgrade at 50% (planning objective is to be resilient to single failures)

SLIDE 4

Capacity Planning Reality

  • Range of capacities
  • Multiple Layer 3 failures
  • Upgrade impediments (money, cable plant, ...)

SLIDE 5

IGP Different from BGP

  • Failure behavior is predictable
  • Established process for within-AS planning
    – Gather data
      • Topology (OSPF, IS-IS, ...)
      • Traffic matrix [1]
      • Estimate growth
    – Simulate for failures
    – Perform traffic engineering (optional) [2]
    – Upgrade as necessary
  • Commercial and free tools

[1] APRICOT 2005 tutorial: Best Practices for Determining the Traffic Matrix in IP Networks
[2] APRICOT 2004 tutorial: Traffic Engineering Beyond MPLS
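
The within-AS process listed on this slide (topology plus traffic matrix, then simulate failures) can be sketched roughly as follows. This is only an illustration, not the method of any particular tool: the four-node topology, IGP costs, capacities, and demands are invented, and routing is a single shortest path with no ECMP.

```python
import heapq

# Hypothetical topology: (node_a, node_b) -> (igp_cost, capacity_mbps); links are bidirectional.
LINKS = {
    ("SJC", "CHI"): (10, 1000),
    ("CHI", "NYC"): (10, 1000),
    ("SJC", "ATL"): (15, 1000),
    ("ATL", "NYC"): (15, 1000),
}
# Hypothetical traffic matrix: (source, destination) -> Mbps.
DEMANDS = {("SJC", "NYC"): 600, ("ATL", "NYC"): 500}


def shortest_path(links, src, dst):
    """Dijkstra over undirected links; returns the list of links used, or None."""
    adj = {}
    for (a, b), (cost, _capacity) in links.items():
        adj.setdefault(a, []).append((b, cost, (a, b)))
        adj.setdefault(b, []).append((a, cost, (a, b)))
    best = {src: 0}
    heap = [(0, src, [])]
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == dst:
            return path
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost, link in adj.get(node, []):
            new_dist = dist + cost
            if new_dist < best.get(nxt, float("inf")):
                best[nxt] = new_dist
                heapq.heappush(heap, (new_dist, nxt, path + [link]))
    return None


def link_loads(links, demands):
    """Route every demand on its shortest path and sum the load per link."""
    loads = {link: 0 for link in links}
    for (src, dst), mbps in demands.items():
        for link in shortest_path(links, src, dst) or []:
            loads[link] += mbps
    return loads


def worst_case_utilization(links, demands):
    """Per link: highest utilization over the no-failure case and every single-link failure."""
    worst = {}
    for failed in [None] + list(links):
        surviving = {l: v for l, v in links.items() if l != failed}
        for link, load in link_loads(surviving, demands).items():
            worst[link] = max(worst.get(link, 0.0), load / links[link][1])
    return worst
```

Any link whose worst-case utilization exceeds 1.0 is a candidate for an upgrade or for traffic engineering.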

SLIDE 6

The Trouble with BGP

  • Planning practices not well established
  • BGP decision process complicated
  • Amount of data can be large
  • Failure behavior often depends on someone else’s network! (subject of this talk)
    – e.g., incoming traffic from a peer

SLIDE 7

BGP Path Decision Algorithm [1]

1. Reachable next hop
2. Highest Weight
3. Highest Local Preference
4. Locally originated routes
5. Shortest AS-path length
6. Origin: IGP > EGP > Incomplete
7. Lowest MED (slide callout: Respect MEDs)
8. EBGP > IBGP
9. Lowest IGP cost to next hop (slide callout: Shortest Exit Routing)
10. Shortest route reflection cluster list
11. Lowest BGP router ID
12. Lowest peer remote address

[1] Junos algorithm shown here. Cisco IOS uses a slightly different algorithm.
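
One way to read the list on this slide is as a lexicographic comparison: filter on step 1, then compare the remaining attributes in order. The sketch below follows the slide's ordering only. The `Route` fields and `exit_name` are hypothetical names, and it compares MEDs across all routes, which is a simplification of what real routers do.

```python
from dataclasses import dataclass, replace

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    # Hypothetical, simplified route record; addresses and IDs are plain integers here.
    next_hop_reachable: bool
    weight: int
    local_pref: int
    locally_originated: bool
    as_path_len: int
    origin: str
    med: int
    is_ebgp: bool
    igp_cost: int
    cluster_list_len: int
    router_id: int
    peer_address: int
    exit_name: str = ""


def preference_key(r):
    """Smaller tuple wins; each element mirrors steps 2 to 12 of the slide."""
    return (
        -r.weight,                 # 2. highest weight
        -r.local_pref,             # 3. highest local preference
        not r.locally_originated,  # 4. locally originated routes first
        r.as_path_len,             # 5. shortest AS path
        ORIGIN_RANK[r.origin],     # 6. IGP > EGP > Incomplete
        r.med,                     # 7. lowest MED
        not r.is_ebgp,             # 8. EBGP over IBGP
        r.igp_cost,                # 9. lowest IGP cost to next hop
        r.cluster_list_len,        # 10. shortest cluster list
        r.router_id,               # 11. lowest router ID
        r.peer_address,            # 12. lowest peer address
    )


def best_route(routes):
    candidates = [r for r in routes if r.next_hop_reachable]  # step 1
    return min(candidates, key=preference_key) if candidates else None


# Two otherwise identical routes to the same prefix, learned at two exits.
common = dict(next_hop_reachable=True, weight=0, local_pref=100,
              locally_originated=False, as_path_len=2, origin="igp", med=0,
              is_ebgp=True, cluster_list_len=0, router_id=1, peer_address=1)
via_sjc = Route(**common, igp_cost=5, exit_name="SJC")
via_nyc = Route(**common, igp_cost=50, exit_name="NYC")

shortest_exit_choice = best_route([via_sjc, via_nyc])                 # decided at step 9
respect_med_choice = best_route([replace(via_sjc, med=10), via_nyc])  # decided at step 7
```

With equal MEDs the nearer exit (SJC) wins on IGP cost. Once the neighbor sends a higher MED for SJC, the route via NYC wins before IGP cost is even considered.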

SLIDE 8

Common Routing Policies

  • Shortest Exit
    – Often used for sending to peers
    – Get packet out of network as soon as possible
    – Local Prefs used to determine which neighbor, IGP costs used to determine which exit
  • Respect MEDs
    – Often used for customers who buy transit
    – Deliver packets closest to destination
    – Neighbor forwards IGP costs as MEDs (multi-exit discriminators)

SLIDE 9

Blind Spots

  • Cannot predict behavior when routing depends on the other network (see 3 cases below).

Relationship to Remote AS   Routing To Remote AS              Routing From Remote AS
Peer                        Shortest Exit in known network    Shortest Exit in unknown network
Customer                    Respect MEDs from unknown         Shortest Exit in unknown network
Transit Provider            Shortest Exit in known network    Respect our MEDs

SLIDE 10

Failover Matrices

  • Solution to peering planning blind spots
  • Procedure
    – Gather data
      • Topology, Traffic, Routing Configurations
    – Simulate knowable effects
      • Generate Failover Matrices
    – Share Failover Matrices for unknowables
      • e.g., peer gives failover matrix for traffic it delivers, we provide peer failover matrix for traffic we deliver
  • Both sides benefit from cooperating
  • AS-internal information is kept confidential

SLIDE 11

Failover Matrix Example

Node: Interface    Traffic: no failure   % Traffic: fail_SJC   % Traffic: fail_nyc   % Traffic: fail_wdc
ar1.sjc:Gig3/2     600                   -                     10% (610)             1% (606)
ar1.nyc:ge-2/1     100                   48% (388)             -                     95% (670)
ar2.wdc:ge-2/2     600                   52% (912)             70% (670)             -

Note: 388 Mbps = 100 Mbps + (0.48 * 600 Mbps), 912 = 600 + (0.52 * 600), ...
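
The arithmetic in the note can be applied to every failure column. The sketch below uses the interface names and numbers from this slide; the dictionary layout and function name are illustrative assumptions. Where a column's percentages do not add up to 100, the remainder is reported as "unaccounted". Reading that remainder as traffic leaving via other paths is an interpretation, not something the slide states.

```python
# Baseline traffic per peering interface (Mbps), from the "no failure" values on the slide.
BASELINE = {"ar1.sjc:Gig3/2": 600, "ar1.nyc:ge-2/1": 100, "ar2.wdc:ge-2/2": 600}

# Failover matrix: failed interface -> {surviving interface: percent of the
# failed interface's traffic that it picks up}.
FAILOVER_PCT = {
    "ar1.sjc:Gig3/2": {"ar1.nyc:ge-2/1": 48, "ar2.wdc:ge-2/2": 52},
    "ar1.nyc:ge-2/1": {"ar1.sjc:Gig3/2": 10, "ar2.wdc:ge-2/2": 70},
    "ar2.wdc:ge-2/2": {"ar1.sjc:Gig3/2": 1, "ar1.nyc:ge-2/1": 95},
}


def loads_after_failure(baseline, failover_pct, failed):
    """Return (post-failure load per surviving interface, Mbps not accounted for)."""
    shifted = baseline[failed]
    row = failover_pct[failed]
    loads = {iface: baseline[iface] + pct * shifted / 100 for iface, pct in row.items()}
    unaccounted = (100 - sum(row.values())) * shifted / 100
    return loads, unaccounted


for failed_iface in BASELINE:
    loads, unaccounted = loads_after_failure(BASELINE, FAILOVER_PCT, failed_iface)
    print(failed_iface, loads, "unaccounted:", unaccounted)
```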

SLIDE 12

Failover Example (from real network)

[Figure: four panels, Peer Circuit 1 through Peer Circuit 4, each showing traffic levels at five-minute intervals]

  • Circuit 2 fails. Traffic shifts to circuit 4.

SLIDE 13

Failover Example (from real network)

[Figure: four panels, Peer Circuit 1 through Peer Circuit 4, each showing traffic levels at five-minute intervals]

  • Circuit 1 fails. Some traffic shifts to 2 & 4
  • Some “leaks” to other ASes

SLIDE 14

Questions

  • How do I calculate a failover matrix?
  • How do I use a failover matrix from a peer?
  • What if my peer does not cooperate?
  • What if a substantial amount of traffic “leaks” to another AS?

SLIDE 15

Calculating Failover Matrices

  • Accurate and Detailed [1,2]
    – Per-prefix routing and traffic statistics
    – Full BGP simulation
  • Simple and Scalable [3]
    – Traffic matrix based on ingress-egress pairs
      • e.g., Peer1.LAX-AR1.CHI (measure and/or estimate) instead of 192.12.3.0/24-208.43.0.0/16
    – Limited simulation model
      • Shortest Path, Respect MEDs
      • “Our” AS plus immediate neighbors

[1] “Modeling the routing of an Autonomous System with C-BGP,” B. Quoitin and S. Uhlig, IEEE Network, Vol 19(6), November 2005.
[2] “Network-wide BGP route prediction for traffic engineering,” N. Feamster and J. Rexford, in Proc. Workshop on Scalability and Traffic Control in IP Networks, SPIE ITCOM Conference, August 2002.
[3] Cariden MATE, available at http://www.cariden.com.
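
To make the "ingress-egress pairs instead of prefixes" idea concrete, here is a small sketch of collapsing per-prefix flow records into an ingress-egress matrix. Only the first prefix pair and the names Peer1.LAX and AR1.CHI come from the slide. The other records, the lookup tables, and AR1.NYC are made up for illustration.

```python
from collections import defaultdict

# Hypothetical per-prefix flow records: (source prefix, destination prefix, Mbps).
FLOWS = [
    ("192.12.3.0/24", "208.43.0.0/16", 40),
    ("192.12.4.0/24", "208.43.0.0/16", 25),
    ("192.12.3.0/24", "208.44.0.0/16", 10),
]
# Hypothetical lookups: where each source prefix enters and each destination prefix leaves.
INGRESS_OF = {"192.12.3.0/24": "Peer1.LAX", "192.12.4.0/24": "Peer1.LAX"}
EGRESS_OF = {"208.43.0.0/16": "AR1.CHI", "208.44.0.0/16": "AR1.NYC"}


def ingress_egress_matrix(flows, ingress_of, egress_of):
    """Sum per-prefix traffic into (ingress, egress) cells."""
    matrix = defaultdict(float)
    for src_prefix, dst_prefix, mbps in flows:
        matrix[(ingress_of[src_prefix], egress_of[dst_prefix])] += mbps
    return dict(matrix)


matrix = ingress_egress_matrix(FLOWS, INGRESS_OF, EGRESS_OF)
```

Three prefix-level records collapse into two cells here. On a real network the reduction is far larger, which is what makes the simple approach scale.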

SLIDE 16

Using Failover Matrix from Peers

  • Peer calculates failover matrix
  • Peer exports failover matrix using IP addresses of peering links
  • We import failover matrix
  • We include in a representative model of peer network
  • Use Failover Matrix in simulation
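
The import step might look roughly like this, assuming the peer's export is keyed by peering-link IP addresses as the slide describes. The exchange layout, the addresses (taken from the 192.0.2.0/24 documentation range), and the function name are all assumptions. The slides do not define a file format.

```python
# Hypothetical mapping from peering-link IP address to our own interface name.
LOCAL_IFACE_BY_IP = {
    "192.0.2.1": "ar1.sjc:Gig3/2",
    "192.0.2.5": "ar1.nyc:ge-2/1",
    "192.0.2.9": "ar2.wdc:ge-2/2",
}

# Hypothetical export from the peer: failed link IP -> {surviving link IP: percent shifted}.
PEER_EXPORT = {
    "192.0.2.1": {"192.0.2.5": 48, "192.0.2.9": 52},
}


def import_failover_matrix(peer_export, iface_by_ip):
    """Re-key the peer's matrix from link IP addresses to our interface names."""
    imported = {}
    for failed_ip, row in peer_export.items():
        imported[iface_by_ip[failed_ip]] = {
            iface_by_ip[ip]: pct for ip, pct in row.items()
        }
    return imported


imported = import_failover_matrix(PEER_EXPORT, LOCAL_IFACE_BY_IP)
```

Keying on the link addresses means neither side has to reveal internal router or interface names.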

SLIDE 17

Estimate if Peer Does Not Cooperate

Group own sources based on exit location (4 groups here)

Quantify shift (to 3 groups) after failure

Assume similar for other side

  • Valid if topology and traffic distributions are similar

[Figure: two copies of a list of source sites (bru1, wsh1, wdc2, tpe3, syd1, ..., kul1, sjc2), grouped by exit location before and after the failure]
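
A rough sketch of the estimate: group our own sources by the exit they use towards the peer, fail one exit, and measure where that group's traffic goes. The site names are borrowed from the figure, but the traffic volumes and exit assignments are invented. The example uses three exits rather than the four on the slide.

```python
from collections import defaultdict

# Hypothetical data from our own network:
# source site -> (Mbps towards the peer, exit before failure, exit after failure).
SOURCES = {
    "sea1": (100, "SJC", "NYC"),
    "lax1": (300, "SJC", "WDC"),
    "chi1": (150, "NYC", "NYC"),
    "atl1": (200, "WDC", "WDC"),
}


def estimate_failover_row(sources, failed_exit):
    """Percent of the failed exit's traffic that lands on each surviving exit."""
    moved = defaultdict(float)
    total = 0.0
    for mbps, before, after in sources.values():
        if before == failed_exit:
            total += mbps
            moved[after] += mbps
    if total == 0:
        return {}
    return {exit_name: 100 * mbps / total for exit_name, mbps in moved.items()}


estimated_row = estimate_failover_row(SOURCES, "SJC")
```

The resulting percentages stand in for the peer's failover matrix row. As the slide notes, this only holds if the peer's topology and traffic distribution resemble ours.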

SLIDE 18

Leaks to Other ASes

  • Simple option
    – Leaks between peers relatively small
      • Ignore
    – Shifts between transit providers can be large
      • Equal AS-path length to most destinations: assume complete shift (easy to model)
  • Accurate option
    – Extend model to more than one AS away
    – Add columns in traffic matrix to designate extra traffic in case of other network failures

[Figure: diagram labeled INTERNET, A, B, with a failure marked x]

SLIDE 19

Work in Progress

  • Evaluating goodness of models
    – Compare actual failures to models
  • Evaluating goodness of failover estimates
    – Work with both sides of a peering arrangement, compare failover estimates to simulations
    – Compare estimated failover matrices to actual failures
  • Streamlining sharing of information
  • Contact me to participate in the above

SLIDE 20

Summary

  • Peering/transit links are some of the most expensive and difficult to provision links
  • We can improve capacity planning on such links by modeling the network
  • BGP modeling can be much more complex than IGP modeling
    – Some required information is not even available
  • Failover Matrices provide a simple way to share information without giving away details
  • Failover Matrices can be estimated using one’s own network details

SLIDE 21

Acknowledgments

  • Jon Aufderheide (Global Crossing)
  • Clarence Filsfils (Cisco)