Chiara Orsini, Alistair King, Danilo Giordano, Vasileios Giotsas, Alberto Dainotti alistair@caida.org CAIDA, UC San Diego
BGPStream: a framework for historical analysis and real-time monitoring of BGP data
2
BGP data analysis for the masses
/bgpstream.caida.org
3
Why BGPStream?
wget http://archive.org/xyz/abc/file.mrt
bgpdump -m file.mrt | my_parser.py
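To make the cost of this manual workflow concrete, here is a minimal sketch of what a hand-written my_parser.py might contain. It assumes the pipe-delimited one-line format printed by bgpdump -m (record type, timestamp, A/W/B flag, peer IP, peer ASN, prefix, then the AS path for non-withdrawals). The function name parse_line and the sample addresses are illustrative and do not come from the slides.

```python
def parse_line(line):
    """Parse one line of `bgpdump -m` output into a dict, or None if too short."""
    fields = line.rstrip("\n").split("|")
    if len(fields) < 6:
        return None
    entry = {
        "timestamp": int(fields[1]),
        "kind": fields[2],       # A = announcement, W = withdrawal, B = RIB entry
        "peer_ip": fields[3],
        "peer_asn": fields[4],
        "prefix": fields[5],
        "as_path": [],
    }
    # Withdrawals carry no AS path; other lines have it in the seventh field.
    if entry["kind"] != "W" and len(fields) > 6:
        entry["as_path"] = fields[6].split()
    return entry


# Illustrative input lines using documentation address space and private ASNs.
announcement = parse_line(
    "BGP4MP|1438415400|A|192.0.2.1|64500|198.51.100.0/24|64500 64501 64502|IGP"
)
withdrawal = parse_line("BGP4MP|1438415401|W|192.0.2.1|64500|198.51.100.0/24")
```

Every user of the raw archives ends up maintaining some variant of this, plus the download, ordering, and error handling around it.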
7
An overview
Public HTTP Data Archives (…), indexed by a metadata crawler
Metadata Broker, which answers metadata queries from the User Libraries
User Libraries, which fetch MRT data (via HTTP)
libBGPStream, Python API, User Code
14
Stacked view
18
libBGPStream
into a single stream
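The surviving slide text says only that records end up "in a single stream". Assuming each input (for example, one per collector) is already sorted by time, a k-way merge produces one time-sorted stream. The sketch below shows that idea with the Python standard library; the sample records and the name merge_streams are illustrative, and this is not a description of how libBGPStream is implemented internally.

```python
import heapq

# Hypothetical per-collector streams, each already sorted by timestamp.
# Each record is (timestamp, collector, dump type).
rrc00 = [(100, "rrc00", "updates"), (160, "rrc00", "updates"), (220, "rrc00", "ribs")]
route_views2 = [(90, "route-views2", "ribs"), (150, "route-views2", "updates")]


def merge_streams(*streams):
    """Lazily interleave time-sorted streams into one time-sorted stream."""
    return heapq.merge(*streams, key=lambda record: record[0])


merged = list(merge_streams(rrc00, route_views2))
```

Because heapq.merge is lazy, the same approach works when each input is a generator reading a dump file, so only one record per input is held in memory at a time.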
19
Extracting information from MRT
peers/prefixes
in a RIB dump
BGPStream Record
Field         Type    Function
project       string  project name (e.g., Route Views)
collector     string  collector name (e.g., rrc00)
type          enum    RIB or Updates
dump time     long    time the containing dump was begun
position      enum    first, middle, or last record of a dump
time          long    timestamp of the MRT record
status        enum    record validity flag
MRT record    struct  de-serialized MRT record

BGPStream Elem
Table 1: BGPStream elem fields.
Field         Type    Function
type          enum    route from a RIB dump, announcement, withdrawal, or state message
time          long    timestamp of MRT record
peer address  struct  IP address of the VP
peer ASN      long    AS number of the VP
prefix*       struct  IP prefix
next hop*     struct  IP address of the next hop
AS path*      struct  AS path
community*    struct  community attribute
old state*    enum    FSM state (before the change)
new state*    enum    FSM state (after the change)
* denotes a field conditionally populated based on
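The two tables can be read as a pair of nested data types: a record describes where and when a piece of MRT data came from, and it yields elems that describe individual routes or state changes. The sketch below mirrors the table fields as Python dataclasses. The class names, the single-letter enum values, and the elems list are assumptions made for illustration; they are not the library's actual definitions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class ElemType(Enum):
    RIB = "R"            # route from a RIB dump
    ANNOUNCEMENT = "A"
    WITHDRAWAL = "W"
    STATE = "S"          # peer state message


@dataclass
class Elem:
    type: ElemType
    time: int
    peer_address: str
    peer_asn: int
    # Fields marked * in the table: populated only for some elem types.
    prefix: Optional[str] = None
    next_hop: Optional[str] = None
    as_path: Optional[List[int]] = None
    community: Optional[List[str]] = None
    old_state: Optional[str] = None
    new_state: Optional[str] = None


@dataclass
class Record:
    project: str
    collector: str
    type: str            # "ribs" or "updates"
    dump_time: int
    position: str        # "first", "middle", or "last"
    time: int
    status: str
    elems: List[Elem] = field(default_factory=list)


# A withdrawal names a prefix but carries no path attributes.
withdrawal = Elem(ElemType.WITHDRAWAL, 1438415400, "192.0.2.1", 64500,
                  prefix="198.51.100.0/24")
record = Record("ris", "rrc00", "updates", 1438415400, "first",
                1438415400, "valid", [withdrawal])
```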
20
Specifying a stream
24
Consuming the stream
27
Studying AS path inflation using PyBGPStream
from _pybgpstream import BGPStream, BGPRecord, BGPElem
from collections import defaultdict
from itertools import groupby
import networkx as nx

stream = BGPStream()
as_graph = nx.Graph()
rec = BGPRecord()
# bgp_lens[monitor][origin] = shortest AS path length observed in BGP
bgp_lens = defaultdict(lambda: defaultdict(lambda: None))
stream.add_filter('record-type', 'ribs')
stream.add_interval_filter(1438415400, 1438416600)
stream.start()

while stream.get_next_record(rec):
    elem = rec.get_next_elem()
    while elem:
        monitor = str(elem.peer_asn)
        # Collapse consecutive repeats of an ASN (path prepending) into one hop
        hops = [k for k, g in groupby(elem.fields['as-path'].split(" "))]
        if len(hops) > 1 and hops[0] == monitor:
            origin = hops[-1]
            # Add every adjacent AS pair on the path as an edge of the graph
            for i in range(0, len(hops) - 1):
                as_graph.add_edge(hops[i], hops[i + 1])
            # Keep the minimum length seen so far (None is filtered out)
            bgp_lens[monitor][origin] = \
                min(filter(bool, [bgp_lens[monitor][origin], len(hops)]))
        elem = rec.get_next_elem()

# Compare each BGP path length with the shortest path in the AS graph
for monitor in bgp_lens:
    for origin in bgp_lens[monitor]:
        nxlen = len(nx.shortest_path(as_graph, monitor, origin))
        print(monitor, origin, bgp_lens[monitor][origin], nxlen)
[Figure: AS path length discrepancy PMF. x-axis: AS path length difference (1 to 11); y-axis: probability, shown on linear and log scales.]
30 LINES OF PYTHON CODE
How many AS paths are longer than the shortest path between two ASes?
28
Python bindings
29
Timely reactive measurements
community attribute to request neighbors drop traffic
event (using 50-100 probes per event)
while black-holing in effect
measurements to capture and investigate transient routing policies
30
Continuous realtime monitoring
BGP data
BGPStream in regular time bins
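Processing a stream "in regular time bins" amounts to mapping each timestamp to the start of its bin and aggregating per bin. The following is a minimal sketch of that step only. The 5-minute bin size, the function names, and the sample timestamps are assumptions, since the slide does not state them.

```python
from collections import Counter

BIN_SIZE = 300  # seconds; an assumed 5-minute bin


def bin_start(timestamp, bin_size=BIN_SIZE):
    """Return the start time of the bin that contains `timestamp`."""
    return timestamp - (timestamp % bin_size)


def count_per_bin(timestamps, bin_size=BIN_SIZE):
    """Count how many timestamps fall into each bin, ordered by bin start."""
    counts = Counter(bin_start(ts, bin_size) for ts in timestamps)
    return dict(sorted(counts.items()))


# Illustrative elem timestamps spanning three consecutive bins.
sample = [1438415400, 1438415455, 1438415699, 1438415700, 1438416001]
bins = count_per_bin(sample)
```

In a monitoring loop the counter would be replaced by whatever per-bin state the analysis needs, flushed each time a timestamp crosses into the next bin.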
Hijacking of AS137 (GARR) - Jan 2015*
*originally described by Dyn Research: http://research.dyn.com/2015/01/vast-world-of-fraudulent-routing/
31
BGP data analysis for the 1%
/github.com/CAIDA/bgpstream
32
Routing table size over time
[Figure: number of IPv4 prefixes (100k to 500k) over time, 2002 to 2016]
33
Transit ASes over time
[Figure (c): Transit ASNs % (IPv4 and IPv6, 10 to 60) and # ASNs (IPv4 and IPv6, 10K to 60K) over time]
42
Continuously rebuilding the state of each peer
minute
routing tables every 4, 8 hours respectively
[Figure: timeline of RIB and update records (1 to 4) for one peer, alternating between UP and DOWN. Labels: RIB start, RIB end, RIB Application, State Established, State Down, Corrupted Record, consistent routing table, unavailable routing table.]
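The figure labels suggest a small per-peer state machine: a peer's table becomes consistent (UP) once a RIB dump has been applied, updates keep it current, and a state-down message or a corrupted record makes it unavailable (DOWN) until the next RIB. The sketch below is one way to express that reading. The class and method names are hypothetical, and it deliberately ignores the timestamp comparisons needed to interleave updates that arrive while a RIB dump is still in progress.

```python
class PeerTable:
    """Tracks one peer's routing table and whether it is currently usable."""

    def __init__(self):
        self.routes = {}       # prefix -> AS path
        self.up = False        # True only while the table is consistent
        self._pending = None   # routes collected from a RIB dump in progress

    def rib_start(self):
        self._pending = {}

    def rib_entry(self, prefix, as_path):
        if self._pending is not None:
            self._pending[prefix] = as_path

    def rib_end(self):
        # RIB application: a completed dump replaces the table and
        # brings the peer UP.
        if self._pending is not None:
            self.routes = self._pending
            self._pending = None
            self.up = True

    def announce(self, prefix, as_path):
        # Updates are only meaningful on top of a consistent table.
        if self.up:
            self.routes[prefix] = as_path

    def withdraw(self, prefix):
        if self.up:
            self.routes.pop(prefix, None)

    def mark_down(self):
        # Called on a state-down message or a corrupted record.
        self.routes = {}
        self._pending = None
        self.up = False


table = PeerTable()
table.announce("198.51.100.0/24", ["64500"])   # ignored: no RIB applied yet
table.rib_start()
table.rib_entry("10.0.0.0/8", ["64500", "64501"])
table.rib_end()                                # peer is now UP
table.announce("198.51.100.0/24", ["64500", "64502"])
table.withdraw("10.0.0.0/8")
```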
43
Removing redundancy in updates
update messages
between successive peer routing tables
compared to updates
[Figure: maximum and average number of BGP elems and diff cells versus time interval (1 to 60 min)]
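The redundancy argument can be illustrated by diffing two snapshots of a peer's routing table: a prefix that flaps and returns to its original path generates several updates but no difference between the snapshots. The sketch below assumes a table is a plain dict from prefix to AS path. The function name diff_tables and the "add"/"change"/"remove" labels are illustrative, not the authors' format for diff cells.

```python
def diff_tables(old, new):
    """Return the changes that turn routing table `old` into `new`.

    Both tables map prefix -> AS path.
    """
    changes = []
    for prefix, path in new.items():
        if prefix not in old:
            changes.append(("add", prefix, path))
        elif old[prefix] != path:
            changes.append(("change", prefix, path))
    for prefix in old:
        if prefix not in new:
            changes.append(("remove", prefix, None))
    return changes


before = {"10.0.0.0/8": "64500 64501", "192.0.2.0/24": "64500 64502"}
# Between the snapshots 10.0.0.0/8 was withdrawn and re-announced with the
# same path (two updates, no net change), 192.0.2.0/24 was withdrawn, and
# 198.51.100.0/24 appeared.
after = {"10.0.0.0/8": "64500 64501", "198.51.100.0/24": "64500 64503"}

diff = diff_tables(before, after)
```

Here four update messages collapse into two diff entries, which is the kind of saving the plot compares across time intervals.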
44
Aligning distributed data into a global view
collectors is available at different times
wait
45
Aligning distributed data into a global view
single architecture
mechanism
Apache Kafka
means excellent scalability
51
There’s lots to be done
52
Coming soon…
prefix any 1.2.3.0/22 and collector rrc06 and aspath '$681_1444_'
… but, we’d happily reprioritize things based on your feedback
53
Easier, faster, less error-prone BGP data analysis
/bgpstream.caida.org
/github.com/caida/bgpstream
54
55
Coming Soon…
56
Getting updates dumps
57
… for the moment
58
CLI with parseable ASCII output
59
Real people are using it!
60
Analyze only what you’re interested in