NetKAT—A Formal System for the Verification of Networks
Dexter Kozen Cornell University AutoMathA 2015 Leipzig 7 May 2015
Carolyn Jane Anderson, Nate Foster, Arjun Guha, Jean-Baptiste Jeannin, Dexter Kozen, Cole Schlesinger, and David Walker, NetKAT: Semantic Foundations for Networks. POPL 14.
Nate Foster, Dexter Kozen, Matthew Milano, Alexandra Silva, and Laure Thompson, A Coalgebraic Decision Procedure for NetKAT. POPL 15.
“The last bastion of mainframe computing” [Hamilton 2009]
◮ Modern computers
  ◮ implemented with commodity hardware
  ◮ programmed using general-purpose languages, standard interfaces
◮ Networks
  ◮ built and programmed the same way since the 1970s
  ◮ low-level, special-purpose devices implemented on custom hardware
  ◮ routers and switches that do little besides maintaining routing tables and forwarding packets
  ◮ configured locally using proprietary interfaces
  ◮ network configuration (“tuning”) largely a black art
Ill-suited to modern data centers and cloud-based applications
◮ Difficult to implement end-to-end routing policies and optimizations that require a global perspective
◮ Difficult to extend with new functionality
◮ Effectively impossible to reason precisely about behavior
Main idea behind SDN
A general-purpose controller manages a collection of programmable switches
◮ controller can monitor and respond to network events
  ◮ new connections from hosts
  ◮ topology changes
  ◮ shifts in traffic load
◮ controller can reprogram the switches on the fly
  ◮ adjust routing tables
  ◮ change packet filtering policies
Controller has a global view of the network
Enables a wide variety of applications:
◮ standard applications
  ◮ shortest-path routing
  ◮ traffic monitoring
  ◮ access control
◮ more sophisticated applications
  ◮ load balancing
  ◮ intrusion detection
  ◮ fault tolerance
“… decoupled, network intelligence and state are logically centralized, and the underlying network infrastructure is abstracted from the applications. As a result, enterprises and carriers gain unprecedented programmability, automation, and network control, enabling them to build highly scalable, flexible networks that readily adapt to changing business needs.”
[Open Networking Foundation, Software-Defined Networking: The New Norm for Networks, 2012]
A first step: the OpenFlow API [McKeown & al., SIGCOMM 08]
◮ specifies capabilities and behavior of switch hardware
◮ a language for manipulating network configurations
◮ very low-level: easy for hardware to implement, difficult for humans to write and reason about
◮ is platform independent
◮ provides an open standard that any vendor can implement
◮ Formally Verifiable Networking [Wang & al., HotNets 09]
◮ FlowChecker [Al-Shaer & Saeed Al-Haj, SafeConfig 10]
◮ Anteater [Mai & al., SIGCOMM 11]
◮ Nettle [Voellmy & Hudak, PADL 11]
◮ Header Space Analysis [Kazemian & al., NSDI 12]
◮ Frenetic [Foster & al., ICFP 11] [Reitblatt & al., SIGCOMM 12]
◮ NetCore [Guha & al., PLDI 13] [Monsanto & al., POPL 12]
◮ Pyretic [Monsanto & al., NSDI 13]
◮ VeriFlow [Khurshid & al., NSDI 13]
◮ Participatory networking [Ferguson & al., SIGCOMM 13]
◮ Maple [Voellmy & al., SIGCOMM 13]
Goals:
◮ raise the level of abstraction above hardware-based APIs (OpenFlow)
◮ make it easier to build sophisticated and reliable SDN applications and reason about them
NetKAT = Kleene algebra with tests (KAT) + additional specialized constructs particular to network topology and packet switching
Stephen Cole Kleene (1909–1994)
(0 + 1(01∗0)∗1)∗    {multiples of 3 in binary}
(ab)∗a = a(ba)∗    {a, aba, ababa, . . .}
(a + b)∗ = a∗(ba∗)∗    {all strings over {a, b}}
[Figures: finite automata for the three expressions]
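The first expression can be checked mechanically. The following is a minimal sketch that transcribes it into Python's `re` syntax (where the Kleene-algebra `+` is written `|`) and compares it with ordinary arithmetic; the function name `is_multiple_of_3` is illustrative, not from the talk.

```python
import re

# (0 + 1(01*0)*1)* written in Python regex syntax: "+" becomes "|".
MULTIPLE_OF_3 = re.compile(r"(0|1(01*0)*1)*")

def is_multiple_of_3(bits: str) -> bool:
    """Return True when the whole binary numeral `bits` matches the expression."""
    return MULTIPLE_OF_3.fullmatch(bits) is not None

# Compare the expression with arithmetic on every number below 512.
agreement = all(
    is_multiple_of_3(format(n, "b")) == (n % 3 == 0) for n in range(512)
)
print(agreement)
```

The check is only a finite sanity test, not a proof; the expression corresponds to the three-state automaton that tracks the remainder modulo 3.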
John Horton Conway (1937–)
J. H. Conway. Regular Algebra and Finite Machines. Chapman and Hall, London, 1971.
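Idempotent Semiring Axioms
$$
\begin{aligned}
p + (q + r) &= (p + q) + r & p(qr) &= (pq)r \\
p + q &= q + p & 1p = p1 &= p \\
p + 0 &= p & p0 = 0p &= 0 \\
p + p &= p & p(q + r) &= pq + pr \\
a \le b \ &\overset{\triangle}{\iff}\ a + b = b & (p + q)r &= pr + qr
\end{aligned}
$$
Axioms for ∗
$$
\begin{aligned}
1 + pp^* &\le p^* & q + px \le x \ &\Rightarrow\ p^*q \le x \\
1 + p^*p &\le p^* & q + xp \le x \ &\Rightarrow\ qp^* \le x
\end{aligned}
$$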
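Regular sets of strings over Σ
$$
A + B = A \cup B \qquad AB = \{xy \mid x \in A,\ y \in B\}
$$
$$
A^* = \bigcup_{n \ge 0} A^n = A^0 \cup A^1 \cup A^2 \cup \cdots \qquad 1 = \{\varepsilon\} \qquad 0 = \emptyset
$$
This is the free KA on generators Σ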
Binary relations on a set X
For R, S ⊆ X × X,
$$
R + S = R \cup S \qquad RS = R \circ S = \{(u, v) \mid \exists w\ (u, w) \in R,\ (w, v) \in S\}
$$
$$
R^* = \text{reflexive transitive closure of } R = \bigcup_{n \ge 0} R^n = R^0 \cup R^1 \cup R^2 \cup \cdots
$$
$$
1 = \text{identity relation} = \{(u, u) \mid u \in X\} \qquad 0 = \emptyset
$$
KA is complete for the equational theory of relational models
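The relational model is easy to experiment with. The sketch below represents relations as Python frozensets of pairs and checks the two Kleene-algebra identities (ab)∗a = a(ba)∗ and (a + b)∗ = a∗(ba∗)∗ over every pair of relations on a two-element set; the helper names `compose` and `star` are illustrative choices.

```python
from itertools import product

def compose(r, s):
    """RS = {(u, v) | there is w with (u, w) in R and (w, v) in S}."""
    return frozenset((u, v) for (u, w1) in r for (w2, v) in s if w1 == w2)

def star(r, universe):
    """Reflexive transitive closure: the union of R^0, R^1, R^2, ..."""
    result = frozenset((u, u) for u in universe)  # R^0 is the identity relation
    while True:
        bigger = result | compose(result, r)
        if bigger == result:
            return result
        result = bigger

X = range(2)
PAIRS = list(product(X, X))
# All 2^4 = 16 binary relations on X, one per bit mask.
RELATIONS = [
    frozenset(p for i, p in enumerate(PAIRS) if (mask >> i) & 1)
    for mask in range(1 << len(PAIRS))
]

# (ab)*a = a(ba)*
sliding = all(
    compose(star(compose(a, b), X), a) == compose(a, star(compose(b, a), X))
    for a in RELATIONS for b in RELATIONS
)
# (a + b)* = a*(ba*)*
denesting = all(
    star(a | b, X) == compose(star(a, X), star(compose(b, star(a, X)), X))
    for a in RELATIONS for b in RELATIONS
)
print(sliding, denesting)
```

Both identities are theorems of Kleene algebra, so they hold in every relational model; the exhaustive check on a tiny carrier set only illustrates that fact.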
◮ Trace models used in semantics
◮ (min, +) algebra used in shortest path algorithms
◮ (max, ·) algebra used in coding
◮ Convex sets used in computational geometry (Iwano & Steiglitz 90)
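$$
\begin{bmatrix} a & b \\ c & d \end{bmatrix} + \begin{bmatrix} e & f \\ g & h \end{bmatrix}
= \begin{bmatrix} a + e & b + f \\ c + g & d + h \end{bmatrix}
$$
$$
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} e & f \\ g & h \end{bmatrix}
= \begin{bmatrix} ae + bg & af + bh \\ ce + dg & cf + dh \end{bmatrix}
\qquad
1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
$$
$$
\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{*}
= \begin{bmatrix} (a + bd^*c)^* & (a + bd^*c)^*bd^* \\ (d + ca^*b)^*ca^* & (d + ca^*b)^* \end{bmatrix}
$$
[Figure: two-state automaton with transitions labeled a, b, c, d]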
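Theorem Any system of n linear inequalities in n unknowns has a unique least solution
$$
\begin{aligned}
q_1 + p_{11}x_1 + p_{12}x_2 + \cdots + p_{1n}x_n &\le x_1 \\
&\ \ \vdots \\
q_n + p_{n1}x_1 + p_{n2}x_2 + \cdots + p_{nn}x_n &\le x_n
\end{aligned}
$$
In matrix form, with P = (p_ij):
$$
\begin{bmatrix} q_1 \\ q_2 \\ \vdots \\ q_n \end{bmatrix}
+ P \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\le \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
$$
Least solution is P∗q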
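As a worked instance of the least-solution theorem, the sketch below uses the (min, +) algebra mentioned earlier: + is `min`, · is numeric addition, 0 is ∞ and 1 is the number 0. Assuming non-negative weights, P∗ can be computed by a Floyd-Warshall style iteration, and P∗q then gives each node's shortest distance to the target marked in q. The graph, the names `mat_star` and `mat_vec`, and the choice of algorithm are assumptions made for illustration.

```python
INF = float("inf")  # the semiring 0 of the (min, +) algebra

def mat_star(P):
    """P* over (min, +), assuming non-negative weights (so every a* is 0)."""
    n = len(P)
    D = [row[:] for row in P]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    for i in range(n):
        D[i][i] = min(D[i][i], 0)  # add the identity matrix (semiring 1 is 0)
    return D

def mat_vec(M, v):
    """Matrix-vector product over (min, +)."""
    return [min(M[i][j] + v[j] for j in range(len(v))) for i in range(len(M))]

# Edge weights: 0 -> 1 costs 1, 0 -> 2 costs 4, 1 -> 2 costs 2.
P = [[INF, 1, 4],
     [INF, INF, 2],
     [INF, INF, INF]]
q = [INF, INF, 0]  # node 2 is the target

x = mat_vec(mat_star(P), q)
print(x)  # shortest distances to node 2

# x satisfies the system with equality: x = q + Px in (min, +) notation.
fixed_point = [
    min(q[i], min(P[i][j] + x[j] for j in range(3))) for i in range(3)
]
```

Note that the Kleene-algebra order is the reverse of the numeric order here (a ≤ b means min(a, b) = b), so the least solution in the algebra is the numerically largest consistent distance assignment, which is the true shortest-path vector.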
(K, B, +, ·, ∗, ¯, 0, 1), B ⊆ K
◮ (K, +, ·, ∗, 0, 1) is a Kleene algebra
◮ (B, +, ·, ¯, 0, 1) is a Boolean algebra
◮ (B, +, ·, 0, 1) is a subalgebra of (K, +, ·, 0, 1)
◮ p, q, r, . . . range over K
◮ a, b, c, . . . range over B
+, ·, 0, 1 serve double duty
◮ applied to actions, denote choice, composition, fail, and skip, respectively
◮ applied to tests, denote disjunction, conjunction, falsity, and truth, respectively
◮ these usages do not conflict!
bc = b ∧ c    b + c = b ∨ c
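$$
\begin{aligned}
p ; q \ &\overset{\triangle}{=}\ pq \\
\textbf{if } b \textbf{ then } p \textbf{ else } q \ &\overset{\triangle}{=}\ bp + \bar{b}q \\
\textbf{while } b \textbf{ do } p \ &\overset{\triangle}{=}\ (bp)^*\bar{b}
\end{aligned}
$$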
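$$
\{b\}\ p\ \{c\} \ \overset{\triangle}{\iff}\ bp \le pc \iff bp = bpc \iff bp\bar{c} = 0
$$
The Hoare while rule
$$
\frac{\{bc\}\ p\ \{c\}}{\{c\}\ \textbf{while } b \textbf{ do } p\ \{\bar{b}c\}}
$$
becomes the universal Horn sentence
$$
bcp\bar{c} = 0 \ \Rightarrow\ c(bp)^*\bar{b}\,\overline{\bar{b}c} = 0
$$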
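The KAT encodings of while loops and Hoare triples can be tried out in a small relational model. In the sketch below, states are the integers 0 to 5, tests are subsets of the identity relation, the program p decrements a counter, and the loop `while x > 0 do x := x - 1` is built as (bp)∗ followed by the negation of b. The state space, the invariant c (x ≤ 3), and all helper names are assumptions chosen for illustration.

```python
STATES = range(6)
IDENTITY = frozenset((s, s) for s in STATES)

def as_test(pred):
    """A test is a subset of the identity relation."""
    return frozenset((s, s) for s in STATES if pred(s))

def neg(b):
    """Boolean complement of a test."""
    return IDENTITY - b

def seq(*rels):
    """Left-to-right relational composition of several relations."""
    result = IDENTITY
    for r in rels:
        result = frozenset((u, v) for (u, w1) in result for (w2, v) in r if w1 == w2)
    return result

def star(r):
    """Reflexive transitive closure."""
    result = IDENTITY
    while True:
        bigger = result | seq(result, r)
        if bigger == result:
            return result
        result = bigger

b = as_test(lambda x: x > 0)                       # loop guard
c = as_test(lambda x: x <= 3)                      # invariant
p = frozenset((x, max(x - 1, 0)) for x in STATES)  # x := x - 1, floored at 0

while_loop = seq(star(seq(b, p)), neg(b))          # (bp)* followed by not-b

premise = seq(b, c, p, neg(c))                     # bcp(not c): the triple {bc} p {c}
conclusion = seq(c, while_loop, neg(seq(neg(b), c)))
print(premise == frozenset(), conclusion == frozenset())
```

Both compositions come out empty, which is the relational reading of the premise and conclusion of the Horn sentence for this particular loop.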
Deductive Completeness and Complexity
◮ deductively complete over language, relational, and trace models
◮ subsumes propositional Hoare logic (PHL)
◮ deductively complete for all relationally valid Hoare-style rules
$$
\frac{\{b_1\}\ p_1\ \{c_1\},\ \ldots,\ \{b_n\}\ p_n\ \{c_n\}}{\{b\}\ p\ \{c\}}
$$
◮ decidable in PSPACE
Applications
◮ protocol verification
◮ static analysis and abstract interpretation
◮ verification of compiler optimizations
◮ Language-theoretic models
  ◮ K = sets of guarded strings over Σ, T
  ◮ B = free Boolean algebra generated by T
◮ Relational models
  ◮ K = binary relations on a set X
  ◮ B = subsets of the identity relation
◮ Trace models
  ◮ K = sets of traces s_0 p_0 s_1 p_1 s_2 · · · s_{n−1} p_{n−1} s_n
  ◮ B = traces of length 0
◮ n × n matrices over K, B
◮ a packet π is an assignment of constant values n to fields x
◮ a packet history is a nonempty sequence of packets π_1 :: π_2 :: · · · :: π_k
◮ the head packet is π_1
NetKAT
◮ assignments x ← n
assign constant value n to field x in the head packet
◮ tests x = n
if value of field x in the head packet is n, then pass, else drop
◮ dup
duplicate the head packet
Example
sw = 6 ; pt = 88 ; dest ← 10.0.0.1 ; pt ← 50
“For all packets incoming on port 88 of switch 6, set the destination IP address to 10.0.0.1 and send the packet out on port 50.”
◮ x ← n ; y ← m ≡ y ← m ; x ← n (x ≠ y)
  assignments to distinct fields may be done in either order
◮ x ← n ; y = m ≡ y = m ; x ← n (x ≠ y)
  an assignment to a field does not affect a different field
◮ x = n ; dup ≡ dup ; x = n
  field values are preserved in a duplicated packet
◮ x ← n ≡ x ← n ; x = n
  an assignment causes the field to have that value
◮ x = n ; x ← n ≡ x = n
  an assignment of a value that the field already has is redundant
◮ x ← n ; x ← m ≡ x ← m
  a second assignment to the same field overrides the first
◮ x = n ; x = m ≡ 0 (n ≠ m) and Σ_n (x = n) ≡ 1
  every field has exactly one value
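Standard model of NetKAT is a packet-forwarding model
$$
[\![e]\!] : H \to 2^H \qquad \text{where } H = \{\text{packet histories}\}
$$
$$
\begin{aligned}
[\![x \leftarrow n]\!](\pi_1 :: \sigma) \ &\overset{\triangle}{=}\ \{\pi_1[n/x] :: \sigma\} \\
[\![x = n]\!](\pi_1 :: \sigma) \ &\overset{\triangle}{=}\
\begin{cases} \{\pi_1 :: \sigma\} & \text{if } \pi_1(x) = n \\ \emptyset & \text{if } \pi_1(x) \ne n \end{cases} \\
[\![\mathsf{dup}]\!](\pi_1 :: \sigma) \ &\overset{\triangle}{=}\ \{\pi_1 :: \pi_1 :: \sigma\} \\
[\![p + q]\!](\sigma) \ &\overset{\triangle}{=}\ [\![p]\!](\sigma) \cup [\![q]\!](\sigma) \\
[\![p ; q]\!](\sigma) \ &\overset{\triangle}{=}\ \bigcup_{\tau \in [\![p]\!](\sigma)} [\![q]\!](\tau) \\
[\![p^*]\!](\sigma) \ &\overset{\triangle}{=}\ \bigcup_{n} [\![p^n]\!](\sigma) \\
[\![1]\!](\sigma) \ &\overset{\triangle}{=}\ [\![\mathsf{pass}]\!](\sigma) = \{\sigma\} \\
[\![0]\!](\sigma) \ &\overset{\triangle}{=}\ [\![\mathsf{drop}]\!](\sigma) = \emptyset
\end{aligned}
$$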
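The packet-forwarding semantics translates almost line for line into a small interpreter. The sketch below is an illustrative encoding, not the Frenetic implementation: a packet is a sorted tuple of (field, value) pairs, a history is a tuple of packets with the head first, and each policy is a Python function from a history to a set of histories. All function names are assumptions.

```python
def packet(**fields):
    """A packet assigns values to fields; stored as a sorted tuple of pairs."""
    return tuple(sorted(fields.items()))

def assign(x, n):
    """x <- n: set field x of the head packet to n."""
    def run(history):
        head = dict(history[0])
        head[x] = n
        return {(tuple(sorted(head.items())),) + history[1:]}
    return run

def field_is(x, n):
    """x = n: pass the history if the head packet has x = n, else drop it."""
    return lambda history: {history} if dict(history[0]).get(x) == n else set()

def dup(history):
    """Duplicate the head packet."""
    return {(history[0],) + history}

def union(p, q):
    """p + q: union of the two result sets."""
    return lambda history: p(history) | q(history)

def seq(*policies):
    """p ; q ; ...: feed every output of one policy into the next."""
    def run(history):
        current = {history}
        for policy in policies:
            current = {out for h in current for out in policy(h)}
        return current
    return run

def star(p):
    """p*: iterate p until no new histories appear (may diverge if p uses dup)."""
    def run(history):
        seen, frontier = {history}, {history}
        while frontier:
            frontier = {out for h in frontier for out in p(h)} - seen
            seen |= frontier
        return seen
    return run

# The example policy: sw = 6 ; pt = 88 ; dest <- 10.0.0.1 ; pt <- 50
policy = seq(field_is("sw", 6), field_is("pt", 88),
             assign("dest", "10.0.0.1"), assign("pt", 50))
incoming = (packet(sw=6, pt=88, dest="10.0.0.9"),)
print(policy(incoming))
```

Representing policies as closures keeps the sketch short; a real tool would instead work on syntax so that equivalence can be decided rather than tested on sample packets.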
Reachability
◮ Can host A communicate with host B? Can every host communicate with every other host?
Security
◮ Does all untrusted traffic pass through the intrusion detection system located at C?
Loop detection
◮ Is it possible for a packet to be forwarded around a cycle in the network?
Modeling Links
sw = A ; pt = n ; sw ← B ; pt ← m
[Figure: link from port n of switch A to port m of switch B]
◮ filters out all packets not located at the source end of the link
◮ updates switch and port fields to the location of the target end
◮ this captures the effect of sending the packet across the link
◮ network topology is expressed as a sum of link expressions
Switch behavior for switch A is specified by a NetKAT term
sw = A ; p_A
where p_A specifies what to do with packets entering switch A
Example
pt = n ; dest = a ; dest ← b ; (pt ← m + pt ← k)
Incoming packets on port n with destination a ⇒ modify destination to b and send out on ports m and k
Switch policy p_A is the sum of all such behaviors for A
Let
◮ t = sum of all link expressions
◮ p = sum of all switch policies
Then
◮ pt = one step of the network
  ◮ each switch processes its packets, then sends them along links to the next switch
  ◮ cross terms vanish! (x ← n ; x = m ≡ 0 for n ≠ m)
◮ (pt)∗ = the multistep behavior of the network in which the single-step behavior is iterated
To check whether any packet can travel from A to B given the topology and the switch policies, ask whether
sw = A ; t(pt)∗ ; sw = B ≢ 0 (drop).
◮ prefix sw = A filters out packets not at A
◮ suffix sw = B filters out packets not at B
It can be shown that the left-hand side is equivalent to a sum of terms of the form
sw = A ; x_1 = n_1 ; · · · ; x_k = n_k ; x_1 ← m_1 ; · · · ; x_k ← m_k ; sw = B
each describing conditions under which a packet can travel from A to B
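On a concrete network the reachability question reduces to a search over packet locations. The sketch below is a simplification that tracks only the sw and pt fields and ignores every other header: `t` plays the role of the topology term, `p` the sum of switch policies, and `reachable` evaluates sw = A ; t(pt)∗ ; sw = B by iterating the single step pt to a fixed point. The three-switch network and all names are hypothetical.

```python
# Links map the source end (switch, port) to the target end (switch, port).
LINKS = {("A", 2): ("B", 1), ("B", 2): ("C", 1)}
# Switch policies: (switch, input port) -> set of output ports.
FORWARD = {("B", 1): {2}}

ALL_LOCATIONS = set(LINKS) | set(LINKS.values()) | set(FORWARD)

def t(loc):
    """Topology: move a packet across the link that starts at loc, if any."""
    return {LINKS[loc]} if loc in LINKS else set()

def p(loc):
    """Switch policy: move a packet from its input port to the output ports."""
    sw, _ = loc
    return {(sw, out) for out in FORWARD.get(loc, ())}

def step(loc):
    """One network step pt: apply the switch policy, then cross a link."""
    return {after for mid in p(loc) for after in t(mid)}

def reachable(src, dst):
    """Is  sw = src ; t(pt)* ; sw = dst  different from drop?"""
    frontier = {r for loc in ALL_LOCATIONS if loc[0] == src for r in t(loc)}
    seen = set(frontier)
    while frontier:
        frontier = {r for loc in frontier for r in step(loc)} - seen
        seen |= frontier
    return any(sw == dst for sw, _ in seen)

print(reachable("A", "C"), reachable("C", "A"))
```

In this toy network traffic leaving A on port 2 reaches C through B, while nothing leaves C, so the two queries print `True False`.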
To check whether every host in the network can physically communicate with every other host, use switch policies
$$
sw = A \ ;\ \sum_{n} pt = n \ ;\ \sum_{m} pt \leftarrow m
$$
where
◮ n ranges over all active input ports of A
◮ m ranges over all active output ports of A
Let
◮ q = sum of these policies for all A
◮ t = encoding of the topology
Then check whether
$$
(qt)^* \equiv \sum_{A, n} (sw = A \ ;\ pt = n) \ ;\ \sum_{B, m} (sw \leftarrow B \ ;\ pt \leftarrow m)
$$
A waypoint between A and B is a location F that all packets must traverse en route from A to B
◮ modify F’s switch policy to duplicate the head packet:
  sw = F ; p_F ⇒ sw = F ; dup ; p_F
◮ this marks traffic through F
◮ check whether
  sw = A ; t(pt)∗ ; sw = B ≤ sw = A ; t(pt)∗ ; sw = F ; dup ; p_F ; t(pt)∗ ; sw = B
◮ true if and only if all packet histories contain a dup generated by traversing F
A network has a forwarding loop if some packet would endlessly traverse a cycle in the network
◮ frequent source of error
◮ have caused major outages in LANs and the Internet
◮ usually handled by a TTL (time-to-live) field
To check for loops, ask whether a packet can visit the same state twice. The network is loop-free if
α ; pt(pt)∗ ; α = 0
for each valuation α such that in ; (pt)∗ ; α does not vanish.
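On a finite state graph the loop check becomes a search for a reachable state that can return to itself in one or more steps. The sketch below assumes the single-step behavior pt has already been flattened into a dictionary from each state to its successors; that representation and the names `reach_plus` and `has_loop` are illustrative assumptions, not the NetKAT decision procedure.

```python
def reach_plus(step, start):
    """States reachable from `start` in one or more steps: pt(pt)*."""
    frontier = set(step.get(start, ()))
    seen = set(frontier)
    while frontier:
        frontier = {n for s in frontier for n in step.get(s, ())} - seen
        seen |= frontier
    return seen

def has_loop(step, ingress):
    """True if some state reachable from an ingress state can revisit itself."""
    reachable = set(ingress)          # in ; (pt)* : zero or more steps
    for s in ingress:
        reachable |= reach_plus(step, s)
    return any(s in reach_plus(step, s) for s in reachable)

looping = {"a": {"b"}, "b": {"c"}, "c": {"b"}}       # b -> c -> b is a cycle
loop_free = {"a": {"b"}, "b": {"c"}}
unreachable_cycle = {"a": {"b"}, "x": {"x"}}         # cycle never entered from a

print(has_loop(looping, {"a"}),
      has_loop(loop_free, {"a"}),
      has_loop(unreachable_cycle, {"a"}))
```

The third case shows why the check is restricted to valuations reachable from the ingress: a cycle that no real traffic can enter is not reported.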
◮ traffic isolation
◮ access control
◮ correctness of a compiler that maps a NetKAT expression to a set of individual flow tables that can be deployed on the switches
Soundness and Completeness [Anderson et al. 14]
◮ ⊢ p = q if and only if ⟦p⟧ = ⟦q⟧
Decision Procedure [Foster et al. 15]
◮ NetKAT coalgebra
◮ efficient bisimulation-based decision procedure
◮ implementation in OCaml
◮ deployed in the Frenetic suite of network management tools
Let e be an expression to be analyzed and let x_1, . . . , x_k be all fields appearing in e.
◮ A complete assignment is a sequence x_1 ← n_1 ; · · · ; x_k ← n_k
◮ A complete test is a sequence x_1 = n_1 ; · · · ; x_k = n_k
Facts:
◮ Every test is (provably) equivalent to a sum of complete tests.
◮ Every assignment is (provably) equivalent to a sum of complete tests and complete assignments.
◮ The complete tests and complete assignments are in one-to-one correspondence (one of each for each tuple (n_1, . . . , n_k))
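Let P = {complete assignments} = {p, q, . . .} and At = {complete tests} = {α, β, . . .}
Let α_p be the complete test corresponding to the complete assignment p
Reduced NetKAT axioms:
$$
\begin{aligned}
\alpha\ \mathsf{dup} &= \mathsf{dup}\ \alpha & \alpha\alpha &= \alpha \\
p\,\alpha_p &= p & \alpha\beta &= 0,\ \ \alpha \ne \beta \\
\alpha_p\,p &= \alpha_p & qp &= p
\end{aligned}
$$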
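Regular sets of NetKAT reduced strings
$$
N = At \cdot P \cdot (\mathsf{dup} \cdot P)^*
$$
For A, B ⊆ N,
$$
A + B = A \cup B \qquad AB = \{\alpha x y q \mid \alpha x p \in A,\ \alpha_p y q \in B\}
$$
$$
A^* = \bigcup_{n \ge 0} A^n \qquad 1 = \{\alpha_p\,p \mid p \in P\} \qquad 0 = \emptyset
$$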
◮ p ∈ P interpreted as {αp | α ∈ At} ◮ α ∈ At interpreted as {αpα} ◮ dup interpreted as {αpp dup αp | p ∈ P}
Lemma Every string over P, At, and dup is equivalent to a sum of strings in N
Theorem [Anderson & al. 14]
◮ Reg_N, the family of regular subsets of N, forms a NetKAT and is isomorphic to the standard packet-switching model
◮ This is the free NetKAT on generators P and At
◮ The following are equivalent:
  ◮ NetKAT ⊢ e_1 = e_2
  ◮ ⟦e_1⟧ = ⟦e_2⟧
  ◮ Reg_N ⊨ e_1 = e_2
A NetKAT automaton is a tuple (S, ε, δ) where
$$
\varepsilon : S \to 2^{At \times At} \qquad \delta : S \to S^{At \times At}
$$
Acceptance of strings in N = At · P · (dup · P)∗ is defined by
$$
\begin{aligned}
\mathrm{Accept}(s,\ \alpha\,p_\beta\ \mathsf{dup}\ x) \ &\overset{\triangle}{=}\ \mathrm{Accept}(\delta_{\alpha\beta}(s),\ \beta x) \\
\mathrm{Accept}(s,\ \alpha\,p_\beta) \ &\overset{\triangle}{=}\ \varepsilon_{\alpha\beta}(s)
\end{aligned}
$$
The final coalgebra is
$$
\varepsilon : 2^N \to 2^{At \times At} \qquad \delta : 2^N \to (2^N)^{At \times At}
$$
$$
\varepsilon_{\alpha\beta}(A) = \begin{cases} 1 & \alpha\,p_\beta \in A \\ 0 & \alpha\,p_\beta \notin A \end{cases}
\qquad
\delta_{\alpha\beta}(A) = \{\beta x \mid \alpha\,p_\beta\ \mathsf{dup}\ x \in A\}
$$
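$$
E : \mathrm{Exp} \to 2^{At \times At} \qquad D : \mathrm{Exp} \to \mathrm{Exp}^{At \times At}
$$
$$
\begin{aligned}
E(e_1 + e_2) &= E(e_1) + E(e_2) & D(e_1 + e_2) &= D(e_1) + D(e_2) \\
E(e_1 e_2) &= E(e_1) \cdot E(e_2) & D(e_1 e_2) &= D(e_1) \cdot I(e_2) + E(e_1) \cdot D(e_2) \\
E(e^*) &= E(e)^* & D(e^*) &= E(e)^* D(e) I(e^*) \\
E_{\alpha\beta}(b) &= \begin{cases} 1 & \alpha = \beta \le b \\ 0 & \text{otherwise} \end{cases} & D(b) &= 0 \\
E_{\alpha\beta}(p) &= \begin{cases} 1 & \beta = \alpha_p \\ 0 & \text{otherwise} \end{cases} & D(p) &= 0 \\
E(\mathsf{dup}) &= 0 & D_{\alpha\beta}(\mathsf{dup}) &= \begin{cases} \alpha & \beta = \alpha \\ 0 & \text{otherwise} \end{cases}
\end{aligned}
$$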
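Theorem
◮ For every finite NetKAT automaton M, the set of strings accepted by M is L(e) for some NetKAT expression e.
◮ For every NetKAT expression e, there is a NetKAT automaton M with at most |At| · 2^ℓ states accepting L(e), where ℓ is the number of occurrences of dup in e.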
To check e_1 = e_2, convert to automata, check bisimilarity
◮ exploits a sparse matrix representation
◮ Hopcroft-Karp union-find data structure to represent bisimilarity classes
◮ BDDs to represent tests (new, based on Pous, POPL 15)
◮ algorithm is competitive with state of the art
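The union-find idea can be illustrated on ordinary deterministic finite automata. The sketch below is the classic Hopcroft-Karp equivalence check for complete DFAs, a deliberately simplified stand-in: NetKAT automata have observations and transitions indexed by At × At, and the actual implementation is in OCaml. The tuple layout `(start, accepting, delta)` and the function name `equivalent` are assumptions.

```python
def equivalent(dfa1, dfa2, alphabet):
    """Hopcroft-Karp check: each DFA is (start, accepting set, delta dict).

    delta maps (state, symbol) to the next state and must be total.
    """
    (s1, acc1, d1), (s2, acc2, d2) = dfa1, dfa2
    parent = {}

    def find(x):
        """Union-find lookup with path halving."""
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    todo = [(s1, s2)]
    while todo:
        a, b = todo.pop()
        ra, rb = find((1, a)), find((2, b))   # tag states by automaton
        if ra == rb:
            continue                          # already assumed bisimilar
        if (a in acc1) != (b in acc2):
            return False                      # observations differ
        parent[ra] = rb                       # merge the two classes
        for c in alphabet:
            todo.append((d1[a, c], d2[b, c]))
    return True

# Even number of 'a's, as a 2-state and as a 4-state automaton.
even2 = (0, {0}, {(0, "a"): 1, (1, "a"): 0})
even4 = (0, {0, 2}, {(i, "a"): (i + 1) % 4 for i in range(4)})
# Number of 'a's divisible by 3.
mod3 = (0, {0}, {(i, "a"): (i + 1) % 3 for i in range(3)})

print(equivalent(even2, even4, "a"), equivalent(even2, mod3, "a"))
```

Merging classes eagerly means each pair of states is examined at most once per union, which is what keeps the check close to linear in the number of states.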
[Figure: time to solve (s) for three benchmarks: All-Pairs Connectivity, Loop Freedom, Translation Validation]
◮ Automata/regular expressions have a key role to play in emerging platforms for managing software-defined networks
◮ NetKAT is a high-level language for programming and reasoning about network behavior in the SDN paradigm
  ◮ based on sound mathematical principles
  ◮ formal denotational semantics, complete deductive system
  ◮ efficient bisimulation-based decision procedure
  ◮ lots of applications and abstractions: reachability, noninterference, cycle detection, fault tolerance, load balancing, QoS, virtual
◮ Future work:
  ◮ further optimizations to reduce state space
  ◮ probabilistic semantics
  ◮ generating proof artifacts
For papers and code, please visit: http://frenetic-lang.org/