Network layer (IP) Network layer Transport segment from sending to - - PowerPoint PPT Presentation
Network layer
Transport segment from sending to receiving host
Network layer protocols in every host, router
Many historical examples, but only one really matters…
Network layer functions
- 1. Connection setup
datagram vs. connection-oriented, host-to-host connection
- 2. Delivery semantics
unicast, broadcast, multicast, anycast
in-order, any-order
- 3. Security
secrecy, integrity, authenticity
- 4. Demux to upper layer
next protocol: transport or network (tunneling)
- 5. Quality-of-service
predictable performance
- 6. Routing
path selection/forwarding
- 7. Addressing
flat vs. hierarchical, global vs. local, variable vs. fixed length
- X. Fragmentation
break up packets based on data-link layer MTU
initially for interoperability, now used mostly by adversaries (fragrouter)
The Internet Network layer
Host, router network layer functions:
forwarding table
Routing protocols
- path selection
- RIP, OSPF, BGP
IP protocol
- addressing conventions
- datagram format
- packet handling conventions
ICMP protocol
- error reporting
- router “signaling”
[Figure: the network layer sits between the transport layer (TCP, UDP) above and the link and physical layers below]
IP datagram format
Header layout (each row is 32 bits):
ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | Internet checksum
32 bit source IP address
32 bit destination IP address
Options (if any)
data (variable length, typically a TCP or UDP segment)
Field notes:
ver: IP protocol version number
head. len: header length (stored in 32-bit words; 5 words = 20 bytes when there are no options)
type of service: "type" of data
length: total datagram length (bytes)
16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
time to live: max number of remaining hops (decremented at each router)
upper layer: upper layer protocol to deliver payload to
Options: e.g. timestamp, record route taken, specify list of routers to visit
How much overhead with TCP?
20 bytes of TCP + 20 bytes of IP = 40 bytes + app-layer overhead
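As a rough illustration of the datagram format, the fixed 20-byte part of an IPv4 header can be unpacked with Python's `struct` module. This is a minimal sketch: the function name `parse_ipv4_header`, the dictionary keys, and the hand-built sample header are assumptions made for the example, and options are not decoded.

```python
import socket
import struct

def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": ver_ihl >> 4,                 # high 4 bits of the first byte
        "header_len": (ver_ihl & 0x0F) * 4,      # stored in 32-bit words, converted to bytes
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "flags": flags_frag >> 13,               # top 3 bits
        "frag_offset": flags_frag & 0x1FFF,      # low 13 bits
        "ttl": ttl,
        "protocol": proto,
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }

# Hand-built sample header (illustrative values, checksum left as 0).
sample = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 40, 0x1234, 0x4000, 64, 6, 0,
    socket.inet_aton("10.0.0.1"), socket.inet_aton("128.119.40.186"),
)
hdr = parse_ipv4_header(sample)
print(hdr["version"], hdr["header_len"], hdr["protocol"], hdr["src"], hdr["dst"])
```

The sample uses total length 40, matching the 20 + 20 bytes of IP and TCP headers with no payload.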
- 1. IP connection setup
Hourglass design
No support for network layer connections
Unreliable datagram service
Out-of-order delivery possible
Connection semantics only at higher layer
Compare to ATM (asynchronous transfer mode) and phone network…
Connection-oriented network layers
Circuit abstraction
ATM, frame relay, X.25, phone network Model
Call setup and teardown for each call Guaranteed performance during call
Network support
Every router maintains “state” for each passing circuit Resources allocated per call
[Figure: protocol stacks (application, transport, network, data link, physical) at the two end hosts]
- 1. Initiate call
- 2. Incoming call
- 3. Accept call
- 4. Call connected
- 5. Data flow begins
- 6. Receive data
Connectionless network layers
Postal service abstraction (Internet)
Model
No call setup or teardown at network layer No service guarantees
Network support
No state within network on end-to-end connections Packets forwarded based on destination host ID
[Figure: protocol stacks (application, transport, network, data link, physical) at the two end hosts]
- 1. Send data
- 2. Receive data
- 2. IP delivery semantics
No reliability guarantees
No ordering guarantees
Mostly unicast
No broadcast: packets to 255.255.255.255 are not forwarded
No multicast (supported in address space, but no longer used)
224.0.0.0 to 239.255.255.255
In its heyday, used to broadcast a Rolling Stones concert in November 1994
Mick Jagger: "I wanna say a special welcome to everyone that's, uh, climbed into the Internet tonight and, uh, has got into the M-bone. And I hope it doesn't all collapse."
Recently, anycast
IP address that has many machines associated with it
"Reach any one of them"
Done with some routing protocol hacks…
Cloudflare's 1.1.1.1
- 3. IP security
Weak support for integrity
IP header checksum covers only the header
Leaves data integrity to TCP/UDP
No support for secrecy
No support for authenticity
Even source IP address can be faked!
Hosts trusted to provide legitimate address in packets (leads to IP spoofing attacks)
IPsec
Retrofit IP network layer with encryption and
authentication
- 4. IP demux to upper layer
Protocol type field
1 = ICMP
4 = IP in IP
6 = TCP
17 = UDP
41 = IPv6 encapsulation within IPv4
47 = GRE (Generic Routing Encapsulation)
88 = EIGRP
89 = OSPF
https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers
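Demultiplexing on the protocol type field amounts to a table lookup. A minimal sketch, assuming a hypothetical `demux` helper and a table limited to the numbers listed above; the `socket` module's own constants are printed as a cross-check:

```python
import socket

# Protocol numbers taken from the list above; names are for display only.
PROTOCOLS = {
    1: "ICMP",
    4: "IP in IP",
    6: "TCP",
    17: "UDP",
    41: "IPv6 in IPv4",
    47: "GRE",
    88: "EIGRP",
    89: "OSPF",
}

def demux(protocol_number: int) -> str:
    """Return the upper-layer protocol that should receive the payload."""
    return PROTOCOLS.get(protocol_number, "unknown")

print(demux(6), demux(17), demux(200))
# The standard library exposes the same numbers as constants.
print(socket.IPPROTO_ICMP, socket.IPPROTO_TCP, socket.IPPROTO_UDP)
```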
- 5. IP quality of service
“Type-of-service” field intended to support quality
Ignored by most routers for the first 15 years of deployment
Mid 90s: Add circuits to the Internet! (intserv & RSVP)
Provide applications with performance guarantees
Per-flow end-to-end QoS support
Per-flow signaling and network resource allocation
Failed
Complexity (pinning routes, per-flow signaling and scheduling)
Economics
Providers had no incentive to deploy
QoS is a weak-link property: every device on the end-to-end path must support it
Eventual move towards diffserv (priority marking with TOS bits) and over-provisioning
Heavily-used in traffic engineering
- 6. IP routing
Internet routing done via hop-by-hop forwarding based on destination IP address
Each router has a forwarding table of
Destination IP => Next-Hop IP address
Each router runs a routing protocol and algorithm to create its forwarding table
Routing protocols and algorithms
Graph abstraction for routing algorithms
Routing algorithms find minimum cost paths through graph
Goal: determine "good" path (sequence of routers) thru network from source to dest.
[Figure: graph of six routers A through F joined by links with costs 1, 2, 3, and 5]
Types of routing algorithms
Global information
all routers have complete topology, link cost info
"link state" algorithms using Dijkstra's shortest-path algorithm
Decentralized information
router knows physically-connected neighbors, link costs to neighbors
iterative process of computation, exchange of info with neighbors
"distance vector" algorithms using Bellman-Ford's distributed shortest-path algorithm
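A link-state router with the full topology can run Dijkstra's algorithm locally. The sketch below is illustrative only: the six routers match the slide's figure, but the assignment of costs to specific links in `GRAPH` is an assumption, since the figure's edges did not survive extraction.

```python
import heapq

# Hypothetical link costs between routers A-F (assumed for illustration).
GRAPH = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}

def dijkstra(graph, source):
    """Return the minimum path cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry; node already finalized
        done.add(node)
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return dist

distances = dijkstra(GRAPH, "A")
print(distances)
```

A distance-vector router would reach the same costs iteratively by exchanging its current estimates with neighbors instead of holding the whole graph.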
Routing issues
Want scale
200 million+ destinations
Can't store all destinations in routing tables; none of the algorithms work well at that scale
Link and routing table exchanges would swamp links!
Flat routing does not scale
In addition, want administrative autonomy
Internet = network of networks
Network admins need to control routing in their own networks
Motivates hierarchical routing
Key observation: need less information with increasing distance to destination
Saves table size and reduces update traffic
Route changes within PSU are not seen by anyone beyond it
Internet Routing Hierarchy
Divide network into areas called "autonomous systems" (AS)
Each with its own administrative autonomy
Within AS
Routers run same routing protocol
"Intra-AS" or Interior Gateway routing protocol (IGP)
Each node has
Routes to every other node in area
Routes to get to any nodes outside of area
Packets destined outside of area routed to nearest appropriate border router
Between ASes
Border routers run "Inter-AS" or Border Gateway routing protocol (BGP) with border routers in other ASes
ASes, the IP addresses they "own", and their location/country are well-known
Important for attribution of attacks and mistakes
Internet Routing Hierarchy
Key: Nodes in A have no information about individual
nodes in either B or C (only aggregate route to them)
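More in addressing section…
[Figure: three autonomous systems A, B, and C, each with interior routers labeled a through d, linked by border routers A.a, A.c, B.a, and C.b]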
Internet: Network of networks
At center: “tier-1” ISPs (e.g., Verizon, Sprint,
CenturyLink, AT&T, Cable and Wireless)
National/international coverage
Peer with each other in multiple geographic locations in
major cities
Tier-1 ISP: Level 3 / CenturyLink
We made the list! Which building do most of Portland's packets go through?
…
Pittock Block (where PSU peers with Google)
Named after the same Pittock with the mansion
Former electrical substation housed in enormous (and infamous) sub-basement
http://www.oregonlive.com/portland/index.ssf/2001/05/historic_pittock_building_hous.html
http://www.oregonlive.com/silicon-forest/index.ssf/2012/12/the_basement_subterranean_visi.html
NW Access Exchange members
https://www.nwax.net/Members
Internet structure: network of networks
"Tier-2" ISPs: smaller (often regional) ISPs
"Tier-3" ISPs and local ISPs
[Figure: three Tier 1 ISPs at the center, surrounded by Tier-2 ISPs, which in turn connect Tier 3 and local ISPs]
Internet structure: network of networks
a packet passes through many networks!
Inter-AS routing
Done using BGP (Border Gateway Protocol)
Uses distance-vector style algorithms Treats each AS as a node in a graph
BGP messages exchanged using TCP.
Advantages:
Simplifies BGP
Disadvantages
BGP TCP spoofing attack Congestion control on a routing protocol?
Poor interaction during high load (Code Red)
Lack of trust and authentication in route advertisements
Trust and routing
Route advertisements are not authenticated (no public-key infrastructure)
Routes that are more specific are preferentially taken
Issue
Anyone can advertise a more-specific route to Google to redirect traffic towards itself
Trust and routing
Like Pakistan did (2008)
Trust and routing
Like Google did to Japan (2017!)
Trust and routing
Or that Russia did to us?
Not likely to be solved…
But we still keep on trying…
- 7. IP addressing
IP address:
32-bit identifier for host/router network interface
Specified by individual bytes
Total IP address space: about 4 billion addresses
Associated with an interface
Routers typically have multiple interfaces
Host may have multiple interfaces
IP addresses associated with interface, not host or router
131.252.220.1 = 10000011 11111100 11011100 00000001
(octets: 131, 252, 220, 1)
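The dotted-quad to binary conversion above can be reproduced with the standard `ipaddress` module. A small sketch; the helper name `to_binary` is an assumption for the example:

```python
import ipaddress

def to_binary(dotted: str) -> str:
    """Render an IPv4 address as four space-separated 8-bit groups."""
    value = int(ipaddress.IPv4Address(dotted))   # the address as a 32-bit integer
    bits = format(value, "032b")
    return " ".join(bits[i:i + 8] for i in range(0, 32, 8))

print(to_binary("131.252.220.1"))
```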
IP addressing
IP address:
Network part (high order bits)
Host part (low order bits)
What's a network?
all interfaces that can physically reach each other without an intervening router
each interface shares the same network part of its IP address
[Figure: network consisting of 3 IP networks (for IP addresses starting with 223, first 24 bits are network address)]
LAN 223.1.1.0/24: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4
LAN 223.1.2.0/24: 223.1.2.1, 223.1.2.2, 223.1.2.9
LAN 223.1.3.0/24: 223.1.3.1, 223.1.3.2, 223.1.3.27
Initial allocation
256 networks each with 16 million hosts
Modeled after telecom national networks
Routing table with only 256 entries!
Problem: one size does not fit all
Then, classful addressing
Split into classes to have smaller networks
Class A: 128 networks, 16M hosts
1.0.0.0 to 127.255.255.255
Class B: 16K networks, 64K hosts
128.0.0.0 to 191.255.255.255
Class C: 2M networks (!), 256 hosts
192.0.0.0 to 223.255.255.255
Multicast + reserved
224.0.0.0 to 255.255.255.255
Initial IP address classes
Class  Leading bits  Network ID  Host ID  Address range
A      0             8 bits      24 bits  1.0.0.0 to 127.255.255.255
B      10            16 bits     16 bits  128.0.0.0 to 191.255.255.255
C      110           24 bits     8 bits   192.0.0.0 to 223.255.255.255
D      1110          multicast addresses  224.0.0.0 to 239.255.255.255
E      1111          reserved for experiments
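Because the class is encoded in the leading bits of the first octet, classifying an address takes only a few mask tests. A minimal sketch, with `address_class` as a hypothetical helper name:

```python
import ipaddress

def address_class(dotted: str) -> str:
    """Return the historical class (A-E) from the leading bits of the first octet."""
    first = int(ipaddress.IPv4Address(dotted)) >> 24
    if first & 0x80 == 0x00:   # 0xxxxxxx
        return "A"
    if first & 0xC0 == 0x80:   # 10xxxxxx
        return "B"
    if first & 0xE0 == 0xC0:   # 110xxxxx
        return "C"
    if first & 0xF0 == 0xE0:   # 1110xxxx
        return "D"
    return "E"                 # 1111xxxx

for addr in ("10.0.0.1", "131.252.220.1", "200.23.16.0", "224.0.0.1"):
    print(addr, address_class(addr))
```

This constant-time test is what made pre-CIDR route lookup simple: the class alone told a router where the network part ended.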
Special IP Addresses: Loopback
127.0.0.1: localhost
The self-talk address
The "lo" interface via ifconfig
catron <~> 11:47AM % ifconfig
eno1  Link encap:Ethernet  HWaddr 98:90:96:d8:56:e7
      inet addr:10.218.103.22  Bcast:10.218.103.255  Mask:255.255.255.0
      inet6 addr: fe80::9a90:96ff:fed8:56e7/64 Scope:Link
      inet6 addr: 2610:10:20:1103::22/128 Scope:Global
      UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
      …
lo    Link encap:Local Loopback
      inet addr:127.0.0.1  Mask:255.0.0.0
      inet6 addr: ::1/128 Scope:Host
      UP LOOPBACK RUNNING  MTU:65536  Metric:1
      …
catron <~> 11:48AM %
Special IP Addresses: Private
Private addresses (not globally routable)
Class A: 10.0.0.0 - 10.255.255.255 (10.0.0.0/8 prefix)
Class B: 172.16.0.0 - 172.31.255.255 (172.16.0.0/12 prefix)
Class C: 192.168.0.0 - 192.168.255.255 (192.168.0.0/16 prefix)
Used to number internal IPv4 addresses on some PSU machines (see previous Particle lab machine catron)
Can I reach catron via IPv4 outside of PSU?
pucca <~> 12:06PM % nslookup catron.cs.pdx.edu
Server:   127.0.1.1
Address:  127.0.1.1#53
Non-authoritative answer:
Name:     catron.cs.pdx.edu
Address:  10.218.103.22
pucca <~> 11:43AM % ssh catron.cs.pdx.edu
ssh: connect to host catron.cs.pdx.edu port 22: Network is unreachable
[1] 10085 exit 255 ssh -Y catron.cs.pdx.edu
pucca <~> 12:06PM %
Special IP Addresses: Private
Must go through a machine that has a globally routable IP address
pucca <~> 11:44AM % ssh linuxlab.cs.pdx.edu
wuchang@linuxlab.cs.pdx.edu's password:
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-116-generic x86_64)
…
Last login: Wed Oct 18 12:48:13 2017 from 10.200.81.21
king <~> 11:44AM % ssh catron.cs.pdx.edu
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-116-generic x86_64)
…
Last login: Thu Apr 5 10:06:07 2018 from 2610:10:20:1130::1004
catron <~> 11:44AM %
Special IP Addresses: Private
All Google Cloud internal interfaces use them
Lab uses 192.168.*.*
Classful IP addressing problems
#1: Inefficient use of address space
Class A (rarely given out, sparse usage)
Class B = 64K hosts (sparse usage)
Very few LANs have close to 64K hosts
#2: Address space depletion
Class C addresses used heavily, little left to give out
#3: Explosion of routes
Increasing use of class C explodes # of routes
Total routes potentially > 2,113,664 networks and network routes!
IPv4 addressing problems (2012)
Solution: CIDR
CIDR: Classless Inter-Domain Routing
Arbitrarily aggregate and split up adjacent network addresses
Large blocks (Class A/B) split to increase usage (subnetting)
Small blocks (Class C) combined to reduce routes (supernetting)
Done throughout routing infrastructure
Variable-length network part: a single integer (the prefix length) marks the boundary between network and host parts
200.23.16.0/23 = 11001000 00010111 0001000 | 0 00000000
(network part: first 23 bits; host part: last 9 bits)
Split the following network into 4 equal subnetworks
131.252.0.0/22
Expand out address…
10000011 . 11111100 . 00000000 . 00000000
Q1: How many hosts are on this network?
Q2: How many hosts will be on each subnetwork?
Split into 4 parts using next 2 significant bits
10000011 . 11111100 . 00000000 . 00000000
10000011 . 11111100 . 00000001 . 00000000
10000011 . 11111100 . 00000010 . 00000000
10000011 . 11111100 . 00000011 . 00000000
Solution
131.252.0.0/24
131.252.1.0/24
131.252.2.0/24
131.252.3.0/24
Subnetting walkthrough
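The walkthrough can be checked with the standard `ipaddress` module, whose `subnets(prefixlen_diff=2)` call extends the prefix by two bits. A short sketch; note that `num_addresses` counts every address in the block, including the network and broadcast addresses:

```python
import ipaddress

net = ipaddress.ip_network("131.252.0.0/22")
subnets = list(net.subnets(prefixlen_diff=2))   # 2 extra prefix bits -> 4 subnets

print(net.num_addresses)                        # addresses in the /22 (2**10)
for sub in subnets:
    print(sub, sub.num_addresses)               # each /24 holds 2**8 addresses
```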
With your lab partner (or person sitting next to you), split the following network into 16 equal subnetworks
131.252.128.0/17
10000011 . 11111100 . 10000000 . 00000000
Subnetting problem
Combine the following class C networks into one larger network
131.252.0.0/24
131.252.1.0/24
Supernetting walkthrough
Answer:
10000011.11111100.00000000.*
10000011.11111100.00000001.*
10000011.11111100.0000000*.*
131.252.0.0/23
Can you combine the following class C networks into a larger /23?
131.252.1.0/24
131.252.2.0/24
No, they do not share the same address prefix! Ranges must be aligned properly to be supernetted.
Only (131.252.0.0/24 + 131.252.1.0/24) and (131.252.2.0/24 + 131.252.3.0/24) can be combined into a larger /23.
Supernetting walkthrough
131.252.1.0/24 = 10000011.11111100.00000001.*
131.252.2.0/24 = 10000011.11111100.00000010.*
(no common 23-bit prefix)
10000011.11111100.00000000.* and 10000011.11111100.00000001.* combine into 131.252.0.0/23
10000011.11111100.00000010.* and 10000011.11111100.00000011.* combine into 131.252.2.0/23
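The alignment rule can be observed with `ipaddress.collapse_addresses`, which merges adjacent networks only when they share a common shorter prefix. A minimal sketch; the wrapper name `supernet` is an assumption for the example:

```python
import ipaddress

def supernet(cidrs):
    """Merge the given networks wherever they form an aligned larger block."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Aligned pair: both share the 23-bit prefix, so they merge.
print(supernet(["131.252.0.0/24", "131.252.1.0/24"]))
# Misaligned pair: adjacent, but no common 23-bit prefix, so they stay separate.
print(supernet(["131.252.1.0/24", "131.252.2.0/24"]))
```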
With your lab partner (or person sitting next to you), combine the following class C networks into one larger network
131.252.0.0/24
131.252.1.0/24
131.252.2.0/24
131.252.3.0/24
131.252.4.0/24
131.252.5.0/24
131.252.6.0/24
131.252.7.0/24
Supernetting problem
200.23.16.0/24, 200.23.17.0/24
200.23.18.0/24, 200.23.19.0/24
200.23.20.0/24, 200.23.21.0/24
200.23.22.0/24, 200.23.23.0/24
ISP X given 16 class C networks (200.23.16.* to 200.23.31.*)
Can advertise a single CIDR route to ISP W (200.23.16.0/20)
Customers of ISP X:
Large company 200.23.16.0/21
Medium company 200.23.24.0/22 (200.23.24.0/24, 200.23.25.0/24, 200.23.26.0/24, 200.23.27.0/24)
Small company 200.23.28.0/23 (200.23.28.0/24, 200.23.29.0/24)
Tiny company 200.23.30.0/24
ISP W forwarding table:
Route            Interface
200.23.16.0/20   1
ISP X forwarding table:
Route            Interface
200.23.16/21     2
200.23.24/22     3
200.23.28/23     4
200.23.30/24     5
200.23.31/24     unused
CIDR route aggregation
200.23.30.0/24
CIDR and IP forwarding
CIDR disadvantage
Routing protocols must now carry prefix length with
destination network address
Makes route lookup algorithm more complex
Before CIDR
O(1) table lookup based on class (A,B,C)
After CIDR
One table containing many prefix lengths and overlapping ranges
Also, routes can overlap now.
Rule: when a destination IP address matches several routes, choose the one that is most specific
Why?
Consider multi-homing for tiny company going to ISP Y
200.23.16.0/20 is advertised through ISP X to ISP W, but tiny company would also advertise 200.23.30.0/24 to ISP Y, which advertises it to ISP W
200.23.30.0/24 is more specific than 200.23.16.0/20, so packets going to tiny company go through ISP Y
Problem: rogue injection of more specific routes you don't own (see Pakistan-YouTube routing outage)
[Figure: ISP X serves 200.23.16.0/21, 200.23.24.0/22, 200.23.28.0/23, and 200.23.30.0/24; the tiny company (200.23.30.0/24) also connects to ISP Y; both ISP X and ISP Y connect to ISP W]
ISP W forwarding table:
Route            Interface
200.23.16.0/20   1
200.23.30.0/24   2
Longest prefix matching problem
Route Prefix                     Link Interface
11001000 00010111 00010          0
11001000 00010111 00011000       1
11001000 00010111 00011          2
default                          3
Which interface would packets with the following destinations go out?
11001000 00010111 00010110 10100001
11001000 00010111 00011000 10101010
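The most-specific-route rule can be sketched as a linear scan that keeps the matching entry with the longest prefix. This is an illustration only: the binary prefixes from the table are written here in CIDR form, the names `TABLE`, `DEFAULT_INTERFACE`, and `lookup` are assumptions, and real routers use tries or hardware lookup rather than a scan.

```python
import ipaddress

# The table's binary prefixes expressed as CIDR blocks, with their link interfaces.
TABLE = [
    (ipaddress.ip_network("200.23.16.0/21"), 0),   # 11001000 00010111 00010
    (ipaddress.ip_network("200.23.24.0/24"), 1),   # 11001000 00010111 00011000
    (ipaddress.ip_network("200.23.24.0/21"), 2),   # 11001000 00010111 00011
]
DEFAULT_INTERFACE = 3

def lookup(destination: str) -> int:
    """Return the interface of the longest prefix that contains the destination."""
    addr = ipaddress.ip_address(destination)
    best = None
    for network, interface in TABLE:
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best[1] if best else DEFAULT_INTERFACE

# The two destinations from the problem, converted to dotted-quad form.
for dest in ("200.23.22.161", "200.23.24.170"):
    print(dest, "-> interface", lookup(dest))
```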
IP Address Problem #4 (1994)
Even with CIDR, address space running out
Network Address Translation (NAT)
Alternate solution to address space depletion problem
Sits between your network and the Internet
Dynamically rewrites source address and/or source transport layer port (NAPT) on connections to the Internet
"Statistically multiplexes" address/port usage across multiple machines
Replaces local, private source IP address/port with global IP/port
Makes it appear that all connections are coming from a single IP address
NAT with port translation
[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 and NAT router interface 10.0.0.4 on the local network (e.g., home network) 10.0.0.0/24; the router's external address 138.76.29.7 faces the rest of the Internet]
Datagrams with source or destination in this network have 10.0.0.0/24 address for source, destination (as usual)
All datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers
16-bit transport layer port-number field allows for 64k simultaneous connections with one global IP address
NAT advantages
Only a single IP address needed from ISP to network
multiple devices
Can change addresses of devices in local network
without notifying outside world
Can change ISP without changing addresses of devices
in local network
Devices inside local net not explicitly addressable, visible
by outside world (a security plus).
NAT example
[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 behind a NAT router with LAN address 10.0.0.4 and WAN address 138.76.29.7]
NAT translation table
WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……
1: host 10.0.0.1 sends datagram to 128.119.40.186:80
   S: 10.0.0.1, 3345  D: 128.119.40.186, 80
2: NAT router changes datagram source addr from 10.0.0.1:3345 to 138.76.29.7:5001, updates table
   S: 138.76.29.7, 5001  D: 128.119.40.186, 80
3: Reply arrives with dest. address 138.76.29.7:5001
   S: 128.119.40.186, 80  D: 138.76.29.7, 5001
4: NAT router changes datagram dest addr from 138.76.29.7:5001 to 10.0.0.1:3345
   S: 128.119.40.186, 80  D: 10.0.0.1, 3345
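The four steps map onto two dictionary lookups. A minimal sketch of a port-translating NAT table, assuming a hypothetical `Napt` class that hands out WAN ports sequentially starting at 5001 (real NATs choose ports differently and also track timeouts and protocol state):

```python
class Napt:
    """Toy NAT translation table keyed on (LAN address, LAN port)."""

    def __init__(self, public_ip: str, first_port: int = 5001):
        self.public_ip = public_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (lan_ip, lan_port) -> wan_port
        self.wan_to_lan = {}   # wan_port -> (lan_ip, lan_port)

    def outbound(self, src_ip: str, src_port: int):
        """Step 2: rewrite the source of an outgoing datagram, adding a table entry if needed."""
        key = (src_ip, src_port)
        if key not in self.lan_to_wan:
            self.lan_to_wan[key] = self.next_port
            self.wan_to_lan[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.lan_to_wan[key]

    def inbound(self, dst_port: int):
        """Step 4: map a reply's destination port back to the LAN host, or None if unknown."""
        return self.wan_to_lan.get(dst_port)

nat = Napt("138.76.29.7")
print(nat.outbound("10.0.0.1", 3345))   # translated source for the outgoing datagram
print(nat.inbound(5001))                # LAN destination for the reply
print(nat.inbound(6000))                # no mapping: unsolicited inbound traffic
```

The last call returns `None`, which is the root of the inbound-connection issue: without an existing mapping the router has no host to forward to.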
NAT issue #1: No inbound connection
Must be taken into account for P2P applications
Incoming connections
Client wants to connect to server at address 10.0.0.1
Server has private LAN address not reachable externally
Only externally visible address: 138.76.29.7
Solution 1: statically configure NAT to forward incoming connection requests at given port to server
e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000
Or use DMZ host
[Figure: external client trying to reach server 10.0.0.1 behind NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7)]
NAT issue #1: No inbound connection
Solution 2: Universal Plug and Play (UPnP), Internet Gateway Device (IGD) Protocol
Allows NATted host to:
learn public IP address (138.76.29.7)
enumerate existing port mappings
add/remove port mappings (with lease times)
i.e., automate static NAT port map configuration
[Figure: host 10.0.0.1 speaks IGD to the NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7)]
NAT issue #1: No inbound connection
Solution 3: relaying (used in Skype)
NATted server establishes connection to relay
External client connects to relay
Relay bridges packets between the two connections
Great only for surveillance
- 1. connection to relay initiated by NATted host
- 2. connection to relay initiated by client
- 3. relaying established
[Figure: host 10.0.0.1 behind NAT router (138.76.29.7) and an external client, both connected to the relay]
NAT issue #1: No inbound connection
Solution 4: STUN (initially in Skype)
Attempt to simultaneously connect via NAT router detection
Skype clients contact Skype relay with multiple connections
Relay determines port allocation algorithm for each router
Coordinates clients to use simultaneous outgoing connections to each other to establish call
[Figure: two hosts, each at 10.0.0.1 behind its own NAT router]
Not what the designers of the Internet had in mind…
Implicit assumption that network header is unchanged in
network
Key feature that allows one to deploy any application
without coordinating with network infrastructure
Breaks applications that assume network only touches
layer 3
New applications can not make the same assumption
Application protocols must never carry IP addresses
ftp's PORT command
To initiate file transfer, client sends its IP address and a port number for ftp server to connect to
With client behind a NAT, private address sent!
NAT breaks protocol by breaking network transparency
Details in extra slides
NAT issue #2: Loss of transparency
IPv6
Address shortage should instead be solved by IPv6
Expands address space without using NAT
Redesign protocol
What changes should be made in…
IP addressing
IP delivery semantics
IP quality of service
IP security
IP routing
IP fragmentation
IP error detection
IPv6 Changes
Addresses are 128 bits
Simplification
Removes checksum
Eliminates fragmentation
Parsing an IPv6 address
Specified as 8, 2-byte (4 hex digit) numbers
2610:10:20:220:45c7:8fb6:7430:bcb6
Note, leading 0s omitted for brevity
Double-colon notation
Can be used exactly once in an address to stand for a run of all-0 groups
Fills address with enough zero groups to create a 128-bit address
Example catron.cs.pdx.edu
2610:10:20:1103::22 = 2610:10:20:1103:0:0:0:22
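The double-colon expansion can be confirmed with the standard `ipaddress` module, which exposes both the fully expanded and the compressed form of an address. A short sketch using the catron address from above:

```python
import ipaddress

addr = ipaddress.IPv6Address("2610:10:20:1103::22")

print(addr.exploded)     # all 8 groups, each padded to 4 hex digits
print(addr.compressed)   # shortest form, using "::" once
# Both spellings denote the same 128-bit value.
print(addr == ipaddress.IPv6Address("2610:10:20:1103:0:0:0:22"))
```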
Example
catron <~> 7:03PM % ifconfig
eno1  Link encap:Ethernet  HWaddr 98:90:96:d8:56:e7
      inet addr:10.218.103.22  Bcast:10.218.103.255  Mask:255.255.255.0
      inet6 addr: fe80::9a90:96ff:fed8:56e7/64 Scope:Link
      inet6 addr: 2610:10:20:1103::22/128 Scope:Global
catron <~> 7:37PM % dig -t AAAA meson.cs.pdx.edu
…
;; ANSWER SECTION:
meson.cs.pdx.edu. 6901 IN AAAA 2610:10:20:1103::21
catron <~> 7:03PM % ping6 2610:10:20:1103::21
PING 2610:10:20:1103::21(2610:10:20:1103::21) 56 data bytes
64 bytes from 2610:10:20:1103::21: icmp_seq=1 ttl=64 time=0.328 ms
^C
--- 2610:10:20:1103::21 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.328/0.328/0.328/0.000 ms
Example
How can you ssh into Particle lab machines from
external locations?
Use their IPv6 addresses
ssh -6 catron.cs.pdx.edu
Changes
Multicast in IPv6 supported
Reserved IPv6 address space for multicast
FF00::/8
Explicit scopes
Link-local (for broadcast on LAN)
Site/Organization-local (for flooding of link states)
Global (not typically used)
Anycast through multiple interfaces using the same
unicast address (work-in-progress)
Transition From IPv4 To IPv6
Eventually, run dual stacks (PSU)
1-to-1 mapping of current IPv4 to IPv6 address space
(Penguin Linuxlab machines)
What happens when you run into a cloud of IPv4-only
routers?
Address translation (rare)
Tunneling
IPv6 carried as payload in an IPv4 datagram among IPv4 routers
Treats the entire IPv4 network as a single data-link!
i.e. IPv4 acts as the framing protocol of a "data-link" layer for IPv6
Builds a virtual IPv6 network link on top of an IPv4 network
Tunneling
Logical view: A (IPv6) and B (IPv6) connect through a tunnel to E (IPv6) and F (IPv6)
Physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6)
A-to-B: IPv6 datagram (Flow: X, Src: A, Dest: F, data)
B-to-E via C and D: IPv6 inside IPv4 (outer IPv4 header Src: B, Dest: E; inner IPv6 datagram unchanged)
E-to-F: IPv6 datagram (Flow: X, Src: A, Dest: F, data)
Turns IPv4 network into virtual link
Network virtualization
Virtualization of networks
Virtualization of resources: a powerful abstraction in CS
Virtual memory addresses
Virtual machines (IBM VM OS from the 1960s/70s)
Virtual network interfaces
Virtual network interfaces
mashimaro <~> 2:15PM % sudo ifconfig eth0:1 up 131.252.220.64 netmask 255.255.255.0
mashimaro <~> 2:16PM % sudo ifconfig eth0:2 up 131.252.220.65 netmask 255.255.255.0
mashimaro <~> 2:16PM % ifconfig -a
eth0    Link encap:Ethernet  HWaddr 34:17:eb:a5:23:f7
        inet addr:131.252.220.66  Bcast:131.252.220.255  Mask:255.255.255.0
eth0:1  Link encap:Ethernet  HWaddr 34:17:eb:a5:23:f7
        inet addr:131.252.220.64  Bcast:131.252.220.255  Mask:255.255.255.0
eth0:2  Link encap:Ethernet  HWaddr 34:17:eb:a5:23:f7
        inet addr:131.252.220.65  Bcast:131.252.220.255  Mask:255.255.255.0
"A Protocol for Packet Network Intercommunication", V. Cerf, R. Kahn, IEEE Transactions on Communications, May, 1974, pp. 637-648.
The Internet: the first virtual network
1974: multiple unconnected nets
ARPAnet
packet satellite network (Aloha)
packet radio network
… differing in:
addressing conventions
packet formats
error recovery
routing
[Figure: ARPAnet and a satellite net as separate networks]
The Internet: virtualizing networks
Internetwork layer (IP) creates a virtual network that appears as a single uniform entity (despite underlying heterogeneity)
Gateway embeds locally formatted packets into internetwork packets
Routes them (at internetwork level) to next gateway
[Figure: ARPAnet and satellite net joined by a gateway]
Cerf & Kahn’s Internetwork Architecture
Two layers of addressing: internetwork and local network
New layer (IP) makes everything homogeneous at
internetwork layer
Underlying local network technology now invisible at the
internetwork layer
cable, satellite, 56K telephone modem, ATM, MPLS
Just another link layer technology to IP!
Virtualizing on top of the Internet
Virtual LAN via L2 tunnel (one form of a Virtual Private Network or VPN)
Emulate a LAN/network over the Internet
Place branch office on same network as corporate
Frames tunneled over the Internet
[Figure: hosts A and B on separate LANs joined by VPN endpoints V1 and V2]
- 1. LAN frame to B
- 2. Encrypted and encapsulated in IP packet from V1 to V2
- 3. Decrypted and decapsulated to get original frame
- 4. LAN frame to B
- 5. A now appears to be on same LAN as B (responses treated similarly)
Example: Virtual LAN via L2 tunnel
Host can also set up connection to remote VPN server
Consider home network with client at 192.168.0.5
Work network at 131.252.220.0/24
VPN server at 131.252.220.1
Authenticates remote client via username/password
Assigns remote client an IP address on LAN (131.252.220.55)
Responds to ARPs for 131.252.220.55 on behalf of client
Decapsulates and encapsulates packets to/from client
File server at 131.252.220.66 that only allows access from machines on the same network
Outer packet over L2 (Home): IP Src = 192.168.0.5, IP Dst = 131.252.220.1
Inner packet inside the tunnel: IP Src = 131.252.220.55, IP Dst = 131.252.220.66
VPN server terminates tunnel, sends frame onto network over L2 (Work): IP Src = 131.252.220.55, IP Dst = 131.252.220.66
Other options for building VPNs
Encapsulating and tunneling packets via
Layer 2 (previous example)
PPTP (Point-to-point Tunneling Protocol)
L2F (Layer 2 Forwarding)
L2TP (Layer 2 Tunneling Protocol)
Layer 3
Generic Routing Encapsulation (GRE) (IP in IP)
IPsec tunnels (Encrypted IP in IP)
Encrypt at a layer below network layer
Example: Virtual LAN via IP in IP
IP in IP using Generic Routing Encapsulation (more
common)
Example
Virtual Private Cloud
Take network of resources from cloud provider and bring it onto local network
Virtualized link over public IP network between Customer and AWS
Software-Defined Networks (SDN)
Problems with Internet routing
Distributed routing algorithms hard to make predictable
Stability poor (route-flapping common) Route convergence slow
Expensive to manage at scale
Routers, switches, firewalls, NAT, load balancers with
disparate interfaces
Cisco, Juniper with whole certification programs
Complex, custom control software (not interoperable) Human-intensive task for managing complexity
Problems with Internet routing
Opacity of operation
Inability to reason about behavior of protocols and
algorithms
Inflexible
Inability to support multiple routing policies other than hop
count
Control-plane (routing) and data-plane (forwarding) tightly
coupled
Proprietary (pre-2005)
At the mercy of a small number of vendor-supported
features and proprietary platforms (Cisco, Juniper)
Addressing issues
Active networks (1990s)
Programmable packet handling (e.g. drop, flood, forward, modify header, send to slow path)
Separation of control plane from data plane in networks (2000s)
Control plane => traffic policy
Data plane => forward traffic based on the policy the control plane makes
Standardized network device OS (2000s)
OpenFlow (2008)
Open networking stack for constructing routers that are highly programmable
OpenFlow API as middleware layer to standardize access to network devices (data plane)
Form the basis for "software-defined networks" (SDNs)
Use commodity hardware with standard programmable interfaces to build networks/routers
Key for enabling cloud computing
SDNs (2009)
Standard uniform interface for network device
programmability (e.g. "IP", but for router configuration)
Alleviate difficulty in debugging configurations
Enables network device orchestration
Separation of control-plane and data-plane
Central controller performs scheduling and route configuration
then pushes into the network
Allows single software control program to control all data-plane
elements in the network
Flexible routing policies
Replace dynamic routing based on hop count with other metrics
to allow for better predictability and control over routes
Support more than destination IP-based routes (e.g. base
decisions on source IP or TCP/UDP ports)
Programmable handling of packets
Support multiple actions (e.g. drop, flood, forward, modify
header, send to controller)
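The match-then-act model behind programmable packet handling can be sketched as a priority-ordered rule table. This is an illustration of the idea only, not the OpenFlow API: the `Rule` and `FlowTable` names, the packet field names, and the action strings are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    priority: int   # higher priority rules are checked first
    match: dict     # header fields that must all equal the packet's values
    action: str     # e.g. "drop", "flood", "forward:2"

class FlowTable:
    """Toy data-plane table; rules are installed by a (hypothetical) controller."""

    def __init__(self):
        self.rules = []

    def add(self, rule: Rule) -> None:
        self.rules.append(rule)
        self.rules.sort(key=lambda r: -r.priority)

    def handle(self, packet: dict) -> str:
        for rule in self.rules:
            if all(packet.get(k) == v for k, v in rule.match.items()):
                return rule.action
        return "send_to_controller"   # table miss: ask the control plane

table = FlowTable()
table.add(Rule(priority=1, match={"dst_ip": "10.0.0.2"}, action="forward:2"))
table.add(Rule(priority=10, match={"dst_port": 22}, action="drop"))

print(table.handle({"dst_ip": "10.0.0.2", "dst_port": 80}))
print(table.handle({"dst_ip": "10.0.0.2", "dst_port": 22}))
print(table.handle({"dst_ip": "1.1.1.1", "dst_port": 80}))
```

Matching on a TCP/UDP port as well as the destination IP shows how rules can go beyond destination-based routing.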
SDN applications
Traffic engineering
Control the paths used to deliver traffic
Shift traffic during the day via a centralized schedule to maximize
resource use
Allows one to ease over-provisioning networks since one can control
load tightly
Links can be run close to 100%!
Configure alternative routes during planned changes
Perform multiple path routing at high load
Send bulk transit traffic to alternate slower paths compared to
customer traffic
Send video traffic to one peer, non-video to another peer for transit
based on delay and price
SDN applications
Virtualization
Custom topologies (virtual networks) managed
programmatically
Example: Virtual LANs
CAT switches throughout college supporting dozens of virtual LANs
Can turn on any port and assign it to an emulated LAN
e.g. ports in FAB 145 and FAB 120-14 on same 220 VLAN
Often done manually
Load balancing
Ability to support anycasting and global scale HTTP load
balancing
DoS evasion
Programmatically drop attack traffic
SDN applications
Cloud computing
Multiple tenants sharing same underlying network and
interfaces but on separate virtual topologies
Allocate different bandwidth slices to virtual topologies
based on service level
Allows machines in disparate locations to be on same "virtual" network via a click rather than requiring someone to run around configuring it
Deployment on Google Cloud
Late 2000s
Google building out large network
Needs
Network-wide visibility (difficult to glean from proprietary devices)
Centralized control over data plane
Has knowledge of global demand
Wants to avoid unpredictability and convergence delays of routing protocols
Issues
Unsustainable cost in capital and operating expenses for what it wants to do
Inability to get support from Cisco/Juniper for features and control required
Decides to build its own router on commodity hardware using OpenFlow
Deployment on Google Cloud
B4
Designed in 2008, deployed in 2010
http://cseweb.ucsd.edu/~vahdat/papers/b4-sigcomm13.pdf
Reduced cost via use of commodity hardware
Control of platform via open-source router software
Designed for homogeneity in Google's data centers (purpose-built infrastructure to achieve simplicity)
Big red button to fall back to shortest-path routing
Now called Andromeda
https://cloudplatform.googleblog.com/2017/11/Andromeda-2-1-reduces-GCPs-intra-zone-latency-by-40-percent.html?m=1
Used everywhere in GCP to programmatically reconfigure
networks for users
SDN issues
Logically centralized route control and management
Breaks fate-sharing
Control and data planes do not share same fate
Independent failures of the brain and body result in bizarre failure patterns that are hard to recover from
Breaks distributed control philosophy of Internet
Centralized SDN controller that may not be fault-tolerant (needs redundancy)
What if network partition happens?
Hard problem in distributed systems design
Attacks on the "brain", state-manipulation attacks
Compromised hosts, switches, and routers sending bogus join and
leave events
https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/xu-lei
Identity hijacking attacks (ARP, DHCP, IP, DNS, TCP, and BGP
spoofing)
https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/jero
SDN and IT
Shift to hiring programmers to write programs to
control networks of commodity routers/switches
People with mastery of CS concepts needed versus
network operations engineers and manual configuration
Moves away from proprietary network hardware/software
and certifications
ICMP
ICMP: Internet Control Message Protocol
Protocol for passing
control messages
error reporting:
unreachable host, network, port, protocol
echo request/reply (used
by ping)
http://www.rfc-editor.org/rfc/rfc792.txt
Type  Code  Description
0     0     echo reply (ping)
3     0     dest. network unreachable
3     1     dest host unreachable
3     2     dest protocol unreachable
3     3     dest port unreachable
3     6     dest network unknown
3     7     dest host unknown
4     0     source quench (congestion control - not used)
8     0     echo request (ping)
9     0     route advertisement
10    0     router discovery
11    0     TTL expired
12    0     bad IP header
Used to implement traceroute
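As a rough sketch of how software might use the type/code values listed on this slide, the lookup below maps a (type, code) pair to its description. The dictionary covers only the entries shown here, and the helper names are made up for the example. The header layout it relies on (1-byte type, 1-byte code, 2-byte checksum) is the one defined in RFC 792.

```python
import struct

# Only the (type, code) pairs listed on the slide; not a complete registry.
ICMP_MESSAGES = {
    (0, 0): "echo reply (ping)",
    (3, 0): "dest. network unreachable",
    (3, 1): "dest host unreachable",
    (3, 2): "dest protocol unreachable",
    (3, 3): "dest port unreachable",
    (3, 6): "dest network unknown",
    (3, 7): "dest host unknown",
    (4, 0): "source quench (congestion control - not used)",
    (8, 0): "echo request (ping)",
    (9, 0): "route advertisement",
    (10, 0): "router discovery",
    (11, 0): "TTL expired",
    (12, 0): "bad IP header",
}


def parse_icmp_header(data: bytes):
    """Return (type, code, checksum) from the first 4 bytes of an ICMP message."""
    return struct.unpack("!BBH", data[:4])


def describe_icmp(icmp_type: int, code: int) -> str:
    """Look up a human-readable description (hypothetical helper)."""
    return ICMP_MESSAGES.get((icmp_type, code), "unknown type/code")
```

For example, a message whose first two bytes are 11 and 0 is the "TTL expired" report that traceroute depends on.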
What do “real” Internet delay & loss look like?
traceroute
Measures delay from source to router along end-end Internet path towards destination.
(Figure: 3 probes sent to each router along the path)
traceroute algorithm
Source sends series of UDP/IP packets to dest
First has TTL=1
Second has TTL=2, etc.
When nth datagram arrives at nth router:
Router discards datagram
And sends to source an ICMP message (type 11, code 0)
Message includes name of router and IP address
When ICMP message arrives, source calculates RTT
Traceroute does this 3 times per TTL value
Stopping criterion
UDP segment eventually arrives at destination host
Destination returns ICMP “port unreachable” packet (type 3, code 3)
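The algorithm can be sketched as a small simulation. Real traceroute needs raw sockets and usually elevated privileges, so this sketch replaces the network with a list of hops; the function names and the path representation are assumptions for illustration only.

```python
TTL_EXPIRED = (11, 0)       # ICMP type 11, code 0
PORT_UNREACHABLE = (3, 3)   # ICMP type 3, code 3


def send_probe(path, ttl):
    """Simulate one UDP probe.

    `path` lists the routers in order and ends with the destination host.
    Returns (responder, icmp_type_and_code).
    """
    for router in path[:-1]:
        ttl -= 1                      # each router decrements the TTL
        if ttl == 0:
            return router, TTL_EXPIRED
    # The probe reached the destination, where no one listens on the port.
    return path[-1], PORT_UNREACHABLE


def traceroute(path, max_ttl=30, probes_per_ttl=3):
    """Return [(ttl, responder), ...], stopping at the destination."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        # Three probes per TTL value; a real tool would time each one.
        replies = [send_probe(path, ttl) for _ in range(probes_per_ttl)]
        responder, icmp = replies[0]
        hops.append((ttl, responder))
        if icmp == PORT_UNREACHABLE:  # stopping criterion
            break
    return hops
```

With a path of two routers and a destination, the first two TTL values each draw a "TTL expired" reply from successive routers, and the third draws "port unreachable" from the destination, which ends the loop.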
Try it
Some routers are labeled with the airport code of the city or region
in which they are located
Note: Northwest Access Exchange peering points
198.32.195.0/24 (nwax) https://www.nwax.net/Members
Look up the IP addresses of oregonlive.com
Use either nslookup or dig (address in ANSWER section)
Perform a traceroute <IP-address> to both to
discover where the site is currently hosted.
Labs
Network Lab #1 (Netsim)
Create an account and complete all levels of Netsim
https://netsim.erinn.io
Show screenshot of completed list of levels
For Level #5
Show packet before it hits modem
Show packet after it leaves modem
For the ping and traceroute levels, ensure ICMP is
capitalized when specifying the proto field
Network Lab #2 (IPv6)
ping/ping6
Find your favorite machine in the Particle lab
https://cat.pdx.edu/labstatus/labs/cslinlabb/
Find its IPv4 and IPv6 address by ssh'ing into it and performing
an ifconfig
From your local Penguin machine, use both ping and ping6 to
ping its IPv4 and IPv6 addresses
traceroute/traceroute6
Perform a traceroute to 1.1.1.1
What is the name that traceroute returns for this IP address?
With that name, perform the following and examine the "ANSWER" section to find the IPv6 addresses associated with the name
dig -t AAAA <name>
Then, perform a traceroute6 to one of its IPv6 addresses
Does the traceroute end up at the same place?
Perform a traceroute to up.edu and facebook.com
Do the packets stay in Oregon?
Network Lab #3 (nmap)
This lab will give you experience with Google's Compute Engine
and its offerings in Cloud Launcher, as well as with nmap, a standard tool for performing network security audits
Launch an Ubuntu 16.04 instance on Compute Engine using the
default machine type (3.75 GB of memory)
Then
sudo apt-get update
sudo apt-get install nmap
Go to Google Cloud Launcher
Filter on Virtual Machines
Then select Blog & CMS as the Category
Bring up 3 solutions with the following settings
Zone: us-west1-b
Machine type: micro
Deselect “Allow HTTPS traffic”
Show the landing page for each VM to ensure it has been deployed properly
Note the “Internal IP address” of each instance
Run nmap on the internal subnet the instances have been
placed on
nmap w.x.y.z/24
Show output for the scan
Shut down the instances
Network Lab #4: Subnets in the cloud
Link to lab at end of walkthrough
In Cloud Shell, set region/zone to local Google datacenter
Note: you will be creating sub-networks in other zones and regions, so for this lab only, use the lab's zones/regions verbatim
Skip Step 5 (Legacy), go to Step 6 to list networks and create instances
$ gcloud config set compute/zone us-west1-b
$ gcloud config set compute/region us-west1
$ gcloud compute networks list
$ gcloud compute networks subnets list
$ gcloud compute instances create instance-1 --zone us-west1-b
$ gcloud compute instances list
Network Lab #4
Create a custom network spanning your regions, then
create subnetworks within it
$ gcloud compute networks create custom-network1 --subnet-mode custom
$ gcloud compute networks subnets create subnet-us-central-192 \
    --network custom-network1 \
    --region us-central1 \
    --range 192.168.1.0/24
$ gcloud compute networks subnets create subnet-europe-west-192 \
    --network custom-network1 \
    --region europe-west1 \
    --range 192.168.5.0/24
$ gcloud compute networks subnets list
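Before creating subnetworks it can help to sanity-check the address ranges. The sketch below uses Python's standard ipaddress module on the two /24 ranges from this lab; the helper name and the sample host address are illustrative assumptions, not part of the lab.

```python
import ipaddress

# The two ranges used for the subnetworks in this lab.
US_CENTRAL = ipaddress.ip_network("192.168.1.0/24")
EUROPE_WEST = ipaddress.ip_network("192.168.5.0/24")


def subnet_for(address, subnets):
    """Return the first subnet containing `address`, or None (hypothetical helper)."""
    ip = ipaddress.ip_address(address)
    for net in subnets:
        if ip in net:
            return net
    return None


# Subnets inside one custom network should not overlap.
ranges_overlap = US_CENTRAL.overlaps(EUROPE_WEST)

# A /24 leaves 8 host bits, so it spans 2**8 addresses.
addresses_per_subnet = US_CENTRAL.num_addresses
```

Here subnet_for("192.168.5.17", [US_CENTRAL, EUROPE_WEST]) picks the European range, since only that /24 contains the address.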
Network Lab #4
Create instances on each of the subnetworks
Note: Machines in different subnetworks are not able to
communicate by default for security purposes
Network filtering rules can be defined to explicitly enable this
A lab will cover this later…
$ gcloud compute instances create instance-3 \
    --zone us-central1-a \
    --subnet subnet-us-central-192
$ gcloud compute instances create instance-4 \
    --zone europe-west1-d \
    --subnet subnet-europe-west-192
Network Lab #4
https://codelabs.developers.google.com/codelabs/cloud-subnetworks
Extra
NAT
Implementation: NAT router must:
outgoing datagrams: replace (source IP address, port #) of every outgoing datagram to (NAT IP address, new port #)
. . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr.
remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table
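The three duties above (rewrite outgoing, remember the pair, rewrite incoming) can be sketched with two dictionaries. This is a simplified illustration: the class name, the sequential port allocation starting at 2000, and the absence of timeouts or per-protocol tables are all assumptions, not how any particular NAT device works.

```python
class NatRouter:
    """Toy NAT translation table (illustrative only)."""

    def __init__(self, nat_ip, first_port=2000):
        self.nat_ip = nat_ip
        self.next_port = first_port
        self.out_map = {}  # (source IP, source port) -> new port
        self.in_map = {}   # new port -> (source IP, source port)

    def outgoing(self, src_ip, src_port):
        """Translate an outgoing datagram's source to (NAT IP, new port)."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            # First packet of this flow: remember the translation pair.
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return self.nat_ip, self.out_map[key]

    def incoming(self, dst_port):
        """Map an incoming datagram's destination port back to the
        original (source IP, source port), or None if no entry exists."""
        return self.in_map.get(dst_port)
```

An incoming datagram for a port with no table entry has nowhere to go, which is why hosts behind a NAT cannot be reached by unsolicited connections.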
Normal FTP mode
Server has port 20, 21 reserved
Client initiates control connection to port 21 on server
Client allocates port X for data connection
Client passes its IP address and the data connection port (X) in a PORT command to server
Server parses PORT command and initiates connection from its own port 20 to the client on port X
What if client is behind a NAT device?
ftp, NAT and PORT command
Problem
ftp server connects to a private IP address!
Private hosts: 192.168.0.1, 192.168.0.2
Packet #1: SrcIP=192.168.0.1 SrcPort=1312 DstIP=131.252.220.66 DstPort=21
PORT command: “Connect to me at IP=192.168.0.1 Port=20”
NAPT translator: ExternalIP=129.95.50.3
Packet #1 after NAPT: SrcIP=129.95.50.3 SrcPort=2000 DstIP=131.252.220.66 DstPort=21
PORT command: “Connect to me at IP=192.168.0.1 Port=20”
Solution #1
Modify packets at NAT
NAT must capture outgoing connections destined for port 21
Looks for PORT command and translates address/port payload
http://www.practicallynetworked.com/support/linksys_ftp_port.htm
What if NAT doesn’t parse PORT command correctly?
What if ftp server is running on a different port than 21?
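To make the payload rewrite concrete, here is a minimal sketch of what such a translation involves. It relies on the PORT argument format from RFC 959, six comma-separated numbers h1,h2,h3,h4,p1,p2 with port = p1*256 + p2. The function names are hypothetical, and a real NAT would also have to handle commands split across TCP segments and adjust sequence numbers when the payload length changes.

```python
def format_port_command(ip, port):
    """Build a PORT command line for the given IPv4 address and port."""
    fields = ip.split(".") + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields)


def parse_port_command(line):
    """Return (ip, port) from a line such as 'PORT 192,168,0,1,0,20'."""
    verb, _, arg = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    nums = [int(field) for field in arg.split(",")]
    if len(nums) != 6 or not all(0 <= n <= 255 for n in nums):
        raise ValueError("malformed PORT argument")
    ip = ".".join(str(n) for n in nums[:4])
    return ip, nums[4] * 256 + nums[5]


def rewrite_port_command(line, nat_ip, nat_port):
    """Replace the private address/port in a PORT command with the
    NAT's external address and the port it allocated."""
    parse_port_command(line)  # raises ValueError if the line is malformed
    return format_port_command(nat_ip, nat_port)
```

Using the addresses from these slides, "PORT 192,168,0,1,0,20" becomes "PORT 129,95,50,3,7,209", since 7*256 + 209 = 2001. The rewritten line is longer than the original, which hints at why rewriting application data inside the network is fragile.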
ftp, NAT and PORT command
Need to rewrite points to a bigger problem!
Loss of network transparency
Network must modify application data in order for application to run correctly!
Private hosts: 192.168.0.1, 192.168.0.2
Packet #1: SrcIP=192.168.0.1 SrcPort=1312 DstIP=131.252.220.66 DstPort=21
PORT command: “Connect to me at IP=192.168.0.1 Port=20”
NAPT translator: ExternalIP=129.95.50.3
Packet #1 after NAPT: SrcIP=129.95.50.3 SrcPort=2000 DstIP=131.252.220.66 DstPort=21
PORT command: “Connect to me at IP=129.95.50.3 Port=2001”
Solution #2
Passive (PASV) mode
Client initiates control connection to port 21 on server
Client enables “Passive” mode by sending the PASV command
Server replies with the IP address and port to use for the subsequent data connection (a port chosen by the server)
Client initiates data connection by connecting to specified port on server
Most web browsers do PASV-mode ftp
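In passive mode the client has to extract the data-connection address from the server's reply. Per RFC 959 that reply has code 227 and carries six comma-separated numbers in parentheses, with port = p1*256 + p2. The sketch below parses such a reply; the function name and the sample port value are assumptions for illustration.

```python
import re

# Six decimal fields in parentheses: h1,h2,h3,h4,p1,p2
_PASV_FIELDS = re.compile(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)")


def parse_pasv_reply(reply):
    """Return (ip, port) from a reply such as
    '227 Entering Passive Mode (131,252,220,66,195,80)'."""
    match = _PASV_FIELDS.search(reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a 227 passive-mode reply")
    nums = [int(group) for group in match.groups()]
    ip = ".".join(str(n) for n in nums[:4])
    return ip, nums[4] * 256 + nums[5]
```

Because the client opens both connections itself, a NAT in front of the client sees only ordinary outgoing traffic and needs no payload rewriting.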
ftp, NAT, and PORT command
PASV mode transfers
Private hosts: 192.168.0.1, 192.168.0.2
NAPT translator: ExternalIP=129.95.50.3
After PASV command: SrcIP=131.252.220.66 SrcPort=21 DstIP=129.95.50.3 DstPort=2000
PORT command: “Connect to me at IP=131.252.220.66 Port=20”
ftp, NAT, and PORT command
Solution #2
What if server is behind a NAT device?
See client issues
What if both client and server are behind NAT devices?
Problem
Similar to P2P transfers and Skype
See IETF STUN WG