SLIDE 1

RouteBricks: Exploiting Parallelism To Scale Software Routers

Mihai Dobrescu & Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, Sylvia Ratnasamy

EPFL, Intel Labs Berkeley, Lancaster University

Katerina Argyraki, SOSP, Oct. 12, 2009

SLIDE 2

Building routers

  • Fast
  • Programmable
    » custom statistics
    » filtering
    » packet transformation
    » …

SLIDE 3

Why programmable routers

  • New ISP services
    » intrusion detection, application acceleration
  • Simpler network monitoring
    » measure link latency, track down traffic
  • New protocols
    » IP traceback, Trajectory Sampling, …

Enable flexible, extensible networks

SLIDE 4

Today: fast or programmable

  • Fast “hardware” routers
    » throughput: Tbps
    » no programmability
  • Programmable “software” routers
    » processing by general-purpose CPUs
    » throughput < 10 Gbps

SLIDE 5

RouteBricks

  • A router out of off-the-shelf PCs
    » familiar programming environment
    » large-volume manufacturing
  • Can we build a Tbps router out of PCs?

SLIDE 6

Router = packet processing + switching

  • N: number of external router ports
  • R: external line rate

[Figure: a router with N external ports, each at line rate R]

SLIDE 7

A hardware router

  • Processing at rate ~R per linecard

[Figure: N linecards, each handling external traffic at rate R]

SLIDE 8

A hardware router

  • Processing at rate ~R per linecard
  • Switching at rate N x R by the switch fabric

[Figure: linecards interconnected by a switch fabric]

SLIDE 9

RouteBricks

  • Processing at rate ~R per server
  • Switching at rate ~R per server

[Figure: N servers connected by a commodity interconnect]

SLIDE 10

RouteBricks

Per-server processing rate: c x R, for a small constant c (quantified on the following slides)

[Figure: N servers connected by a commodity interconnect]

SLIDE 11

Outline

  • Interconnect
  • Server optimizations
  • Performance
  • Conclusions



SLIDE 13

Requirements

  • Internal link rates < R
  • Per-server processing rate: c x R
  • Per-server fanout: constant

[Figure: N servers connected by a commodity interconnect]

SLIDE 14

A naive solution

[Figure: full mesh, with a dedicated capacity-R link between every pair of servers]

SLIDE 15

A naive solution

[Figure: full mesh over N servers]

  • N external links of capacity R
  • N² internal links of capacity R (worst case: all of one port's traffic heads to a single neighbor; see the sketch below)
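
To make the cost concrete, here is a minimal sketch (plain C, not from the talk; the values of N and R are illustrative) that counts what the naive full mesh needs:

    #include <stdio.h>

    /* Naive full mesh: every server keeps a dedicated link of
     * capacity R to every other server, because in the worst case
     * all of one port's traffic heads to a single neighbor. */
    int main(void) {
        int N = 32;        /* external ports, one server each (illustrative) */
        double R = 10.0;   /* external line rate in Gbps (illustrative) */

        long links = (long)N * (N - 1);   /* ~N^2 directed internal links */
        printf("external: %d links x %.0f Gbps\n", N, R);
        printf("internal: %ld links x %.0f Gbps = %.0f Gbps aggregate\n",
               links, R, links * R);
        return 0;
    }

For N = 32 and R = 10 Gbps this is already ~9.9 Tbps of internal capacity to carry 320 Gbps of external traffic, which is why the naive mesh does not scale.
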
SLIDE 16

Valiant load balancing

[Figure: each input server splits its rate-R traffic into N flows of R/N]

SLIDE 17

Valiant load balancing

  • N external links of capacity R
  • N² internal links of capacity 2R/N
    » each packet crosses the mesh twice, and each phase adds R/N per link: R/N + R/N = 2R/N

[Figure: two-phase routing over the full mesh]

SLIDE 18

Valiant load balancing

[Figure: two-phase routing over the full mesh]

  • Per-server processing rate: 3R (receive R, relay up to R, deliver up to R; see the sketch below)
  • Uniform traffic: 2R (the relay phase can be skipped)
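
The arithmetic behind these rates, as a minimal sketch (plain C; N and R are again illustrative, not from the talk):

    #include <stdio.h>

    /* Valiant load balancing: traffic entering at rate R is first
     * spread uniformly over all N servers (R/N per link), then
     * forwarded to the output server (another R/N per link). Each
     * server processes its own input (R), up to R as a relay, and
     * up to R destined to its own external port. */
    int main(void) {
        int N = 32;        /* servers / external ports (illustrative) */
        double R = 10.0;   /* external line rate in Gbps (illustrative) */

        printf("internal link capacity: 2R/N = %.3f Gbps\n", 2.0 * R / N);
        printf("per-server processing, worst case: 3R = %.0f Gbps\n", 3.0 * R);
        printf("per-server processing, uniform traffic: 2R = %.0f Gbps\n", 2.0 * R);
        return 0;
    }

The trade is clear: internal links shrink from capacity R to 2R/N, at the price of each server processing up to 3R instead of 2R.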


SLIDE 19

Per-server fanout?

[Figure: in a full mesh, each server needs N-1 internal links]

SLIDE 20

Per-server fanout?

  • Increase server capacity


SLIDE 22

Per-server fanout?

  • Increase server capacity
  • Add intermediate nodes
    » k-degree n-stage butterfly

SLIDE 23

Our solution: combination

  • Assign max external ports per server
  • Full mesh, if possible
  • Extra servers, otherwise

SLIDE 24

Example

  • Assuming current servers
    » 5 NICs, 2 x 10G ports or 8 x 1G ports
    » 1 external port per server
  • N = 32 ports: full mesh
    » 32 servers
  • N = 1024 ports: 16-ary 4-fly (sized in the sketch below)
    » 2 extra servers per port
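
One way to make the 16-ary 4-fly numbers concrete is the sketch below. It is a hedged reading, assuming the port servers themselves act as the first and last stage of the fly, so that after n stages of fanout k a packet can reach k^(n-1) ports and the intermediate stages cost n - 2 extra servers per external port; the paper's exact construction may differ in detail.

    #include <stdio.h>

    /* Hedged sketch: size a butterfly for N = 1024 external ports
     * with per-server fanout k = 16, assuming port servers double
     * as the first and last stage. Reachability after n stages is
     * k^(n-1); the n - 2 intermediate stages each add one extra
     * server per external port. */
    int main(void) {
        long n_ports = 1024;   /* external ports, from the slide */
        int k = 16;            /* per-server fanout, from the slide */

        int n = 2;             /* n = 2 would be a direct full mesh */
        long reach = k;        /* ports reachable after n stages */
        while (reach < n_ports) {
            reach *= k;
            n++;
        }
        printf("%d-ary %d-fly: reaches %ld >= %ld ports, "
               "%d extra servers per port\n",
               k, n, reach, n_ports, n - 2);
        return 0;
    }

Under these assumptions the loop settles on 4 stages (16^3 = 4096 >= 1024), reproducing the slide's 2 extra servers per port.
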


SLIDE 25

Recap

Valiant load balancing + full mesh (or k-ary n-fly)

Per-server processing rate: 2R – 3R

SLIDE 26

Outline

  • Interconnect
  • Server optimizations
  • Performance
  • Conclusions


SLIDE 27

Setup: NUMA architecture

» Nehalem architecture, QuickPath interconnect
» CPUs: 2 x [2.8 GHz, 4 cores, 8 MB L3 cache]
» NICs: 2 x Intel XFSR 2x10Gbps
» kernel-mode Click
» workload: min-size packets

[Figure: two CPU sockets, each with local memory, connected through an I/O hub to the NIC ports]

SLIDE 28

Single-server performance

  • First try: 1.3 Gbps
SLIDE 29

Problem #1: book-keeping

  • Managing packet descriptors
    » moving between NIC and memory
    » updating descriptor rings
  • Solution: batch packet operations (sketched below)
    » NIC batches multiple packet descriptors
    » CPU polls for multiple packets
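
A minimal sketch of what batched operation looks like (illustrative C with stand-in driver functions; nic_rx_batch and nic_tx_batch are hypothetical, not the real modified driver API):

    #include <stdio.h>

    #define BATCH 32                 /* packets per poll (illustrative) */

    struct pkt { int id; };

    /* Stand-in for the receive path: hand the CPU up to 'max'
     * packets in one call, so descriptor-ring updates are paid
     * once per batch instead of once per packet. */
    static int nic_rx_batch(struct pkt *out[], int max) {
        static struct pkt ring[BATCH];
        for (int i = 0; i < max; i++) { ring[i].id = i; out[i] = &ring[i]; }
        return max;
    }

    /* Stand-in for the transmit path: queue a whole batch of
     * descriptors to the NIC in one operation. */
    static void nic_tx_batch(struct pkt *in[], int n) {
        printf("transmitted %d packets in one batch (first id %d)\n",
               n, in[0]->id);
    }

    int main(void) {
        struct pkt *batch[BATCH];
        int n = nic_rx_batch(batch, BATCH);   /* one poll, many packets */
        /* ... per-packet processing would run here ... */
        nic_tx_batch(batch, n);               /* one update, many descriptors */
        return 0;
    }

The point is simply that the per-packet book-keeping cost is amortized over BATCH packets.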


SLIDE 30

Single-server performance

  • First try: 1.3 Gbps
  • With batching: 3 Gbps
SLIDE 31

Problem #2: queue access

[Figure: multiple cores contending for the ports' shared queues]

SLIDE 32

Problem #2: queue access

  • Rule #1: 1 core per port
SLIDE 36

Problem #2: queue access

  • Rule #1: 1 core per port
  • Rule #2: 1 core per packet (both rules sketched below)

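A minimal sketch of the resulting queue-to-core mapping (illustrative C; it assumes multi-queue NICs that expose one RX and one TX queue per core on every port, which is how both rules can be satisfied at once; this is not the actual Click configuration):

    #include <stdio.h>

    #define CORES 4
    #define PORTS 2

    /* Core c exclusively owns RX queue c and TX queue c on every
     * port. Rule #1 holds because no queue is ever touched by two
     * cores (no locks, no cache-line bouncing); Rule #2 holds
     * because a packet received by core c is processed and
     * transmitted by core c, never handed to another core. */
    int main(void) {
        for (int c = 0; c < CORES; c++)
            for (int p = 0; p < PORTS; p++)
                printf("core %d on port %d: rx queue %d -> tx queue %d\n",
                       c, p, c, c);
        return 0;
    }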

SLIDE 37

Single-server performance

  • First try: 1.3 Gbps
  • With batching: 3 Gbps
  • With multiple queues: 9.7 Gbps
SLIDE 38

Recap

  • State-of-the-art hardware
    » NUMA architecture, multi-queue NICs
  • Modified NIC driver
    » batching
  • Careful queue-to-core allocation
    » one core per queue, one core per packet

SLIDE 39

Outline

  • Interconnect
  • Server optimizations
  • Performance
  • Conclusions


SLIDE 40

Single-server performance

                       No-op forwarding   IP routing
  Min-size packets     9.7 Gbps           6.35 Gbps
  Realistic size mix   24.6 Gbps          24.6 Gbps

Since each server must process at 2R – 3R, the supportable external line rate R is:

  • Realistic size mix: R = 8 – 12 Gbps
  • Min-size packets: R = 2 – 3 Gbps

SLIDE 41

Bottlenecks

  • Realistic size mix: I/O
    » both applications saturate at 24.6 Gbps, independent of per-packet work
  • Min-size packets: CPU
    » rate falls with per-packet work: 9.7 Gbps for no-op vs 6.35 Gbps for IP routing

SLIDE 42

With upcoming servers

                       No-op forwarding   IP routing
  Min-size packets     38.8 Gbps          25.4 Gbps
  Realistic size mix   70 Gbps            70 Gbps

  • Realistic size mix: R = 23 – 35 Gbps
  • Min-size packets: R = 8.5 – 12.7 Gbps

SLIDE 43

RB4 prototype

  • N = 4 external ports
    » 1 server per port
    » full mesh
  • Realistic size mix: 4 x 8.75 = 35 Gbps
    » expected R = 8 – 12 Gbps
  • Min-size packets: 4 x 3 = 12 Gbps
    » expected R = 2 – 3 Gbps

SLIDE 44

I did not talk about

  • Reordering
    » avoids per-flow reordering for all but 0.15% of packets
  • Latency
    » 24 microseconds per server (estimate)
  • Open issues
    » power, form factor, programming model

SLIDE 45

Conclusions

  • RouteBricks: high-end software router
    » a Valiant-load-balanced cluster of commodity servers
  • Programmable with Click
  • Performance:
    » easily R = 1 Gbps, N = 100s of ports
    » R = 10 Gbps for realistic traffic
    » R = 10 Gbps for worst-case traffic, with upcoming servers

SLIDE 46

Thank you.

  • NIC driver and more information at http://routebricks.org