Multitenancy for Fast and Programmable Networks in the Cloud - PowerPoint PPT Presentation


SLIDE 1

Multitenancy for Fast and Programmable Networks in the Cloud

Tao Wang*, Hang Zhu*, Fabian Ruffy, Xin Jin, Anirudh Sivaraman, Dan Ports, and Aurojit Panda (*Equal contribution)

SLIDE 2

What does today's cloud offer as a service?

• Generic compute and storage resources
• Specialized accelerators


SLIDE 3

Emergence of programmable network devices

• Pipeline-based programmable devices
  • In-network switches
  • At-host SmartNICs
• Enable wide-ranging innovations for classical networked systems
  • Consensus: NOPaxos, NetPaxos
  • Concurrency control: Eris
  • Caching: NetCache, IncBricks
  • Storage: NetChain, SwitchKV
  • Applications: SwitchML, NetAccel
  • …


SLIDE 4

Why not offer such systems as a cloud service?

• Need for multitenancy support
• Provider's perspective
  • Improve resource utilization
    • One application can hardly consume all the hardware resources
    • Heterogeneous resource requirements
• Tenant's perspective
  • Enable innovation
    • New programs can be easily tested without impacting basic network functionality


SLIDE 5

How to enable multitenancy for programmable devices?

Requirements:
• Resource efficiency
  • Little overhead
• Isolation
  • Performance
  • Allocated resources

Our vision: a hybrid compile-time and run-time solution


SLIDE 6

Background on programmable network devices

[Figure: anatomy of a pipeline-based programmable device. A parser feeds an ingress pipeline of match-action stages, then queues with stateful memory, then an egress pipeline. Packet headers (e.g., the Ethernet header) and per-packet metadata (e.g., queue length, hardware enqueue port) travel in PHV containers; each stage holds exact-match and ternary-match crossbars, SRAMs/TCAMs, registers, and action units.]
SLIDE 7

Programmable devices' characteristics

[Figure: performance vs. programmability]

• Various types of hardware resources
  • Most of them are decided at compile time
  • Limited run-time support
• Hardware wirings are decided at compile time
  • Line-rate performance achieved after successful compilation
• No temporal scheduling (e.g., CPU or NPU scheduling)
• No spatial reconfiguration (e.g., FPGA [AmorphOS, OSDI'18])

Requirements (recap):
• Resource efficiency
  • Little overhead
• Isolation
  • Performance
  • Allocated resources


SLIDE 8

A hybrid compile-time and run-time solution

• Compile-time program linker
  • Targets generic resources (e.g., SRAMs/TCAMs, action units)
  • But static
• Run-time memory allocator
  • Targets stateful memory
  • But limited


SLIDE 9

System overview

[Figure: system overview. (1) Tenant programs T1…Tn and the system program S are submitted through a translation layer. (2) The compile-time linker, guided by a resource sharing policy and a resource usage checker, merges them into one jumbo program whose stages 1…m each hold system and tenant tables plus a shared "one big array". (3) The run-time memory allocator in the control plane (utility calculator, reallocation problem solver, table entry handler) reads counter records and config parameters and rewrites table entries in the data plane.]

SLIDE 10

Goals of the compile-time linker

• Restrict resource usage
• Provide isolation
  • Ensure a tenant program does not interfere with others'
  • Ensure no infinite packet resubmission
  • Ensure no forwarding-loop configurations
  • …


SLIDE 11

Parser

• Fixed packet format
  • Eth, VLAN, IP, and TCP or UDP headers followed by custom headers
• System program
  • Extracts common headers
• Tenant programs
  • Extract tenant-defined headers

    Header {
        Ethernet hdr
        IP hdr
        VLAN hdr
        TCP or UDP hdr
        T1 hdr
        …
        Tn hdr
    }

    System program: apply S's parser to extract common headers
    Tenant programs: if (tag == T1's VID) apply T1's parser; …
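For concreteness, below is a minimal P4_16 sketch of such a merged parser. The header layouts (ethernet_t, vlan_t, t1_hdr_t) and the VID constant are hypothetical; this illustrates the dispatch idea, not the paper's actual code.

    #include <core.p4>

    header ethernet_t { bit<48> dst; bit<48> src; bit<16> etherType; }
    header vlan_t     { bit<3> pcp; bit<1> cfi; bit<12> vid; bit<16> etherType; }
    header t1_hdr_t   { bit<32> key; bit<32> value; }  // tenant-defined header

    struct headers_t {
        ethernet_t ethernet;
        vlan_t     vlan;
        t1_hdr_t   t1;
    }

    const bit<12> T1_VID = 1;  // VLAN ID the provider assigns to tenant T1

    parser MergedParser(packet_in pkt, out headers_t hdr) {
        state start {
            // System program S: extract the common headers.
            pkt.extract(hdr.ethernet);
            pkt.extract(hdr.vlan);
            // Dispatch on the VLAN ID tag to the owning tenant's sub-parser.
            transition select(hdr.vlan.vid) {
                T1_VID:  parse_t1;
                default: accept;
            }
        }
        state parse_t1 {
            // Tenant T1's parser: extract its custom header.
            pkt.extract(hdr.t1);
            transition accept;
        }
    }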


SLIDE 12

Control (ingress and egress) pipelines

• Feed-forward packet flow
• "Sandwich" architecture
  • Write-then-read half
  • Read-then-write half
• System program
  • Interacts with tenant programs, e.g., passes system states
  • Converts virtual addresses to physical ones

Along the packet flow through the control pipeline:

    System states { link_utilization, packet_count, … }
    Pass system states to tenants
    if (tag == T1's VID) apply T1's ctrl
    …
    Convert back to system states { egress_port, … }
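To make the "sandwich" concrete, here is a minimal P4_16 sketch, assuming hypothetical metadata fields and control names (sys_state_t, meta_t, T1Ingress); it illustrates the write-then-read and read-then-write halves rather than reproducing the paper's code.

    struct sys_state_t {
        bit<32> link_util;     // published by the system's write half
        bit<32> packet_count;
        bit<9>  egress_port;   // filled in by the system's read half
    }

    struct meta_t {
        bit<12>     vid;          // tenant VLAN ID carried by the packet
        sys_state_t sys;
        bit<9>      intent_port;  // tenant's requested (virtual) output port
    }

    // Tenant T1's merged-in control: may read meta.sys.*, writes only
    // its own intent fields.
    control T1Ingress(inout meta_t meta) {
        apply {
            meta.intent_port = 1;
        }
    }

    control SandwichIngress(inout meta_t meta) {
        T1Ingress() t1;
        apply {
            // Write-then-read half: system publishes states tenants may read.
            meta.sys.packet_count = meta.sys.packet_count + 1;
            // Dispatch on the VLAN ID to the owning tenant's control.
            if (meta.vid == 1) { t1.apply(meta); }
            // Read-then-write half: system validates the tenant's intent and
            // converts it into real forwarding state (e.g., an egress port).
            meta.sys.egress_port = meta.intent_port;
        }
    }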


SLIDE 13


Run-time memory allocator

• Page-table-like indirection

Page-table entries installed by the allocator:

    Match         Action
    VID == 1      metadata.offset = 0;   metadata.amount = 26
    VID == 2      metadata.offset = 512; metadata.amount = 24
    …             …

    pkt.physical_address = metadata.offset + (pkt.virtual_address % metadata.amount)

[Figure: tenant 1's and tenant 2's regions within the shared register array]
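Below is a hedged P4_16 sketch of this indirection; the names (xlate_meta_t, set_region, region) are ours, not the paper's. One caveat: P4_16 allows % only on compile-time known values, so instead of the slides' modulo, the sketch assumes power-of-two region sizes and masks the virtual address; the run-time allocator would install each tenant's offset and mask as table entries.

    struct xlate_meta_t {
        bit<12> vid;            // tenant VLAN ID
        bit<32> offset;         // base of the tenant's region in the big array
        bit<32> mask;           // region size minus one (power-of-two sizes)
        bit<32> virtual_addr;   // tenant-visible address
        bit<32> physical_addr;  // index into the shared register array
    }

    control Translate(inout xlate_meta_t meta) {
        action set_region(bit<32> offset, bit<32> mask) {
            meta.offset = offset;
            meta.mask   = mask;
        }
        table region {                    // one entry per tenant; the run-time
            key = { meta.vid : exact; }   // allocator rewrites these entries
            actions = { set_region; }     // when it reallocates memory
        }
        apply {
            region.apply();
            // Wrap the virtual address into the tenant's allocated region.
            meta.physical_addr = meta.offset + (meta.virtual_addr & meta.mask);
        }
    }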


SLIDE 14

Implementation

• Prototype on a Barefoot Tofino switch
• Compile-time linker
  • Extends the open-source P4 compiler [1]
• Run-time memory allocator
  • Built on auto-generated APIs to pull records and modify table entries

[1] https://github.com/p4lang/p4c


SLIDE 15

Compile-time program linker correctness

• Resource usage on Tofino
• Packet-level validation with PTF
• Sys program
  • Basic parsing and forwarding logic
• Tenant programs: NetCache [SOSP'17], NetChain [NSDI'18]
• Overhead
  • Additional gateway tables to check which program should be executed
  • Additional tag-along PHV containers

[Figure: resource usage as % of total, across exact match crossbar, SRAM, hash bits, action units, number of stages, gateway tables, and PHV, for the merged program versus the Sys program, NetCache, and NetChain individually.]

SLIDE 16

Run-time memory allocator efficiency

• Experimental setting
  • 64 tenants, each submitting a 1-minute heavy-hitter detection task against source IP addresses within its /6 subnet
  • 10-minute CAIDA trace replay
• Evaluation metrics (formalized below)
  • Utility: memory hit ratio
  • Satisfaction: fraction of time with utility > 0.9
  • We report the mean and the 5th percentile
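Written out precisely (our notation, not the slides'): for tenant $i$ over $T$ measurement intervals,

    \mathrm{utility}_i(t) = \frac{\mathrm{hits}_i(t)}{\mathrm{lookups}_i(t)}, \qquad
    \mathrm{satisfaction}_i = \frac{1}{T}\,\bigl|\{\, t \le T : \mathrm{utility}_i(t) > 0.9 \,\}\bigr|

so a tenant counts as satisfied for the fraction of intervals in which at least 90% of its lookups hit its allocated memory.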


SLIDE 17

Conclusion

• Takeaways
  • A hybrid solution for multitenancy support
  • Compile-time linker: general but static
  • Run-time memory allocator: dynamic but limited
• Future work
  • Seek new hardware designs that are both general and dynamic


SLIDE 18

Thanks!

Happy to take questions: tw1921@nyu.edu