SLIDE 1

Arrakis: The Operating System is the Control Plane

Simon Peter, Jialin Li, Irene Zhang, Dan Ports, Doug Woos, Arvind Krishnamurthy, Tom Anderson

University of Washington

Timothy Roscoe

ETH Zurich


SLIDE 3

Building an OS for the Data Center

  • Server I/O performance matters
    • Key-value stores, web & file servers, lock managers, …
  • Can we deliver performance close to hardware?
  • Example system: Dell PowerEdge R520 ($1,200 total)
    • Intel X520 10G NIC: 2 us / 1KB packet
    • Intel RS3 RAID, 1GB flash-backed cache: 25 us / 1KB write
    • Sandy Bridge CPU: 6 cores, 2.2 GHz

Today’s I/O devices are fast

SLIDE 4

Can’t we just use Linux?


SLIDE 6

Linux I/O Performance

Figure: percentage of 1KB Redis request time spent in each layer on Linux.

            GET (9 us)    SET (163 us)
  App          20%            3%
  Kernel       62%           84%
  HW           18%           13%

Kernel data path: API, multiplexing, naming, resource limits, access control, I/O scheduling, I/O processing, copying, protection.
Hardware: 10G NIC (2 us / 1KB packet), RAID storage (25 us / 1KB write).

Kernel mediation is too heavyweight

SLIDE 7

Arrakis Goals

  • Skip the kernel & deliver I/O directly to applications
    • Reduce OS overhead
  • Keep classical server OS features
    • Process protection
    • Resource limits
    • I/O protocol flexibility
    • Global naming
  • The hardware can help us…

SLIDE 8

Hardware I/O Virtualization

  • Standard on NICs, emerging on RAID controllers
  • Multiplexing
    • SR-IOV: virtual PCI devices with their own registers, queues, and interrupts
  • Protection
    • IOMMU: devices use application virtual memory
    • Packet filters, logical disks: only allow eligible I/O
  • I/O scheduling
    • NIC rate limiters, packet schedulers

Figure: an SR-IOV NIC exposes user-level VNICs directly to applications; packet filters and rate limiters sit between the VNICs and the network.
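The multiplexing above can be sketched in software. The toy model below (all class and field names are invented for illustration, not a real SR-IOV API) shows the key idea: the control plane registers per-application VNICs and filters once; afterwards, each packet is steered straight to an application's queue with no kernel involvement.

```python
# Conceptual sketch of SR-IOV-style demultiplexing with packet filters.
# All names are illustrative; real SR-IOV is programmed via PCI config
# space and device-specific drivers.

class VNIC:
    """A user-level virtual NIC: its own receive queue plus a filter that
    decides which packets it is eligible to receive."""
    def __init__(self, name, dst_port):
        self.name = name
        self.dst_port = dst_port  # filter: only traffic for this port
        self.rx_queue = []

    def matches(self, packet):
        return packet["dst_port"] == self.dst_port

class SRIOVNic:
    """The physical NIC. VNIC creation is a control-plane operation done
    once; per-packet demultiplexing is the data plane."""
    def __init__(self):
        self.vnics = []

    def create_vnic(self, name, dst_port):   # control plane: done once
        vnic = VNIC(name, dst_port)
        self.vnics.append(vnic)
        return vnic

    def receive(self, packet):               # data plane: per packet
        for vnic in self.vnics:
            if vnic.matches(packet):
                vnic.rx_queue.append(packet)  # delivered straight to the app
                return True
        return False                          # no eligible VNIC: drop

nic = SRIOVNic()
redis_vnic = nic.create_vnic("redis", dst_port=6379)
web_vnic = nic.create_vnic("httpd", dst_port=80)

nic.receive({"dst_port": 6379, "payload": b"GET key"})
nic.receive({"dst_port": 80, "payload": b"GET /"})
assert len(redis_vnic.rx_queue) == 1 and len(web_vnic.rx_queue) == 1
```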


SLIDE 12

How to skip the kernel?

Figure: naming, resource limits, and access control stay in the kernel; the rest of the data path (API, multiplexing, I/O scheduling, I/O processing, protection) moves out of the kernel, between Redis and the I/O devices. Copying is eliminated.


SLIDE 15

Arrakis I/O Architecture

Figure: the kernel becomes the control plane (naming, resource limits, access control); the data plane (API, multiplexing, I/O scheduling, I/O processing, protection) connects Redis directly to the I/O devices.

SLIDE 16

Arrakis Control Plane

  • Access control
    • Done once, when configuring the data plane
    • Enforced via NIC filters, logical disks
  • Resource limits
    • Program hardware I/O schedulers
  • Global naming
    • Virtual file system stays in the kernel
    • Storage implementation moves into applications
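The "configure once, enforce on every I/O" pattern can be illustrated with a rate limiter. A token bucket is one classic policy a hardware packet scheduler can enforce; the Python model below is purely illustrative (not Arrakis code, and the parameter names are invented), but it shows how the control plane programs a limit once and the data path then makes per-packet decisions without kernel checks.

```python
# Illustrative token-bucket rate limiter: the control-plane operation is
# constructing/programming the bucket; allow() models the per-packet
# decision the hardware scheduler makes on the data path.

class TokenBucket:
    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec   # sustained rate
        self.burst = burst_bytes         # maximum burst size
        self.tokens = burst_bytes        # bucket starts full
        self.last = 0.0                  # last refill timestamp (seconds)

    def allow(self, nbytes, now):
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

# Control plane: program a 10 KB/s limit with a 2 KB burst, once.
limiter = TokenBucket(rate_bytes_per_sec=10_000, burst_bytes=2_000)

# Data plane: per-packet decisions, no kernel involvement.
assert limiter.allow(1_500, now=0.0) is True    # within the burst
assert limiter.allow(1_500, now=0.0) is False   # burst exhausted
assert limiter.allow(1_500, now=0.2) is True    # 0.2 s * 10 KB/s refilled
```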

SLIDE 20

Global Naming

Figure: the kernel VFS keeps the global namespace (/tmp/lockfile, /var/lib/key_value.db, /etc/config.rc, …). Redis keeps its data in a virtual storage area on a logical disk and accesses it with fast hardware operations; other applications (e.g., emacs calling open("/etc/config.rc")) go through the VFS, which reaches application-managed storage via an indirect IPC interface.
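The naming split can be sketched as a router: the kernel VFS resolves global names, but paths inside an application's virtual storage area (VSA) are served by that application over IPC. Everything below — the class, the mount prefix, and the IPC callback — is invented for illustration; it only shows the indirection, not the real Arrakis interface.

```python
# Sketch of global naming with application-implemented storage: the VFS
# owns the namespace, but opens that land in a mounted VSA are forwarded
# to the owning application via an (invented) IPC handler.

class VFS:
    def __init__(self):
        self.kernel_files = {}   # ordinary kernel-managed files
        self.vsa_mounts = {}     # path prefix -> owning app's IPC handler

    def mount_vsa(self, prefix, ipc_handler):
        # Control-plane step: done once when the app exports its storage.
        self.vsa_mounts[prefix] = ipc_handler

    def open(self, path):
        for prefix, handler in self.vsa_mounts.items():
            if path.startswith(prefix):
                return handler(path)       # indirect IPC into the app
        return self.kernel_files[path]     # normal kernel file

def redis_ipc_open(path):
    # Stand-in for the real IPC interface into Redis's storage stack.
    return f"<handle to {path} served by Redis's VSA>"

vfs = VFS()
vfs.kernel_files["/etc/config.rc"] = "<kernel file handle>"
vfs.mount_vsa("/var/lib/", redis_ipc_open)

assert vfs.open("/etc/config.rc") == "<kernel file handle>"
assert "Redis" in vfs.open("/var/lib/key_value.db")
```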


SLIDE 23

Arrakis I/O Architecture

Figure: the kernel control plane keeps naming, resource limits, and access control. On the data plane, Redis links against a user-level I/O library providing the API and I/O processing, and talks directly to the I/O devices; multiplexing, I/O scheduling, and protection are pushed into the hardware.

SLIDE 24

Storage Data Plane: Persistent Data Structures

  • Examples: log, queue
  • Operations immediately persistent on disk
  • Benefits:
    • In-memory = on-disk layout: eliminates marshaling
    • Metadata in the data structure: early allocation, spatial locality
    • Data-structure-specific caching/prefetching
  • Modified Redis to use a persistent log: 109 LOC changed
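A persistent log in this spirit can be sketched in a few lines. This plain-file Python approximation (the record format is invented here; Arrakis writes to a raw virtual storage area, not through a kernel file system) shows the two key properties: bytes go to disk in the caller's own layout with no marshaling pass, and each append is durable before it returns.

```python
# Sketch of an append-only persistent log: fixed binary record layout
# (length header + raw payload) and an fsync per append so each operation
# is immediately persistent. Record format is invented for illustration.

import os
import struct
import tempfile

RECORD_HDR = struct.Struct("<I")  # 4-byte little-endian payload length

class PersistentLog:
    def __init__(self, path):
        self.f = open(path, "ab+")

    def append(self, payload: bytes):
        self.f.write(RECORD_HDR.pack(len(payload)) + payload)
        self.f.flush()
        os.fsync(self.f.fileno())  # durable before append() returns

    def replay(self):
        """Scan the log from the start, yielding each record's payload."""
        self.f.seek(0)
        while True:
            hdr = self.f.read(RECORD_HDR.size)
            if len(hdr) < RECORD_HDR.size:
                break
            (length,) = RECORD_HDR.unpack(hdr)
            yield self.f.read(length)

log = PersistentLog(os.path.join(tempfile.mkdtemp(), "redis.log"))
log.append(b"SET foo bar")
log.append(b"SET baz qux")
assert list(log.replay()) == [b"SET foo bar", b"SET baz qux"]
```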
SLIDE 25

Evaluation

SLIDE 26

Redis Latency

  • Reduced (in-memory) GET latency by 65%
  • Reduced (persistent) SET latency by 81%

Figure: 1KB request latency breakdown, Linux vs. Arrakis.

  GET:  Linux 9 us          (HW 18%, Kernel 62%, App 20%)
        Arrakis 4 us        (HW 33%, libIO 35%, App 32%)
  SET:  Linux (ext4) 163 us (HW 13%, Kernel 84%, App 3%)
        Arrakis 31 us       (HW 77%, libIO 7%, App 15%)

SLIDE 27

Redis Throughput

  • Improved GET throughput by 1.75x
    • Linux: 143k transactions/s
    • Arrakis: 250k transactions/s
  • Improved SET throughput by 9x
    • Linux: 7k transactions/s
    • Arrakis: 63k transactions/s
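As a quick sanity check, the headline ratios follow from the raw numbers (the SET latency figures are from the previous slide):

```python
# Verify the reported speedups against the raw measurements.

get_linux, get_arrakis = 143_000, 250_000   # GET transactions/s
set_linux, set_arrakis = 7_000, 63_000      # SET transactions/s

assert round(get_arrakis / get_linux, 2) == 1.75   # "1.75x" GET throughput
assert set_arrakis / set_linux == 9.0              # "9x" SET throughput

# SET latency: 163 us on Linux (ext4) vs. 31 us on Arrakis.
assert round(1 - 31 / 163, 2) == 0.81              # "81% speedup"
```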
SLIDE 28

memcached Scalability

Figure: throughput (k transactions/s) vs. number of CPU cores, Linux vs. Arrakis, with the 10Gb/s interface limit marked. Arrakis outperforms Linux by 1.8x on 1 core, 2x on 2 cores, and 3.1x on 4 cores.

SLIDE 29

Single-core Performance

Figure: UDP echo benchmark throughput (k packets/s) on a single core, with the 10Gb/s interface limit marked: Linux 1x, Arrakis/POSIX 2.3x, Arrakis/Zero-copy 3.4x, Driver 3.6x.

SLIDE 30

Summary

  • The OS is becoming an I/O bottleneck
    • Globally shared I/O stacks are slow on the data path
  • Arrakis: split the OS into control plane and data plane
    • Direct application I/O on the data path
    • Specialized I/O libraries
  • Application-level I/O stacks deliver great performance
    • Redis: up to 9x throughput, 81% speedup
    • memcached scales linearly to 3x throughput

Source code: http://arrakis.cs.washington.edu