urdma: A Remote Direct Memory Access verbs provider using DPDK



SLIDE 1

urdma: A Remote Direct Memory Access verbs provider using DPDK

PATRICK MACARTHUR
UNIVERSITY OF NEW HAMPSHIRE
SEPTEMBER 6, 2018

SLIDE 2

Acknowledgements

  • urdma was initially developed during an internship with the IBM Zurich Research Laboratory. The author would like to thank Dr. Bernard Metzler for the opportunity as well as Jonas Pfefferle, Patrick Stuedi, and Animesh Trivedi for their advice and critique on urdma.
  • The author would like to thank Robert Russell and Timothy Carlin for their advice and critique on this report and the University of New Hampshire InterOperability Laboratory for the use of their RDMA cluster for the development, maintenance, and testing of urdma and UNH EXS.
  • This material is based upon work supported by the National Science Foundation under Grant No. OCI-1127228 and under the National Science Foundation Graduate Research Fellowship Program under award number DGE-0913620.

SLIDE 3

Agenda

  • Background
  • Implementation
  • Evaluation
  • Summary
SLIDE 4

Background

SLIDE 5

Background: RDMA (Remote Direct Memory Access)

  • Message-oriented
  • “Zero-copy”: direct transfer between remote application virtual memory regions with no intermediate data copies (on the hosts)
  • Requires application to pre-register memory
  • Kernel bypass: userspace application has direct access to network adapter
  • Asynchronous: data transfers occur in parallel with application threads, using OpenFabrics Alliance (OFA) verbs API
  • Transfer operations
      • SEND/RECV
      • RDMA WRITE: push data to remote memory region
      • RDMA READ: pull data from remote memory region
  • Data structures: Queue Pair (QP), Completion Queue (CQ)
  • Standards: InfiniBand, RoCE, iWARP
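The transfer operations and data structures listed on this slide can be illustrated with a toy, in-process model. This is only a sketch: it does not use any real verbs library, and every name in it (`MemoryRegion`, `QueuePair`, `post_write`, `post_read`, `progress`, `poll`) is hypothetical. It shows pre-registered memory, work requests posted asynchronously to a queue pair, and completions reaped later from a completion queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class MemoryRegion:
    """A pre-registered buffer that the 'adapter' may read or write directly."""
    buf: bytearray


@dataclass
class QueuePair:
    """Toy QP: work requests wait in `send_queue` until `progress()` runs."""
    remote: MemoryRegion
    send_queue: deque = field(default_factory=deque)
    completion_queue: deque = field(default_factory=deque)

    def post_write(self, wr_id, local, offset):
        # RDMA WRITE: push local data into the remote region (asynchronous).
        self.send_queue.append(("WRITE", wr_id, local, offset))

    def post_read(self, wr_id, local, offset):
        # RDMA READ: pull remote data into the local region (asynchronous).
        self.send_queue.append(("READ", wr_id, local, offset))

    def progress(self):
        # Stand-in for the adapter: runs queued requests, then posts completions.
        while self.send_queue:
            op, wr_id, local, offset = self.send_queue.popleft()
            n = len(local.buf)
            if op == "WRITE":
                self.remote.buf[offset:offset + n] = local.buf
            else:
                local.buf[:] = self.remote.buf[offset:offset + n]
            self.completion_queue.append((wr_id, op))

    def poll(self):
        # Non-blocking: returns whatever completions are ready, possibly none.
        done = list(self.completion_queue)
        self.completion_queue.clear()
        return done


remote = MemoryRegion(bytearray(8))
qp = QueuePair(remote)
src = MemoryRegion(bytearray(b"urdma"))
dst = MemoryRegion(bytearray(5))

qp.post_write(1, src, 0)   # returns immediately; nothing has moved yet
qp.post_read(2, dst, 0)
pending = qp.poll()        # no completions before the adapter makes progress
qp.progress()
completed = qp.poll()      # one completion per work request, in posting order
```

In this model the posting calls only queue work, which mirrors the asynchronous property: the application learns that a transfer finished by polling the completion queue, not from the return value of the post.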
SLIDE 6

urdma: Userspace Software RDMA

  • Software emulation of RDMA using DPDK
  • Goals
      • Low latency, high throughput
      • Run on commodity Ethernet NIC
      • Run unmodified verbs applications
      • Perform data transfers in userspace using DPDK
  • Prior work: softiwarp/softroce
      • Perform data transfer in kernel space using kernel sockets
  • Why urdma?
      • Ease of development, easy to use as a development vehicle for new RDMA features
      • Storage applications; integration with SPDK (Storage Performance Development Kit)

SLIDE 7

Implementation

SLIDE 8

urdma: Components

Multi-process application

  • urdma_kmod: Loadable kernel module for RDMA CM support
  • urdmad: DPDK primary process
  • urdma_prov: User verbs provider library; applications run as DPDK secondary process

[Figure: component diagram showing the App, urdma_prov, verbs, DPDK, urdmad, urdma_kmod, and the Ethernet NIC, divided into userspace and kernel space.]

SLIDE 9

urdma: Protocol

  • Implements iWARP DDP and RDMAP protocols
  • Runs over UDP transport protocol
  • TRP (Trivial Reliability Protocol) as thin reliability shim
  • Avoid byte-stream nature and state machine of TCP

Protocol stacks, top layer first:

  Standard iWARP    urdma
  RDMAP             RDMAP
  DDP               DDP
  MPA               TRP
  TCP               UDP
  IP                IP
  Ethernet          Ethernet
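To make the idea of a thin reliability shim over datagrams concrete, here is a generic go-back-N style sketch. It is not TRP: the slides do not describe TRP's wire format or algorithm, so the class names (`ShimSender`, `ShimReceiver`), the cumulative-acknowledgement scheme, and the retransmission policy are all assumptions chosen for illustration. The point is only that each datagram carries one whole message with a sequence number, so message boundaries survive without a byte stream.

```python
class ShimSender:
    def __init__(self):
        self.next_seq = 0
        self.unacked = {}            # seq -> payload awaiting acknowledgement

    def send(self, payload):
        seq = self.next_seq
        self.next_seq += 1
        self.unacked[seq] = payload
        return (seq, payload)        # one datagram carries one whole message

    def on_ack(self, ack):
        # Cumulative ack: every sequence number below `ack` was delivered.
        for seq in [s for s in self.unacked if s < ack]:
            del self.unacked[seq]

    def retransmit(self):
        # Go-back-N: resend everything not yet acknowledged, in order.
        return sorted(self.unacked.items())


class ShimReceiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def on_datagram(self, datagram):
        seq, payload = datagram
        if seq == self.expected:     # in order: deliver to the upper layer
            self.delivered.append(payload)
            self.expected += 1
        # Duplicates and out-of-order datagrams are dropped in this sketch.
        return self.expected         # cumulative ack to send back


sender, receiver = ShimSender(), ShimReceiver()
datagrams = [sender.send(m) for m in (b"a", b"b", b"c")]
for i, d in enumerate(datagrams):
    if i != 1:                       # simulate loss of the second datagram
        sender.on_ack(receiver.on_datagram(d))
for d in sender.retransmit():        # recover the lost and the dropped message
    sender.on_ack(receiver.on_datagram(d))
```

A real shim would also need timers to trigger retransmission and a bound on the number of outstanding datagrams; both are omitted here to keep the sketch short.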

SLIDE 10

urdma: Packet Processing

  • urdmad assigns each RDMA queue pair a hardware receive/transmit Ethernet queue
      • To allow verbs applications to access the NIC independently
  • Ethernet NIC hardware filters used to separate packets into RX queues
      • Using Flow Director or ntuple filtering
  • urdmad forwards all unfiltered packets on each interface to kernel
  • For each established connection, packets are filtered to a specific receive queue and received directly by the verbs application via urdma_prov

[Figure: packet flow among the NIC, urdmad, urdma_kmod, KNI, urdma_prov, and the verbs application.]
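The per-connection filtering described on this slide can be modeled as a lookup table from a flow to a receive queue, with a fall-through path for everything else. This is a sketch under assumptions: the `FilterTable` class, the four-field flow key, and the choice to reserve queue 0 for unfiltered traffic are all hypothetical, and no real NIC filter API (Flow Director or ntuple) is called.

```python
KERNEL = "kernel"   # stand-in for the path that hands packets to the kernel


class FilterTable:
    """Toy model of per-connection NIC receive filters."""

    def __init__(self, num_queues):
        # Assumption: queue 0 carries unfiltered traffic; the rest are free.
        self.free_queues = list(range(1, num_queues))
        self.rules = {}   # (src_ip, src_port, dst_ip, dst_port) -> RX queue

    def add_connection(self, flow):
        # Dedicate one hardware queue to this connection's packets.
        queue = self.free_queues.pop(0)
        self.rules[flow] = queue
        return queue

    def classify(self, flow):
        # Matching packets go straight to the connection's queue;
        # everything else falls through to the kernel path.
        return self.rules.get(flow, KERNEL)


table = FilterTable(num_queues=4)
flow = ("10.0.0.1", 4000, "10.0.0.2", 5000)

before = table.classify(flow)          # no filter installed yet
queue = table.add_connection(flow)
after = table.classify(flow)
other = table.classify(("10.0.0.3", 22, "10.0.0.2", 22))
```

The `before` lookup shows why ordering matters: a packet that arrives before its filter is installed takes the kernel path instead of reaching the application's queue.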

SLIDE 11

urdma initialization issues: rte_eal_init()

  • As a verbs provider library, we want DPDK to be invisible to the user application
      • We call rte_eal_init() in our own implementation
      • Our provider code is only run if the urdma kernel module is loaded and the urdmad master process has started
  • Specific issues with rte_eal_init()
      • Takes command-line arguments
          • We construct our own fake argument list
      • Changes CPU affinity of calling thread
          • We create a new thread and call rte_eal_init() from that thread
          • Tell rte_eal_init() not to create other lcores
      • All verbs applications must run as the same user (not necessarily root)

[Figure: thread diagram with labels Main thread, pthread_create(), Master lcore, User thread 1 … User thread n, and Verbs calls.]
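The two workarounds on this slide (a made-up argument list, and initialization from a separate thread so the caller's CPU affinity is untouched) can be sketched without DPDK. Everything here is a stand-in: `stand_in_eal_init` only imitates the side effect of pinning the calling thread, and the flags in `build_fake_argv` are assumptions, since the slides do not say which arguments urdma actually passes. The affinity calls are Linux-specific, so the sketch falls back to doing nothing elsewhere.

```python
import os
import threading


def build_fake_argv(prog="verbs-app"):
    # The application never supplies these; the provider library makes them up.
    # The specific flags are assumptions for illustration only.
    return [prog, "--proc-type=secondary", "-l", "0"]


def current_affinity():
    # os.sched_getaffinity is Linux-specific; report None on other platforms.
    return os.sched_getaffinity(0) if hasattr(os, "sched_getaffinity") else None


def stand_in_eal_init(argv, out):
    """Stand-in for rte_eal_init(): pins the *calling* thread to one CPU."""
    cpus = current_affinity()
    if cpus:
        try:
            os.sched_setaffinity(0, {min(cpus)})
        except OSError:
            pass  # some sandboxes forbid changing affinity
    out["argv"] = list(argv)
    out["thread"] = threading.get_ident()


def init_from_helper_thread():
    out = {"main_before": current_affinity()}
    helper = threading.Thread(target=stand_in_eal_init,
                              args=(build_fake_argv(), out))
    helper.start()
    helper.join()
    out["main_after"] = current_affinity()   # unchanged: only the helper was pinned
    return out


result = init_from_helper_thread()
```

In this sketch the helper thread exits after initialization, which keeps the example short; a thread that goes on to act as the master lcore would stay alive instead.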

SLIDE 12

urdma_prov: Data Transfer

  • Data transfer done in background progress thread
      • Separates DPDK operations from application threads
      • Allows progress for RDMA READ and RDMA WRITE outside of verbs calls
  • Inter-thread communication done via ring queues
      ➢ Enqueue one entry at a time
      ➢ Dequeue entries in bulk

[Figure: timeline with labels User thread, Progress thread, ibv_post_send, QP, Data transfer, CQ, and ibv_poll_cq.]
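The ring-queue pattern on this slide, where the producer enqueues one entry at a time and the consumer dequeues in bulk, can be sketched as a small fixed-size ring. This is a single-threaded illustration only: the `Ring` class and its method names are hypothetical, and it has none of the lock-free synchronization a real inter-thread ring needs. DPDK's `rte_ring` library offers comparable single-enqueue and burst-dequeue operations, though the slides do not state exactly which calls urdma uses.

```python
class Ring:
    """Fixed-size FIFO ring; one slot is kept empty to tell full from empty."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0   # next slot to dequeue (consumer side)
        self.tail = 0   # next slot to fill (producer side)

    def enqueue(self, item):
        nxt = (self.tail + 1) % len(self.slots)
        if nxt == self.head:
            return False             # ring full; caller retries later
        self.slots[self.tail] = item
        self.tail = nxt
        return True

    def dequeue_burst(self, max_items):
        # Drain up to max_items entries in one call.
        items = []
        while self.head != self.tail and len(items) < max_items:
            items.append(self.slots[self.head])
            self.head = (self.head + 1) % len(self.slots)
        return items


ring = Ring(size=8)
for wr_id in range(5):               # user thread: one work request per post
    ring.enqueue(wr_id)
first = ring.dequeue_burst(4)        # progress thread: drain up to 4 at once
rest = ring.dequeue_burst(4)

tiny = Ring(size=2)                  # holds a single entry
accepted = tiny.enqueue("a")
rejected = tiny.enqueue("b")
```

Draining in bulk lets the consumer amortize its per-call overhead across several work requests, while the producer side stays a cheap single-entry operation inside each verbs call.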

SLIDE 13

urdma_kmod: Connection Establishment

[Figure: sequence diagram over time among liburdma, urdmad, and urdma_kmod (KNI), with steps labeled RDMA Connect (app), CM Packet Exchange, Setup hardware packet filters, QP connected, QP Ready to recv, and RDMA CM Established Event.]

urdmad must enable receive filter before first packet arrives

SLIDE 14

Evaluation

SLIDE 15

Performance Test Setup

2 pairs of systems running Ubuntu 16.10 with Linux 4.8.0-46-generic kernel, DPDK 16.07.2, PCIe generation 3

  • urdma/softiwarp (Software Implementations)
      • Dual Intel Xeon E5-2630 v4 CPUs @ 2.20GHz
      • 64 GB DDR4 RAM
      • Intel XL710 40GbE NIC (firmware v5.05)
  • Reference iWARP Hardware Implementation
      • Dual Intel Xeon E5-2609 CPUs
      • 64 GB DDR3 RAM
      • Chelsio T580-LP-CR Unified Wire Ethernet controller (firmware v0.271.9472)
  • Applications used
      • perftest version 3.0+0.18.gb464d59-1
SLIDE 16

Perftest Latency: urdma vs. Chelsio iWARP NIC

[Figure: latency plot with series "urdma (Software)" and "Hardware" and axis annotations "Worse" and "Better".]

SLIDE 17

Perftest Throughput: urdma vs. Chelsio iWARP NIC

[Figure: throughput plot with series "urdma (Software)" and "Hardware" and axis annotations "Better" and "Worse".]

SLIDE 18

Summary

SLIDE 19

Summary

  • urdma
      • Software emulation of RDMA
      • Runs unmodified RDMA verbs applications
      • Performs all data transfer in userspace
      • No dependency on specific hardware
      • Achieves reasonable performance
  • Future work
      • Zero copy sends?
      • Using urdma for NVMf traffic
      • Integration with emerging storage class memory technologies
SLIDE 20

Thanks!

Questions?

Patrick MacArthur <patrick@patrickmacarthur.net>

urdma download: https://github.com/zrlio/urdma