io_uring in QEMU: high-performance disk IO for Linux (FOSDEM 2020)




SLIDE 1

FOSDEM 2020

io_uring in QEMU: high-performance disk IO for Linux

Julia Suvorova, Red Hat Software Engineer


SLIDE 2

What we’ll discuss today

Agenda

▸ io_uring API
▸ QEMU structure
▸ Features of io_uring and how they helped QEMU
▸ Benchmarks
▸ What is left to do

SLIDE 3

QEMU I/O path

I/O path in VM

[Diagram: I/O path between the VM and the host, each with userspace and kernel layers; labels: vHW, HW, QEMU, driver, virtio-blk]

SLIDE 4

QEMU I/O path

Existing solutions

Async I/O
▸ Linux AIO (aio=native)
▸ Thread pool (aio=threads)

Other
▸ NVMe passthrough (vfio)
▸ SPDK

SLIDE 5

QEMU I/O path

I/O path in VM

[Diagram: the same I/O path between the VM and the host (userspace and kernel layers; vHW, HW, QEMU, driver, virtio-blk), with one part marked "This part we can improve"]

SLIDE 6

io_uring interface

io_uring

Yet another kernel ring buffer
▸ New interface for truly asynchronous communication with the kernel; the latest versions also support network and some other syscalls
▸ Part of Linux since 5.1

SLIDE 7

io_uring interface

Main features

▸ Unlike Linux AIO, separate queues for submission and completion (sqes and cqes)
▸ Sqes and cqes are shared between userspace and kernel
▸ Async flush

Submission: QEMU -> kernel -> hw
Completion: QEMU <- kernel <- hw
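To make the idea of a ring with separate producer and consumer sides more concrete, here is a minimal single-threaded toy model in C. It is only a sketch: the names `toy_ring`, `toy_push`, and `toy_pop` are hypothetical, the layout is not the kernel's, and the memory barriers that a ring shared between userspace and the kernel needs are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RING_ENTRIES 8u  /* must be a power of two */

/* Hypothetical toy ring: the producer advances tail, the consumer advances head. */
struct toy_ring {
    uint32_t head;                     /* next slot to consume */
    uint32_t tail;                     /* next slot to fill */
    uint64_t slots[TOY_RING_ENTRIES];  /* payload, e.g. a request id */
};

static void toy_init(struct toy_ring *r)
{
    /* Masking with (entries - 1) only works for power-of-two sizes. */
    assert((TOY_RING_ENTRIES & (TOY_RING_ENTRIES - 1u)) == 0);
    r->head = 0;
    r->tail = 0;
}

/* Producer side: returns false when the ring is full. */
static bool toy_push(struct toy_ring *r, uint64_t value)
{
    if (r->tail - r->head == TOY_RING_ENTRIES)
        return false;
    r->slots[r->tail & (TOY_RING_ENTRIES - 1u)] = value;
    r->tail++;  /* publish the entry; a shared ring needs a store barrier here */
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool toy_pop(struct toy_ring *r, uint64_t *value)
{
    if (r->head == r->tail)
        return false;
    *value = r->slots[r->head & (TOY_RING_ENTRIES - 1u)];
    r->head++;  /* hand the slot back to the producer */
    return true;
}
```

In io_uring terms, userspace is the producer on the submission ring and the consumer on the completion ring, and the kernel plays the opposite roles.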

SLIDE 8

io_uring interface

Interface

Three new system calls:

io_uring_setup(u32 entries, struct io_uring_params *p)
▸ Can choose different modes

io_uring_enter(unsigned int fd, unsigned int to_submit, unsigned int min_complete, unsigned int flags, sigset_t *sig)
▸ Submits requests and fetches completions within one syscall (not possible with Linux AIO!)

io_uring_register(unsigned int fd, unsigned int opcode, void *arg, unsigned int nr_args)
▸ Register fds ahead of time. No need to do fget() and fput() on each submission and completion, respectively
▸ Register buffers (struct iovec) ahead of time. Saves get_user_pages() and put_pages()
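As a rough illustration of the first call, the sketch below invokes io_uring_setup() through syscall(2) rather than through a library wrapper. The helper name `ring_create` is hypothetical, and the code assumes kernel headers from Linux 5.1 or later. Where io_uring is unavailable or blocked (for example by a seccomp filter), it simply returns a negative errno.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper: create an io_uring instance with `entries` submission
 * slots and the given IORING_SETUP_* flags.
 * Returns the ring file descriptor, or -errno on failure.
 */
static int ring_create(unsigned entries, unsigned flags,
                       struct io_uring_params *p)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));  /* reserved fields must be zero */
    p->flags = flags;

    long fd = syscall(__NR_io_uring_setup, entries, p);
    if (fd < 0)
        return -errno;

    /* On success the kernel has filled in p->sq_entries, p->cq_entries and
     * the offsets (p->sq_off, p->cq_off) needed to mmap() the shared rings. */
    return (int)fd;
}
```

A complete program would go on to mmap() the rings using the returned offsets and then drive them with io_uring_enter(); liburing wraps those steps.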

SLIDE 9

io_uring interface

How fast is it?

Benchmarks on bare metal

Test with fio 3.14: aio=libaio, operation=randread
NVMe SSD: Intel Optane 320G
CPU: Intel Xeon Silver 2.20GHz

SLIDE 10

io_uring inside QEMU

Integration into QEMU

What’s done:
▸ Outreachy project idea
▸ Implemented by Aarushi Mehta
▸ Basic functionality is merged upstream (will be in QEMU 5.0)

Known issues:
▸ Problems with file locking in fd registration
▸ IOPOLL is not implemented

SLIDE 11

io_uring inside QEMU

Integration into QEMU

Reuse Linux AIO approach

▸ QEMU event loop is based on AIO context (future improvement: can be switched to io_uring)
▸ Add aio context -> use epoll for completion check
▸ Now we submit requests with io_uring_enter() and check completions on irq
▸ liburing usage: easier to use, fewer mistakes

SLIDE 12

io_uring inside QEMU

Integration into QEMU

How to launch

-drive file=test.img,format=raw,cache=none,aio=io_uring

Works with both O_DIRECT and cached workloads

SLIDE 13

io_uring inside QEMU

How fast has it got without extra features?

Test with fio 3.14: aio=libaio, operation=randread
NVMe SSD: Intel Optane 320G
CPU: Intel Xeon Silver 2.20GHz

SLIDE 14

Features and benchmarks

Fd registration

▸ Register the set of fds on which I/O is performed with io_uring_register()
▸ Saves an atomic fget() on the submission path
▸ Saves an atomic fput() on the completion path
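A minimal sketch of fd registration with the raw system call might look as follows. Both helper names are hypothetical, and the code assumes kernel headers from Linux 5.1 or later. It shows the two halves of the feature: registering an array of open descriptors once, then referring to them by index in each submission entry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper: register `nr` already-open file descriptors with the
 * ring `ring_fd`. Returns 0 on success or -errno on failure.
 */
static int ring_register_files(int ring_fd, const int *fds, unsigned nr)
{
    assert(fds != NULL && nr > 0);
    long ret = syscall(__NR_io_uring_register, ring_fd,
                       IORING_REGISTER_FILES, fds, nr);
    return ret < 0 ? -errno : 0;
}

/*
 * Hypothetical helper: point an SQE at slot `index` of the registered file
 * table instead of a raw file descriptor.
 */
static void sqe_use_registered_file(struct io_uring_sqe *sqe, unsigned index)
{
    sqe->fd = (int)index;            /* index into the registered array */
    sqe->flags |= IOSQE_FIXED_FILE;  /* tells the kernel that fd is an index */
}
```

Because the kernel already holds a reference to every registered file, requests flagged with IOSQE_FIXED_FILE can skip the per-request reference counting.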

SLIDE 15

Features and benchmarks

Does this help much?

Not really by itself

Test with fio 3.14: aio=libaio, operation=randread
NVMe SSD: Intel Optane 320G
CPU: Intel Xeon Silver 2.20GHz

SLIDE 16

Features and benchmarks

Submission polling

▸ Runs a kernel thread that waits for submissions; once it has gone idle, it needs to be woken up with a syscall
▸ io_uring_setup() with flag SQ_POLL
▸ Needs fd registration for effective usage
▸ Now we submit requests without a syscall and get completions on irq: a path without syscalls
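The wake-up rule can be sketched in a few lines of C. The helper names `params_for_sqpoll` and `sqpoll_enter_flags` are hypothetical, and the code assumes kernel headers from Linux 5.1 or later. The first helper fills in the setup parameters. The second decides, from the flags word in the shared submission ring, whether a syscall is needed at all.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <linux/io_uring.h>

/* Hypothetical helper: setup parameters for a kernel-polled submission queue. */
static void params_for_sqpoll(struct io_uring_params *p, unsigned idle_ms)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->flags = IORING_SETUP_SQPOLL;
    p->sq_thread_idle = idle_ms;  /* the poller sleeps after this long idle */
}

/*
 * Hypothetical helper: given the current value of the SQ ring's flags word,
 * return the flags to pass to io_uring_enter(), or 0 if the kernel thread is
 * still polling and no syscall is needed to submit.
 */
static unsigned sqpoll_enter_flags(unsigned sq_ring_flags)
{
    if (sq_ring_flags & IORING_SQ_NEED_WAKEUP)
        return IORING_ENTER_SQ_WAKEUP;
    return 0;
}
```

As long as requests keep arriving within the idle period, the second helper keeps returning 0 and submission stays entirely in shared memory.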

SLIDE 17

Features and benchmarks

Completion polling

▸ Poll completions with busy waiting on io_uring_enter()
▸ io_uring_setup() with flag IOPOLL
▸ CPU consuming, but no context switching
▸ In combination with SQ_POLL: the fastest way on heavy workloads
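A hedged sketch of the completion-polling setup, again with the raw system calls: `params_for_iopoll` and `ring_reap_polled` are hypothetical names, and the code assumes kernel headers from Linux 5.1 or later. Polled I/O also expects files opened with O_DIRECT on storage that supports polling, which this sketch does not check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/* Hypothetical helper: setup parameters for a ring that polls for completions. */
static void params_for_iopoll(struct io_uring_params *p, int with_sqpoll)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->flags = IORING_SETUP_IOPOLL;
    if (with_sqpoll)
        p->flags |= IORING_SETUP_SQPOLL;  /* also poll the submission side */
}

/*
 * Hypothetical helper: ask the kernel to poll until at least `min_complete`
 * completions are available. Nothing is submitted by this call.
 * Returns 0 on success or -errno on failure.
 */
static int ring_reap_polled(int ring_fd, unsigned min_complete)
{
    long ret = syscall(__NR_io_uring_enter, ring_fd, 0u, min_complete,
                       IORING_ENTER_GETEVENTS, NULL, (size_t)0);
    return ret < 0 ? -errno : 0;
}
```

On a ring created with IORING_SETUP_IOPOLL, the io_uring_enter() call actively polls the device instead of sleeping until an interrupt arrives, which is where the CPU cost comes from.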

SLIDE 18

Features and benchmarks

Performance


Not implemented yet

SLIDE 19

In someone’s todo

Future improvements

▸ Merge SQ_POLL and fd registration
▸ File buffers registration and IO_POLL
▸ Switch to io_uring as default aio (if supported)

Ideas:
▸ Switch main loop to io_uring

SLIDE 20

linkedin.com/company/red-hat youtube.com/user/RedHatVideos facebook.com/redhatinc twitter.com/RedHat

Thank you


Questions?