

SLIDE 1

Performance Enhancement in OpenStack for Elastic Hadoop Cluster

CMSoft

Cloud Computing Products Department: Lei Xu, Lina Hu, Weizhong Yu

OpenStack

SLIDE 2

The Open Infrastructure Summit

PART ONE

Problems in High Performance Computing

SLIDE 3

BigData Team Cloud Team

Run Hadoop Cluster on OpenStack?

➢ Datanode service (HDFS) cannot run properly.
➢ Poor computing performance.
➢ Network throughput not up to standard.
➢ "VM… I think BM (bare metal) is more suitable for you."

SLIDE 4

DISK CPU/RAM NETWORK

◆ Yarn and JStorm need the most computing resources.
◆ HDFS and HBase need stable storage resources.
◆ All big-data services need high-throughput network resources.

SLIDE 5

Traditional ways in OpenStack?

Feature I: CPU Pin
Feature II: HugePage
Feature III: SRIOV Port
Feature IV: PCI Passthrough

SLIDE 6

How to find the best way to run bigdata service on VMs?

SLIDE 7

PART TWO

Performance Enhancement in Disk

SLIDE 8

① Use high performance cloud storage

High-performance cloud storage, such as SSD-backed distributed storage or FC-SAN storage, can meet read/write IOPS and bandwidth requirements. But when too many disks are mounted, performance degrades. Moreover, cloud storage is affected by network quality: network jitter can disturb disk reads and writes, and can even leave the file system read-only.

② Use local disk storage

Use local disks on the compute node as data disks for instances. They can usually meet IOPS and bandwidth requirements and are more stable. The drawback is that the instance cannot be migrated.

SLIDE 9

How to mount local disk to VMs in OpenStack?

  • 1. Use Ephemeral Storage

Mount the local disk under the Nova instances directory and set the Ephemeral size in the flavor to give the instance a local disk (a CLI sketch follows the storage layout below).

Instance directory files: disk, disk.local, disk.swap, disk.config, disk.info, console.log

+----+-------+-----------+------+-----------+------+-------+-------------+-----------+
| ID | Name  | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+----+-------+-----------+------+-----------+------+-------+-------------+-----------+
| 1  | Hdfs1 | 8192      | 200  | 500       | 0    | 4     | 1.0         | True      |
+----+-------+-----------+------+-----------+------+-------+-------------+-----------+

Storage layout:
  • Directory: /var/lib/nova/instances
  • LV: instance
  • VG: nova
  • PV: /dev/sdl, /dev/sdm
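For illustration, a minimal CLI sketch of building and using such a flavor (the flavor, image, and server names are hypothetical; the flags are standard openstack client options):

# Flavor matching the table above: 200 GB root disk plus a 500 GB local ephemeral disk
openstack flavor create Hdfs1 --vcpus 4 --ram 8192 --disk 200 --ephemeral 500
# Boot an instance; the ephemeral disk is carved from the compute node's local LVM storage
openstack server create --flavor Hdfs1 --image centos7 hdfs-node-1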


Problems:
⚫ The ephemeral disk is a qcow2 file; the file backend compromises performance.
⚫ More than one ephemeral disk cannot be attached; LVM has to be used to meet large disk-space needs.

SLIDE 10

How to mount local disk to VMs in OpenStack?

  • 2. Use Cinder BlockDeviceDriver

Report local disk device info to Cinder and use virtio (virtio-blk) to mount the disks to instances.

+---------------+---------------------+------+---------+-------+
| Binary        | Host                | Zone | Status  | State |
+---------------+---------------------+------+---------+-------+
| cinder-volume | compute1@local-disk | nova | enabled | up    |
| cinder-volume | compute2@local-disk | nova | enabled | up    |
+---------------+---------------------+------+---------+-------+

Diagram: on compute1 and compute2, the BlockDeviceDriver attaches host devices (/dev/sdb, /dev/sdc, /dev/sde, ..., /dev/sdm) to instances via virtio, where they appear as /dev/vdX.
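A minimal configuration sketch, assuming the legacy Cinder BlockDeviceDriver (deprecated and later removed upstream); the backend name, device paths, and volume name are illustrative:

# cinder.conf on each compute node running cinder-volume:
#   [local-disk]
#   volume_driver = cinder.volume.drivers.block_device.BlockDeviceDriver
#   available_devices = /dev/sdb,/dev/sdc,/dev/sde
#   volume_backend_name = local-disk
# Create a matching volume type, then a data volume for HDFS:
openstack volume type create local-disk
openstack volume type set --property volume_backend_name=local-disk local-disk
openstack volume create --type local-disk --size 500 hdfs-data-1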

Problems:
⚫ The number of cinder-volume services grows with the number of compute nodes, which puts pressure on Cinder and the MQ.
⚫ Although multiple block devices can be mounted, there are no optimization measures, so performance degrades.

SLIDE 11

How to mount local disk to VMs in OpenStack?

  • 3. Use PCI Passthrough in OpenStack

Pass the RAID device, including its disks, through to the instance.

Diagram: PCI passthrough of two RAID controllers on a compute node. Each controller, with all of its disks (/dev/sdb ... /dev/sdm and /dev/sdn ... /dev/sdy), is handed wholesale to Instance1 or Instance2.

04:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b]
04:00.1 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b]

pci_passthrough_whitelist = [{"vendor_id": "1000", "product_id": "005b"}]
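A hedged sketch of consuming the whitelisted controller (the alias name and flavor are illustrative; the option spelling varies across releases, e.g. alias under [pci] in newer Nova):

# nova.conf: define an alias matching the whitelist above
#   pci_alias = {"vendor_id": "1000", "product_id": "005b", "name": "raid"}
# Request one such device through a flavor extra spec, then boot with that flavor:
openstack flavor set Hdfs1 --property "pci_passthrough:alias"="raid:1"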

Problems:
⚫ With PCI passthrough, the whole RAID device goes to one instance; disks cannot be mounted per disk unit.

SLIDE 12

How to enhance the local disk mounting function?

  • 1. Use SCSI LUN passthrough instead of PCI passthrough

Advantages:
⚫ SCSI LUN passthrough uses virtio-scsi as the virtual machine frontend; the backend passes SCSI commands directly to the corresponding LUN device, so the IO path does not change.

<disk type='block' device='lun'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/sdq'/>
  <target dev='sdd' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
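Such a LUN definition can also be hot-attached with plain libvirt tooling; a minimal sketch (the domain and file names are hypothetical):

# Save the <disk device='lun'> definition above as lun-disk.xml, then attach it live
virsh attach-device hdfs-node-1 lun-disk.xml --live --config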

Diagram: SCSI LUN mounting. Host devices /dev/sdb, /dev/sdc, /dev/sde are attached via virtio-scsi and appear in the guest as /dev/sdX.

SLIDE 13

How to enhance the local disk mounting function?

  • 2. Use Iothread Pin for multiple device mounting

Advantages: An iothread is a thread that handles IO separately, independent of the QEMU main event loop; this reduces lock contention and interference from other emulated devices, letting it focus on responding to virtual machine IO events.

Iothread setting:

<iothreads>4</iothreads>
<iothreadids>
  <iothread id='1'/>
  <iothread id='2'/>
  <iothread id='3'/>
  <iothread id='4'/>
</iothreadids>
<cputune>
  <shares>32768</shares>
  <iothreadpin iothread='1' cpuset='1'/>
  <iothreadpin iothread='2' cpuset='2'/>
  <iothreadpin iothread='3' cpuset='3'/>
  <iothreadpin iothread='4' cpuset='4'/>
</cputune>

virtio-scsi controller pin:

<controller type='scsi' index='0' model='virtio-scsi'>
  <driver iothread='1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</controller>
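The same layout can be inspected and adjusted at runtime with virsh (the domain name is hypothetical):

# List the domain's iothreads and their current CPU affinity
virsh iothreadinfo hdfs-node-1
# Re-pin iothread 1 to host CPU 1, matching the <iothreadpin> element above
virsh iothreadpin hdfs-node-1 1 1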

SLIDE 14

Local Disk mounting in Nova

Local-disk mounting workflow (user → nova-api → nova-scheduler → nova-compute → libvirt driver → guest domain):
1. The user requests to add SSD/HDD disks.
2. nova-api sends the disk numbers to the scheduler.
3. nova-scheduler chooses proper compute nodes (compute nodes report their local disk information).
4. nova-compute updates the instance information.
5. The libvirt driver sets the iothread number, sets the controller iothread pin, and adds the virtio-scsi driver to the guest domain.

SLIDE 15

Chart: Random Write & Read Test (IOPS) for 4K and 512K random writes and reads.

Chart: Sequential Write & Read Test (Bandwidth) for 4K and 512K sequential writes and reads.

Series in both charts: Bare Disk, PCI Passthrough, SCSI LUN + Iothread Pin, Virtio-blk + Iothread Pin.

SCSI LUN + Iothread: large-block (512K) performance is good; small-block (4K) sequential read/write performance is decent.
PCI Passthrough: 512K/4K random and sequential read/write are close to bare-disk performance.
Virtio-blk + Iothread: random read/write is relatively good; sequential read/write is poor.

SLIDE 16

PART THREE

Performance Enhancement in CPU/RAM

SLIDE 17

Elastic expansion: auto-scaling

Scale up / vertical scaling: the physical or virtual machine still has abundant free resources; add resources to it.
Scale out / horizontal scaling: the physical or virtual machine has no free resources; increase the number of virtual or physical machines.

SLIDE 18

Live vertical scaling up of a VM: resources are added to the running VM in place.

live-resize cpu/ram

SLIDE 19

Implementation of vertical scaling: REST API

POST /servers/<id>/action
{
    "live-resize": {
        "flavorRef": "2"
    }
}

python-novaclient:

nova live-resize <server> <flavor>

<maxMemory slots='16' unit='KiB'>4194304</maxMemory>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<vcpu placement='static' current='1'>4</vcpu>
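Under this XML, live-resize maps onto standard libvirt hot-plug operations; a minimal virsh sketch (the domain name is hypothetical, and DIMM hot-plug also assumes the guest defines a NUMA topology):

# Raise the online vCPU count toward the static maximum (4 in the XML above)
virsh setvcpus hdfs-node-1 2 --live
# Hot-plug a 1 GiB memory DIMM into one of the 16 slots declared by <maxMemory>
cat > dimm.xml <<'EOF'
<memory model='dimm'>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
EOF
virsh attach-device hdfs-node-1 dimm.xml --live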

SLIDE 20

Live Resize in nova

Live-resize workflow (user → nova-api → nova-conductor → nova-compute → libvirt driver → guest domain):
1. The user requests to add RAM/CPUs.
2. nova-api sends the resizable flavor to the conductor.
3. nova-conductor gets the instance and flavor from the database and checks the state of the instance.
4. nova-compute checks the live-resize constraints and updates the instance metadata.
5. The libvirt driver sets the vCPU number and adds memory to the guest domain.

SLIDE 21

PART FOUR

Performance Enhancement in Network

SLIDE 22

How to enhance network performance in OpenStack?

  • 1. OVS/OVS-DPDK

Use OVS-DPDK acceleration in Neutron. OVS-DPDK moves network traffic forwarding from kernel space to user space.
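A minimal sketch of the host-side setup (the bridge name, port name, and PCI address are examples; assumes OVS built with DPDK support):

# Initialize DPDK inside Open vSwitch
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
# Create a userspace-datapath bridge and bind a physical NIC to it via DPDK
ovs-vsctl add-br br-dpdk -- set bridge br-dpdk datapath_type=netdev
ovs-vsctl add-port br-dpdk dpdk-p0 -- set Interface dpdk-p0 type=dpdk options:dpdk-devargs=0000:05:00.0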

SLIDE 23

How to enhance network performance in OpenStack?

  • 2. SRIOV

Use SRIOV ports in Neutron to give children of the bare-metal NIC to instances: make several VFs from the PF (the physical NIC) and pass them through to VMs with Intel VT-d technology.
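A minimal sketch of that flow (the PF, network, image, and server names are hypothetical):

# On the compute node: split the PF into 8 VFs
echo 8 > /sys/class/net/ens1f0/device/sriov_numvfs
# In Neutron: create a direct (SRIOV) port and boot a VM on it
openstack port create --network bigdata-net --vnic-type direct sriov-port-1
openstack server create --flavor Hdfs1 --image centos7 --nic port-id=sriov-port-1 hdfs-node-1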

SLIDE 24

How to make sriov bond port in OpenStack?

Diagram: two SRIOV bond schemes, SRIOV-AB and SRIOV-LACP. On each compute node, the physical NICs form an LACP bond toward an MLAG switch pair; inside each VM, the vNICs (VFs) form an A-B (active-backup) or R-R (round-robin) bond.

⚫ LACP bond for the physical NICs; A-B (active-backup) or R-R (round-robin) bond for the vNICs.
⚫ vNICs should be placed on different physical network cards.
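Inside the guest, the two VF interfaces can be bonded with standard Linux tooling; a minimal active-backup sketch (the interface names are hypothetical):

# Create an active-backup bond and enslave the two VF interfaces
ip link add bond0 type bond mode active-backup miimon 100
ip link set eth0 down
ip link set eth0 master bond0
ip link set eth1 down
ip link set eth1 master bond0
ip link set bond0 up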

SLIDE 25

SRIOV Bond in OpenStack

SRIOV bond workflow (user → neutron-api / nova-api → nova-scheduler → nova-compute → libvirt driver → guest domain):
1. The user asks the Neutron API to create two ports and adds port anti-affinity between them.
2. The user requests to create VMs with the SRIOV ports.
3. nova-api sends the SRIOV PCI needs to the scheduler.
4. nova-scheduler chooses proper compute nodes (compute nodes report their SRIOV PCI information).
5. nova-compute chooses VF NICs on different PF NICs.
6. The libvirt driver adds a host-dev device for each SRIOV port and adds the VLAN ID in the guest domain XML.

SLIDE 26

Chart: Throughput Test (Different Network) for 64K, 128K, 256K, and 512K packets. Series: SRIOV, OVS-DPDK, Normal OVS.

Chart: Throughput Test (Different Mode) for 64K, 128K, 256K, and 512K packets. Series: SRIOV (LACP), SRIOV (A-B).

SRIOV (LACP): thanks to the LACP bond, SRIOV (LACP) delivers about 1.5x the network performance of a single SRIOV port.
OVS-DPDK: throughput increased by 300% over normal OVS; for 512K packets, OVS-DPDK is close to the physical network.
SRIOV: performance is close to the physical network.

SLIDE 27

DEMO

  • Demo for Local Disk Mounting
  • Demo for CPU/RAM Hot Plugging
  • Demo for SRIOV Bond Port

SLIDE 28

Demo for Local Disk Mounting

SLIDE 29

Demo for CPU/RAM Hot Plugging

SLIDE 30

Demo for SRIOV Bond Port

SLIDE 31

Thanks for Watching!