Performance Enhancement in OpenStack for Elastic Hadoop Cluster
CMSoft
Cloud Computing Products Department: Lei Xu, Lina Hu, Weizhong Yu
The Open Infrastructure Summit
PART ONE
Problems in High Performance Computing
BigData Team vs. Cloud Team
Run Hadoop Cluster on OpenStack?
➢ The Datanode service (HDFS) cannot run properly.
➢ Poor computing performance.
➢ Network throughput is not up to standard.
➢ VM… I think BM is more suitable for you.
DISK / CPU/RAM / NETWORK
◆ Yarn and JStorm need the most computing resources.
◆ HDFS and HBase need stable storage resources.
◆ All big data services need high-throughput network resources.
Traditional ways in OpenStack?
Feature I: CPU Pinning
Feature II: HugePages
Feature III: SR-IOV Port
Feature IV: PCI Passthrough
How to find the best way to run big data services on VMs?
PART TWO
Performance Enhancement in Disk
① Use high performance cloud storage
High-performance cloud storage, such as SSD-based distributed storage or FC-SAN storage, can meet the read & write IOPS and bandwidth requirements. But when the number of mounted disks grows too large, performance degrades. Moreover, cloud storage is affected by network quality: network jitter can disturb disk reads and writes, and can even leave the system read-only.
② Use local disk storage
Use local disks on the compute node server as data disks for instances. They can usually meet the IOPS and bandwidth requirements and are more stable. The drawback is that the instance cannot be migrated.
How to mount local disk to VMs in OpenStack?
Mount the local disk in the nova instances directory and use the ephemeral size in the flavor to provide the local disk.
Instance directory files: disk, disk.local, disk.swap, disk.config, disk.info, console.log

+----+-------+-----------+------+-----------+------+-------+-------------+-----------+
| ID | Name  | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+----+-------+-----------+------+-----------+------+-------+-------------+-----------+
| 1  | Hdfs1 | 8192      | 200  | 500       | 0    | 4     | 1.0         | True      |
+----+-------+-----------+------+-----------+------+-------+-------------+-----------+
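A flavor like this can be created with the stock novaclient CLI (the name and sizes are just the example values above):

nova flavor-create Hdfs1 1 8192 200 4 --ephemeral 500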
(Diagram: LVM layout on the compute node: PVs /dev/sdl and /dev/sdm form VG nova, and each instance's ephemeral disk is an LV; flavors map ephemeral sizes to instances.)
Problems:
⚫ The ephemeral disk is a qcow2 file, and the file backend compromises performance.
⚫ Only one ephemeral disk can be attached, so LVM has to be used to satisfy large disk space needs.
How to mount local disk to VMs in OpenStack?
Report local disk device info through Cinder and use virtio (virtio-blk) to mount disks to instances.
+---------------+---------------------+------+---------+-------+
| Binary        | Host                | Zone | Status  | State |
+---------------+---------------------+------+---------+-------+
| cinder-volume | compute1@local-disk | nova | enabled | up    |
| cinder-volume | compute2@local-disk | nova | enabled | up    |
+---------------+---------------------+------+---------+-------+
(Diagram: local disks /dev/sdb, /dev/sdc, /dev/sde … /dev/sdm on compute1 and compute2 are managed by the Cinder BlockDeviceDriver and attached through virtio, appearing as /dev/vdX inside the instances.)
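A minimal cinder.conf sketch for such a backend, assuming the BlockDeviceDriver (since removed from Cinder) and example device paths:

# cinder.conf on each compute node
[DEFAULT]
enabled_backends = local-disk

[local-disk]
volume_backend_name = local-disk
volume_driver = cinder.volume.drivers.block_device.BlockDeviceDriver
available_devices = /dev/sdb,/dev/sdc,/dev/sde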
Problems:
⚫ The number of cinder-volume services grows with the number of compute nodes, which puts pressure on Cinder and the MQ.
⚫ Although multiple block devices can be mounted, performance is still degraded on the plain virtio-blk path.
How to mount local disk to VMs in OpenStack?
Pass through the RAID device, including its disks, to the instance.
(Diagram: PCI Passthrough of two RAID controllers on a compute node: the host's disks /dev/sdb … /dev/sdy are split per controller, and Instance1 and Instance2 each see a full set of disks /dev/sdb … /dev/sdm.)
04:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b]
04:00.1 RAID bus controller [0104]: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] [1000:005b]

pci_passthrough_whitelist = [{"vendor_id": "1000", "product_id": "005b"}]
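To hand the whitelisted controller to an instance, the usual pairing is a PCI alias plus a flavor extra spec; a sketch with an assumed alias name "raid" (on older releases the option is pci_alias under [DEFAULT]):

# nova.conf
[pci]
alias = {"vendor_id": "1000", "product_id": "005b", "name": "raid", "device_type": "type-PCI"}

# request one such device in the flavor
openstack flavor set Hdfs1 --property "pci_passthrough:alias"="raid:1"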
Problems:
⚫ With PCI passthrough, the RAID device goes to a single instance as a whole; disks cannot be mounted per disk unit.
How to enhance the local disk mounting function?
Advantages:
⚫ The SCSI LUN transparent transmission mode uses virtio-scsi on the virtual machine frontend, and the backend passes SCSI commands directly to the corresponding LUN device, so the IO path does not change.
<disk type='block' device='lun'>
  <driver name='qemu' type='raw' cache='none'/>
  <source dev='/dev/sdq'/>
  <target dev='sdd' bus='scsi'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
SCSI LUN Mounting
(Diagram: host disks /dev/sdb, /dev/sdc, /dev/sde attach through virtio-scsi and appear as /dev/sdX in the guest.)
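Such a LUN disk can also be hot-added to a running guest with stock libvirt tooling (the domain name is an example):

# save the <disk> element above as lun-disk.xml, then:
virsh attach-device hdfs-vm1 lun-disk.xml --live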
How to enhance the local disk mounting function?
Advantages:
⚫ An iothread is a thread that handles IO separately, independent of the QEMU main event loop. It reduces lock contention and interference from other emulated devices, and focuses on responding to virtual machine IO events.
<iothreads>4</iothreads>
<iothreadids>
  <iothread id='1'/>
  <iothread id='2'/>
  <iothread id='3'/>
  <iothread id='4'/>
</iothreadids>
<cputune>
  <shares>32768</shares>
  <iothreadpin iothread='1' cpuset='1'/>
  <iothreadpin iothread='2' cpuset='2'/>
  <iothreadpin iothread='3' cpuset='3'/>
  <iothreadpin iothread='4' cpuset='4'/>
</cputune>
Iothread setting
<controller type='scsi' index='0' model='virtio-scsi'>
  <driver iothread='1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</controller>
Virtio-scsi controller pin
Local Disk mounting in Nova
The user requests to add an SSD/HDD disk (User → Nova-API → Nova-Scheduler → Nova-Compute → Libvirt driver → Guest Domain):
1. The API sends the disk numbers to the scheduler.
2. The scheduler chooses proper compute nodes, based on the local disk information the compute nodes report.
3. Nova-Compute sets the iothread number and updates the instance information.
4. The libvirt driver sets the controller iothread pin and adds the virtio-scsi driver to the guest domain.
(Chart: Random Write & Read Test (IOPS) for 4K and 512K blocks; series: Bare Disk, PCI Passthrough, SCSI LUN + Iothread Pin, Virtio-blk + Iothread Pin.)
(Chart: Sequential Write & Read Test (Bandwidth) for 4K and 512K blocks; series: Bare Disk, PCI Passthrough, SCSI LUN + Iothread Pin, Virtio-blk + Iothread Pin.)
SCSI LUN + Iothread: large block (512K) performance is good; small block (4K) sequential read & write performance is decent.
PCI Passthrough: 512K/4K random and sequential read & write are close to bare disk performance.
Virtio-blk + Iothread: random read & write is relatively good, but sequential read & write is poor.
PART THREE
Performance Enhancement in CPU/RAM
Scale up / vertical scaling: works while the physical or virtual machine still has abundant resources, and is blocked once the machine has no free resources.
Scale out / horizontal scaling: increase the number of virtual or physical machines.
Live Vertical Scaling of a VM
(Diagram: one VM is vertically scaled via live-resize of CPU/RAM; additional VMs are added alongside it.)
Implementation method of vertical scaling: REST API
POST /servers/<id>/action
{
    "live-resize": {
        "flavorRef": "2"
    }
}

Python-novaclient:
nova live-resize <server> <flavor>
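As a usage sketch, the action can be posted straight to the compute API; note that live-resize is this deck's extension, not a stock Nova action, and the endpoint, token, and server id below are placeholders:

curl -s -X POST "$OS_COMPUTE_URL/servers/<id>/action" \
  -H "X-Auth-Token: $OS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"live-resize": {"flavorRef": "2"}}'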
<maxMemory slots='16' unit='KiB'>4194304</maxMemory>
<memory unit='KiB'>1048576</memory>
<currentMemory unit='KiB'>1048576</currentMemory>
<vcpu placement='static' current='1'>4</vcpu>
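Underneath, libvirt can apply the same growth to a running domain that was started with the maxMemory/vcpu headroom above; a minimal sketch (the domain name is an example):

# grow vCPUs from the current 1 toward the maximum of 4
virsh setvcpus hdfs-vm1 2 --live

# hot-plug a 1 GiB DIMM within the <maxMemory> headroom
# (the guest needs a NUMA topology defined for DIMM hotplug)
cat > dimm.xml <<'EOF'
<memory model='dimm'>
  <target>
    <size unit='KiB'>1048576</size>
    <node>0</node>
  </target>
</memory>
EOF
virsh attach-device hdfs-vm1 dimm.xml --live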
Live Resize in nova
The user requests to add RAM/CPUs (User → Nova-API → Nova-Conductor → Nova-Compute → Libvirt driver → Guest Domain):
1. The API sends the resizable flavor to the conductor.
2. The conductor gets the instance and flavor from the database and checks the state of the instance.
3. Nova-Compute checks the live-resize constraints and updates the instance metadata.
4. The libvirt driver sets the vCPU number and adds memory to the guest domain.
PART FOUR
Performance Enhancement in Network
How to enhance network performance in OpenStack?
Use OVS-DPDK acceleration in Neutron. OVS-DPDK moves network traffic forwarding from kernel space to user space.
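A minimal host-side sketch, assuming a DPDK-enabled OVS build and example bridge/NIC names and PCI address:

# enable DPDK in OVS
ovs-vsctl set Open_vSwitch . other_config:dpdk-init=true
ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"

# userspace datapath bridge with a DPDK-bound NIC
ovs-vsctl add-br br-phy -- set bridge br-phy datapath_type=netdev
ovs-vsctl add-port br-phy dpdk0 -- set Interface dpdk0 \
    type=dpdk options:dpdk-devargs=0000:05:00.0

# neutron openvswitch agent config (ml2)
# [ovs]
# datapath_type = netdev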
How to enhance network performance in OpenStack?
Use SR-IOV ports in Neutron and give child NICs of the bare-metal NIC to instances: make several VFs from the PF (the physical NIC) and pass them through to VMs with Intel VT-d technology.
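A minimal end-to-end sketch, with an assumed PF interface name ens1f0 and example network/flavor/image names:

# create VFs on the PF
echo 8 > /sys/class/net/ens1f0/device/sriov_numvfs

# nova.conf: whitelist the VFs for passthrough
pci_passthrough_whitelist = {"devname": "ens1f0", "physical_network": "physnet1"}

# create a direct (SR-IOV) port and boot a VM with it
openstack port create --network net1 --vnic-type direct sriov-port1
openstack server create --flavor Hdfs1 --image centos7 --nic port-id=<port-id> vm-sriov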
How to make an SR-IOV bond port in OpenStack?
(Diagram: two bonding topologies on a compute node. SRIOV-AB: the VM bonds its vNICs in active-backup (A-B) mode. SRIOV-LACP: the VM bonds its vNICs in round-robin (R-R) mode while the physical NICs are LACP-bonded to MLAG switches. In both cases each VM's vNICs come from different physical NICs.)
⚫ LACP bond for the physical NICs; A-B bond or R-R bond for the vNICs (a guest-side sketch follows).
⚫ A VM's vNICs should be placed on different physical network cards.
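Inside the guest, the vNIC bond can be built with standard tooling; a minimal nmcli sketch assuming the two VF interfaces show up as eth1 and eth2:

# active-backup (A-B) bond of the two VF interfaces
nmcli con add type bond con-name bond0 ifname bond0 bond.options "mode=active-backup,miimon=100"
nmcli con add type bond-slave ifname eth1 master bond0
nmcli con add type bond-slave ifname eth2 master bond0
nmcli con up bond0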
SRIOV Bond in OpenStack
The user requests to create two ports with port anti-affinity (Neutron-API), then requests to create VMs with the SR-IOV ports (User → Nova-API → Nova-Scheduler → Nova-Compute → Libvirt driver → Guest Domain):
1. Nova-API sends the SR-IOV PCI needs to the scheduler.
2. The scheduler chooses proper compute nodes, based on the SR-IOV PCI information the compute nodes report.
3. Nova-Compute chooses VF NICs on different PF NICs.
4. The libvirt driver adds a host-dev device for each SR-IOV port and adds the VLAN id in the XML.
(Chart: Throughput Test (Different Network) for 64K/128K/256K/512K packets; series: SRIOV, OVS-DPDK, Normal OVS.)
(Chart: Throughput Test (Different Mode) for 64K/128K/256K/512K packets; series: SRIOV(LACP), SRIOV(A-B).)
SRIOV(LACP): thanks to the LACP bond, SRIOV(LACP) delivers about 1.5x the network throughput of a single SR-IOV port.
OVS-DPDK: for 512K packets, OVS-DPDK is close to the physical network.
SRIOV: SR-IOV performance is close to the physical network.
Demo for Local Disk Mounting
Demo for CPU/RAM Hot Plugging
Demo for SRIOV Bond Port
Demo for Local Disk Mounting
Demo for CPU/RAM Hot Plugging
Demo for SRIOV Bond Port