Quest-V – a Virtualized Multikernel
Richard West richwest@cs.bu.edu Ye Li, Eric Missimer {liye, missimer}@cs.bu.edu
Computer Science
Goals
– Develop system for high-confidence (embedded) systems
– Predictable real-time support
– Mars Climate Orbiter: loss of spacecraft due to Imperial / Metric conversion error (September 23, 1999)
– Ariane 5 rocket (June 4, 1996): rocket destroyed during flight due to a conversion error from a 64-bit double to a 16-bit value
– Northeast blackout: parts of the US and Canada without electricity in 2003 due to a software race condition
– Distributed system on a chip
– Time as a first-class resource
– Separate sandbox kernels for system sub-components
– Isolation using h/w-assisted memory virtualization
– Security enforceable using VT-d + interrupt remapping (IR)
[Figure: Quest-V architecture. Sandboxes 1, 2, ..., M each run on their own CPU (CPU 1, CPU 2, ..., CPU M) with a Monitor, a Kernel, Apps, Main VCPUs and IO VCPUs. Sandboxes are connected by Shared Mem / Msg Channels, use Shared Drivers, and support Migration between sandboxes.]
[Figure: Memory layout. Physical Memory Layout (0x00000000 to 0xFFFFFFFF): BIOS, Sandbox Kernels 1..M, Shared Driver, EPT Data Structures 1..M, Monitors 1..M, User Space, Shared Memory Region. Virtual Memory Layout of each sandbox k (0x00000000 to 0xFFFFFFFF): Sandbox Kernel k, Shared Driver, EPT Data Structure k, Monitor k, User Space, Shared Memory Region.]
[Figure: Scheduling hierarchy. Threads are mapped to Main VCPUs and I/O VCPUs, which are mapped to PCPUs (Cores, HTs).]
$U_{V,IO}$
$T_{V,main} \cdot U_{V,IO}$ for period $T_{V,main}$
$e = t + C_{actual} / U_{V,IO}$
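The eligibility-time formula above can be sketched in C. This is a minimal illustration, not Quest-V code: the function name `pibs_next_eligible` is hypothetical. It assumes that t is the time at which the I/O VCPU started its latest run, that C_actual is the time it actually consumed, that both use the same time unit, and that U_{V,IO} is a fraction in (0, 1].

```c
#include <assert.h>

/* Hypothetical helper (not a Quest-V API).
 * Computes e = t + C_actual / U_{V,IO}: the next time an I/O VCPU
 * with utilization u_io is eligible after consuming c_actual time
 * units starting at time t. All times share one unit. */
double pibs_next_eligible(double t, double c_actual, double u_io)
{
    assert(u_io > 0.0 && u_io <= 1.0);  /* utilization is a fraction */
    assert(c_actual >= 0.0);
    return t + c_actual / u_io;
}
```

For example, an I/O VCPU with 4% utilization that runs for 2 time units starting at t = 100 would not be eligible again until about t = 150 under these assumptions, which keeps its long-run share at 2/50 = 4%.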
is not greater than that caused by an equivalent periodic task
(1) Replenishment, R, must be deferred at least until $t + T_V$
(2) Can be deferred longer
(3) Can merge two overlapping replenishments
R1.time
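The replenishment rules can be illustrated with a small C sketch. Everything here is an assumption made for illustration: the `struct repl` layout, the function names, and in particular the overlap condition `r1.time + r1.amount >= r2.time`, which is one condition used in the sporadic-server literature and is not confirmed by the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical replenishment queue element: (amount, time). */
struct repl {
    long time;    /* earliest time the budget may be restored */
    long amount;  /* budget to restore */
};

/* Rule (1): budget used from time t is replenished no earlier than
 * t + period. This sketch picks the earliest permitted time. */
struct repl repl_post(long t, long period, long used)
{
    struct repl r = { t + period, used };
    return r;
}

/* Rule (3): merge r2 into r1 when they overlap. Assumes r1 is not
 * later than r2. Returns true if the merge happened. */
bool repl_try_merge(struct repl *r1, const struct repl *r2)
{
    assert(r1->time <= r2->time);
    if (r1->time + r1->amount >= r2->time) {
        r1->amount += r2->amount;  /* keep r1's (earlier) time */
        return true;
    }
    return false;
}
```

Merging keeps the queue short; the sketch leaves non-overlapping elements untouched so that budget is never restored earlier than rule (1) allows.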
18/32
[Figure: Example schedule on a time axis from 0 to 110 for VCPU 0 (C=10, T=40, Start=1), VCPU 1 (C=20, T=50, Start=0) and IOVCPU (Utilization=4%). Replenishment queue elements are shown as (amount, time). Panels (A) and (B) contrast premature replenishment with the corrected algorithm.]
Interval [t=0,100] (A) VCPU 1 = 40%, (B) VCPU 1 = 46%
$\sum_{i=0}^{n-1} \frac{C_i}{T_i} + \sum_{j=0}^{m-1} (2 - U_j) \cdot U_j \le n \cdot (\sqrt[n]{2} - 1)$
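The utilization bound above can be checked mechanically. The sketch below assumes, based on the notation used elsewhere in the slides, that C_i and T_i are the budgets and periods of n Main VCPUs and that U_j are the utilizations of m I/O VCPUs. The function names are hypothetical, and the n-th root of 2 is found by bisection only to keep the example free of library dependencies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* n-th root of 2 by bisection on [1, 2]; 60 halvings are more than
 * enough for double precision. */
double nth_root_of_two(size_t n)
{
    assert(n >= 1);
    double lo = 1.0, hi = 2.0;
    for (int it = 0; it < 60; it++) {
        double mid = 0.5 * (lo + hi);
        double p = 1.0;
        for (size_t k = 0; k < n; k++)
            p *= mid;               /* p = mid^n */
        if (p < 2.0)
            lo = mid;
        else
            hi = mid;
    }
    return 0.5 * (lo + hi);
}

/* Hypothetical check of:
 *   sum C_i/T_i + sum (2 - U_j) * U_j <= n * (2^(1/n) - 1)
 * C, T: budgets and periods of n Main VCPUs.
 * U:    utilizations of m I/O VCPUs (may be NULL when m == 0). */
bool vcpus_schedulable(const double *C, const double *T, size_t n,
                       const double *U, size_t m)
{
    assert(n >= 1);
    double lhs = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(T[i] > 0.0);
        lhs += C[i] / T[i];
    }
    for (size_t j = 0; j < m; j++)
        lhs += (2.0 - U[j]) * U[j];
    return lhs <= (double)n * (nth_root_of_two(n) - 1.0);
}
```

With two Main VCPUs (C=10, T=40) and (C=20, T=50) plus one I/O VCPU at 4% utilization, the left-hand side is 0.25 + 0.40 + 0.0784 = 0.7284, which is below the n = 2 bound of about 0.828, so the check passes.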
[Figure: Quest-V architecture with an I/O Device (e.g., NIC). Sandboxes 1, 2, ..., M each run on their own CPU with a Monitor, a Kernel, Apps, Main VCPUs and IO VCPUs; sandboxes are connected by Shared Mem / Msg Channels, use Shared Drivers, and support Migration.]
[Figure: Netperf UDP Throughput Test. UDP Throughput (Mbps, axis from 100 to 1000) for Linux, Xen (PVM), Xen (HVM) and Quest with 1xNetperf, 2xNetperf and 4xNetperf.]
[Figure: Fault recovery. Left: sandboxes, each with Main VCPU, IO VCPU, Kernel, Monitor, NIC Driver and Msg Channel, sharing a NIC, with Send Msg / Receive Msg between sandboxes and steps (1) to (4). Right: control flow between SB Kernel (Guest) and Monitor (Host) with steps (1) to (4): Component Failure Detection, VM-Exit, Fault Identification And Handling, Remote Event Notification via IPI, Component Recovery in Remote Sandbox, Component Recovery in Local Sandbox, VM-Entry.]
Realtek NIC driver fault
under normal operation
– Single-threaded server
– Focus on one process
– Recovery time rather than throughput
Recovery Phases             CPU Cycles
                            Local Recovery    Remote Recovery
VM-Exit                     885
Driver Switch               10503             N/A
IPI Round Trip              N/A               4542
VM-Enter                    663
Driver Re-initialization    1.45E+07
Network Re-initialization   78351
– High rate VCPUs: 50/100ms
– Low rate VCPUs: 40/100ms
– SB1 sends msgs to SB0, SB2 & SB3 at 50ms intervals
respectively
– SB0 handles ICMP requests
– Observe failure + recovery in SB0
– Messaging threads on Main VCPUs: 20ms/100ms
– NIC driver I/O VCPU: 1ms/10ms
$U_{dest} + \frac{C_{src}}{T_{src}} \le (n+1) \cdot (\sqrt[n+1]{2} - 1), \quad |V_{dest}| = n \text{ at } t' < t$
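The migration criterion can be sketched as an admission check. This is an illustration under stated assumptions, not the Quest-V implementation: `can_migrate` is a hypothetical name, U_dest is taken to be the total utilization of the n VCPUs already at the destination, and C_src / T_src is the utilization of the VCPU being moved. Instead of computing the (n+1)-th root, the sketch uses the algebraically equivalent test (1 + U/(n+1))^(n+1) <= 2 for U >= 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical admission check for moving one VCPU (budget c_src,
 * period t_src) to a destination that already hosts n VCPUs with
 * total utilization u_dest.
 *
 * Tests  u_dest + c_src/t_src <= (n+1) * (2^(1/(n+1)) - 1)
 * in the equivalent root-free form
 *        (1 + U/(n+1))^(n+1) <= 2,  U = u_dest + c_src/t_src. */
bool can_migrate(double u_dest, size_t n, double c_src, double t_src)
{
    assert(t_src > 0.0 && c_src >= 0.0 && u_dest >= 0.0);
    double total = u_dest + c_src / t_src;
    double base = 1.0 + total / (double)(n + 1);
    double p = 1.0;
    for (size_t k = 0; k < n + 1; k++)
        p *= base;                  /* p = base^(n+1) */
    return p <= 2.0;
}
```

For a destination with one VCPU at 40% utilization, a VCPU with C=10 and T=40 is admitted (0.65 is below the two-VCPU bound of about 0.828), while the same VCPU is rejected if the destination is already at 70%.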
[Figure: VCPU migration. Left: control flow between SB Kernel (Guest) and Monitor (Host) with steps (1) to (5): Migration thread event received, Make migration decision (Find destination), VM-Exit, Push quest_tss address(es) to destination, Copy quest_tss structure(s), Move addr space and VCPU from source, VM-Entry, Resume local scheduling (in both sandboxes). Right: sandboxes with Main VCPU, IO VCPU, Kernel, Monitor, Scheduler and Migration Thread, annotated with steps (1) to (5).]
int VCPU_create(struct vcpu_param *param)

struct vcpu_param {
    int vcpuid;
    int policy;  // SCHED_SPORADIC, SCHED_PIBS
    int mask;    // affinity mask
    int C;       // budget
    int T;       // period
}
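A possible use of this interface is sketched below. Only the `VCPU_create` name and the `struct vcpu_param` fields come from the slide. The constant values, the `int` type of `policy`, the return convention (0 on success, -1 on error), the time unit, and the stand-in body of `VCPU_create` are all assumptions made so that the example compiles outside Quest-V.

```c
#include <assert.h>

/* Placeholder values; the real constants are defined by Quest-V. */
enum { SCHED_SPORADIC = 0, SCHED_PIBS = 1 };

struct vcpu_param {
    int vcpuid;
    int policy;  /* SCHED_SPORADIC or SCHED_PIBS */
    int mask;    /* affinity mask */
    int C;       /* budget */
    int T;       /* period */
};

/* Stand-in for the real call: it only validates the parameters.
 * The return convention is an assumption. */
int VCPU_create(struct vcpu_param *param)
{
    if (!param || param->T <= 0 || param->C <= 0 || param->C > param->T)
        return -1;
    return 0;
}

/* Request a sporadic-server VCPU with a budget of 20 in every period
 * of 100 (assumed to be milliseconds), allowed on the first CPU. */
int create_messaging_vcpu(void)
{
    struct vcpu_param p = {
        .vcpuid = 0,
        .policy = SCHED_SPORADIC,
        .mask   = 0x1,
        .C      = 20,
        .T      = 100,
    };
    assert(p.C <= p.T);  /* a budget cannot exceed its period */
    return VCPU_create(&p);
}
```

Passing budget and period explicitly makes the requested CPU share (here 20%) visible at the call site.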
*param);
– Which sandboxes assigned which VCPUs?
[Figure: Fault recovery thread scheduling between Sandbox Kernel and Monitor. Elements shown: LAPIC Timer Interrupt, Monitor LAPIC Timer Handler, Schedule / De-schedule, Fault Recovery Thread, Entry Code, Restore Machine State for Recovery Code, Start / Continue Recovery Procedure, Save Machine State for Recovery Code, Exit Code.]
l - E/C*M
VCPU     VC    VT    threads
VCPU0    2     5     CPU-bound
VCPU1    2     8     Reading CD, CPU-bound
VCPU2    1     4     CPU-bound
VCPU3    1     10    Logging, CPU-bound
IOVCPU   10%         ATA
VCPU     VC    VT    threads
VCPU0    1     20    CPU-bound
VCPU1    1     30    CPU-bound
VCPU2    10    100   Network, CPU-bound
VCPU3    20    100   Logging, CPU-bound
IOVCPU   1%          Network
At t=50, an ICMP ping flood starts. Here, we see a comparison of the overheads of the two scheduling policies
Network bandwidth of two scheduling policies
VCPU       VC    VT    threads
VCPU0      30    100   USB, CPU-bound
VCPU1      10    110   CPU-bound
VCPU2      10    90    Network, CPU-bound
VCPU3      100   200   Logging, CPU-bound
IO VCPU    1%          USB, Network

VCPU       VC    VT    threads
VCPU0      30    100   USB, CPU-bound
VCPU1      10    110   CPU-bound
VCPU2      10    90    Network, CPU-bound
VCPU3      100   200   Logging, CPU-bound
IO VCPU1   1%          USB
IO VCPU2   1%          Network