A Design for Comprehensive Kernel Instrumentation
Peter Feiner Angela Demke Brown Ashvin Goel
peter@cs.toronto.edu demke@cs.toronto.edu ashvin@eecg.toronto.edu
University of Toronto
Transparent fault isolation for device drivers
Inspired by Byte Granularity Isolation
Use Dynamic Binary Instrumentation (DBI)
[Figure: x86 Driver Code, DBI, Instrumented Driver, Kernel]
DBI applied for debugging and security at the user level
Various user-level DBI frameworks are available
Valgrind, DynamoRIO, Pin
These frameworks don’t work in the kernel
User frameworks sit between applications and the OS
Kernel frameworks need to sit between the OS and the CPU
[Figure: layers labeled Apps, DBI, OS, and CPU]
We need to combine a DBI framework with a hypervisor
similar to SecVisor’s approach
We designed a minimal hypervisor around a DBI framework
[Figure: layers labeled Apps, OS, Kernel DBI, and CPU]
Copy basic blocks of x86 code into code cache before execution
control to the dispatcher
[Figure: flowchart with nodes Start, Dispatch, Cached? (Yes / No), Copy Block (from x86 Code), and Execute from Code Cache]
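The dispatch loop in the flowchart can be sketched in a few lines. This is a toy model under stated assumptions, not the framework's code: blocks are modeled as Python callables that return the address of the next block (or None to stop), and the names `dispatch`, `make_instrumented_copy`, `loop_body`, and `done` are hypothetical. The "instrumentation" here is just an execution counter.

```python
from typing import Callable, Dict, Optional, Tuple

State = Dict[str, int]
# Assumption: a basic block is a callable that runs and returns the
# original address of the next block, or None when execution ends.
Block = Callable[[State], Optional[int]]


def make_instrumented_copy(addr: int, block: Block, counts: Dict[int, int]) -> Block:
    """Build the code-cache copy of a block: same behavior plus a counter."""
    def cached_copy(state: State) -> Optional[int]:
        counts[addr] = counts.get(addr, 0) + 1  # the added instrumentation
        return block(state)                     # block exit hands back the next address
    return cached_copy


def dispatch(original: Dict[int, Block], start: int, state: State) -> Tuple[Dict[int, int], int]:
    """Run from `start`, executing only code-cache copies."""
    code_cache: Dict[int, Block] = {}
    counts: Dict[int, int] = {}
    copies = 0
    addr: Optional[int] = start
    while addr is not None:                     # Dispatch
        cached = code_cache.get(addr)           # Cached?
        if cached is None:                      # No: Copy Block
            cached = make_instrumented_copy(addr, original[addr], counts)
            code_cache[addr] = cached
            copies += 1
        addr = cached(state)                    # Execute from Code Cache
    return counts, copies


def loop_body(state: State) -> Optional[int]:
    state["n"] -= 1
    return 0x10 if state["n"] > 0 else 0x20     # loop back or fall through


def done(state: State) -> Optional[int]:
    return None


counts, copies = dispatch({0x10: loop_body, 0x20: done}, 0x10, {"n": 3})
```

In this run the loop block executes three times but is copied only once, which is the point of the cache: the copy cost is paid on first execution and later executions reuse the cached copy.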
Never execute machine’s original code
Hide framework from instrumented code
Dispatcher should use instrumented code with care
Detect changes to the original code
Preserve multicore concurrency
We’ll look at the first three in more detail
Requirement                  | User                                | Kernel
Never Execute Original Code  | New Threads, Signals                | Kernel Entry Points
Transparency                 | Signals                             | Interrupts, Exceptions
Reentrance                   | Use OS Code                         | Implement Everything From Scratch
Detect Code Changes          | System Calls (mmap, mprotect, etc.) | Shadow Page Tables
Concurrency                  | Locking, Thread Private             | CPU Private
[Figure: User Code, Exceptions, and Interrupts cross from User Mode into Supervisor Mode; entry points to the OS Binaries (kernel, drivers) lead to the Dispatcher and Code Cache]
[Figure: Table Register and Descriptor Table (Entry 1, Entry 2) leading to the OS Binaries; Shadow Table (Entry 1, Entry 2) leading to the Dispatcher and Code Cache; Shadow Register]
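One way to read the descriptor-table figure is sketched below. The slides only give the labels, so the behavior here is an assumption: the CPU is pointed at a shadow table whose entries all lead to the dispatcher, while the value the OS loaded is remembered in a shadow register and returned when the OS reads the register back. The class and method names are hypothetical.

```python
from typing import Dict, Optional, Tuple


class ShadowedDescriptorTable:
    """Toy model of a descriptor table with a shadow table and shadow register."""

    def __init__(self, dispatcher_entry: int) -> None:
        self.dispatcher_entry = dispatcher_entry      # hypothetical framework address
        self.shadow_register: Optional[int] = None    # what the OS believes is loaded
        self.os_table: Dict[int, int] = {}            # OS view: vector -> handler address
        self.shadow_table: Dict[int, int] = {}        # table the CPU actually consults

    def load_table_register(self, table_addr: int, table: Dict[int, int]) -> None:
        """The OS loads its table; the load is intercepted and shadowed."""
        self.shadow_register = table_addr
        self.os_table = dict(table)
        # Assumption: every shadow entry points at the dispatcher.
        self.shadow_table = {vector: self.dispatcher_entry for vector in table}

    def store_table_register(self) -> Optional[int]:
        """The OS reads the register back and sees its own value."""
        return self.shadow_register

    def deliver(self, vector: int) -> Tuple[int, int]:
        """Return (where control goes, original handler the dispatcher should run)."""
        return self.shadow_table[vector], self.os_table[vector]


table = ShadowedDescriptorTable(dispatcher_entry=0xF000)
table.load_table_register(0x1000, {14: 0x8000, 32: 0x8100})
readback = table.store_table_register()
target, original_handler = table.deliver(14)
```

Under these assumptions the original handler at 0x8000 is never entered directly: control arrives at the dispatcher, which can then run the handler from the code cache, and the OS still reads back 0x1000 as its table address.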
Need to hide DBI framework from instrumented code
Many transparency issues, including
Dispatching kernel’s exception handlers is tricky because they inspect machine state
Solution for interrupt handlers is similar
Delay interrupts until next code-cache exit
[Figure: animated timeline across Original Code, Dispatcher, and Code Cache showing blocks A and B, the interrupt handler (IH), instrumentation (I), the dispatcher steps Copy A, Copy IH, and Copy B, an interrupt marker, and Original Addresses. Legend: H = Interrupt Handler, I = Instrumentation.]
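A minimal sketch of delayed delivery follows, assuming a simple queue of pending interrupts that is drained when a block exits the code cache. The class `DelayingDispatcher` and its methods are hypothetical, and passing the next original address to the handler is an assumption drawn from the "Original Addresses" label in the figure.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class DelayingDispatcher:
    """Toy dispatcher that holds interrupts until the next code-cache exit."""

    def __init__(self) -> None:
        self.pending: Deque[int] = deque()
        self.events: List[Tuple] = []

    def interrupt(self, vector: int) -> None:
        # Never handled on the spot: only recorded as pending.
        self.pending.append(vector)

    def run_block(self, name: str, next_original_addr: int,
                  body: Callable[["DelayingDispatcher"], None]) -> None:
        self.events.append(("run", name))
        body(self)  # block runs from the code cache and may be interrupted
        # Code-cache exit: deliver whatever arrived while the block ran.
        while self.pending:
            vector = self.pending.popleft()
            # Assumption: the handler is shown the original address of the
            # next block, not an address inside the code cache.
            self.events.append(("handler", vector, next_original_addr))


d = DelayingDispatcher()
d.run_block("A", 0x2000, lambda disp: disp.interrupt(32))  # interrupt arrives during A
d.run_block("B", 0x3000, lambda disp: None)
events = d.events
```

The handler for vector 32 runs after A finishes and before B starts, so the interrupted state it observes corresponds to a block boundary in the original code rather than to the middle of an instrumented copy.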
Code is not reentrant if it is unsafe to execute again before an earlier execution has finished
Calling such code from the dispatcher is unsafe because the non-reentrant code might be currently executing
Say, print consists of basic blocks P1, P2
[Figure: Dispatcher (Copy P2) and Code Cache (P1)]
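The print example can be made concrete with a toy model. It assumes print is split into P1 (start, takes a non-reentrant guard) and P2 (finish, releases it), and that the dispatcher wants to log a message while copying P2. All names here are hypothetical.

```python
class NonReentrantPrint:
    """Toy print made of two basic blocks, P1 and P2, sharing a guard flag."""

    def __init__(self) -> None:
        self.busy = False
        self.buffer = ""
        self.output = []

    def p1(self, msg: str) -> None:
        if self.busy:
            raise RuntimeError("print re-entered before P2 finished")
        self.busy = True
        self.buffer = msg

    def p2(self) -> None:
        self.output.append(self.buffer)
        self.busy = False


def kernel_print_with_dispatcher(dispatcher_logs_via_print: bool) -> str:
    printer = NonReentrantPrint()
    printer.p1("hello")  # P1 executes from the code cache
    try:
        # The dispatcher now has to copy P2. If it logs through the same
        # print routine, it re-enters print while P1's state is still live.
        if dispatcher_logs_via_print:
            printer.p1("dispatcher: copying P2")
            printer.p2()
    except RuntimeError:
        return "reentered"
    printer.p2()  # P2 executes from the code cache
    return "ok"


with_print = kernel_print_with_dispatcher(True)
without_print = kernel_print_with_dispatcher(False)
```

In this model the guard turns the problem into a visible error. With a real lock the same situation would typically show up as a deadlock or corrupted state, which is why the dispatcher has to be careful about which instrumented code it calls.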
Typical solution is to reimplement non-reentrant code using lower-level uninstrumented code
OS-level framework has no lower-level code
Some code too difficult to implement from scratch
We chose to port DynamoRIO to a minimal hypervisor because it is
Applications
We will open source our port!
VMware
PinOS
VM
Neither is open source
Simpler than a full-fledged hypervisor
restrictive permissions
for CPU-private data
Once booted, OS runs exclusively in 64-bit long mode
Can store dispatcher and code cache in pages that are in all page tables at the same virtual addresses
Design should work with any OS that meets these assumptions
Do not make implementation simpler
Could make implementation more complex
Could improve performance