slide-1
SLIDE 1

Debugging a Memory Manager

Karl Cronburg karl@cs.tufts.edu

Tufts University

slide-2
SLIDE 2

The Problem

How do we guarantee the correctness of a memory management system? Difficulties include:

◮ Complex garbage collection (GC) algorithms
◮ Static analysis is computationally infeasible
◮ Loss of type information
◮ Implicit memory layouts (described only in code comments)
◮ Pointer safety

xkcd.com/138

slide-3
SLIDE 3

Motivation

◮ Growing popularity of memory-safe systems
◮ Someone has to implement and debug these systems
◮ Ensuring that the memory manager
  ◮ respects application-system boundaries
  ◮ handles its own memory appropriately
◮ It matters which code is touching which parts of memory, and when

slide-11
SLIDE 11

Background: Existing Debugging Techniques

◮ Printf / log-based
◮ Sanity checking / assertions

Why are these techniques unsatisfactory?

◮ Time consuming & tedious - custom analysis of log files
◮ Program slow-down
◮ Disk space - log files
◮ Ad-hoc - correctness of debugging assertions
◮ Incomplete - weak guarantee of program correctness
◮ Lack of isolation

slide-17
SLIDE 17

Background: Existing Debugging Tools

◮ General purpose:
  ◮ Valgrind / Memcheck - dynamic binary instrumentation
  ◮ GDB - breakpoints, instruction stepping, memory inspection
◮ Memory manager specific tools:
  ◮ RDB¹ - GDB-like JVM debugger
  ◮ Elephant Tracks² - log-based JVM inspection tool
◮ Other system & language specific tools:
  ◮ Printf Debugger (for C)
  ◮ Various IDE plugins (e.g. for Eclipse)

So why are these tools not always sufficient?

◮ Source vs binary level information
◮ Inspection vs bug detection
◮ Language compatibility

¹ Makarov & Hauswirth 2013
² Ricci, Guyer, Moss 2013

slide-18
SLIDE 18

Our Focus: Distinguishing Data and Meta Data

◮ Want to codify memory layout - which addresses correspond to:
  ◮ meta data - object header bits, free list, etc.
  ◮ data - allocated objects
◮ Which methods can operate on a specific piece of memory
◮ Memcheck¹ is close to what we want, however:
  ◮ Distinguishes allocated & unallocated data
  ◮ Doesn’t distinguish data and meta data

[Figure: normal reads and writes]

¹ Detection tool for memory-related bugs (Seward & Nethercote 2005)

slide-19
SLIDE 19

Our Focus: Distinguishing Data and Meta Data

[Figure: code with bug(s) distinguishing data / meta data]

slide-20
SLIDE 20

Our Focus: Distinguishing Data and Meta Data

[Figure: subtleties - e.g. some application code can access meta data]

slide-21
SLIDE 21

Our Focus: Distinguishing Data and Meta Data

[Figure: solution - mediate special cases with read / write barriers]

slide-22
SLIDE 22

Memory Management Bugs

◮ Causes of some memory-related bugs:
  ◮ Explicit memory handling (pointers)
  ◮ Implicit memory layout (e.g. implicit headers)
  ◮ GC algorithm complexity
◮ GC correctness bug symptoms include ...

[Figure: example heap]

slide-23
SLIDE 23

Memory Management Bugs

◮ Use after free - object incorrectly freed

[Figure: heap with possible use-after-free]

slide-24
SLIDE 24

Memory Management Bugs

◮ Memory leak - object incorrectly retained

[Figure: heap with memory leak]

slide-25
SLIDE 25

Memory Management Bugs

◮ Memory corruption - overwriting memory

[Figure: corrupted heap]

slide-26
SLIDE 26

Memory Management Bugs

◮ Altered control flow - incorrect code executing

[Figure: altered control flow - heap implications]

slide-27
SLIDE 27

Our Approach: Permchecker - A Memory Permissions API

◮ Shadow each byte of memory with r/w permissions
◮ Dynamic verification of every load/store/modify machine instruction
◮ Small handful of API calls inserted into your memory management system
◮ Entire system gets verified!
◮ Shadow data in one map, meta data in another

    init_map     (int ID)
    destroy_map  (int ID)
    set_byte     (int ID, void* addr, U8 value)
    get_byte     (int ID, void* addr)
    get_bit      (int ID, void* addr, int offset)
    unmark_bit   (int ID, void* addr, int offset)
    mark_bit     (int ID, void* addr, int offset)
    turn_off_map (int ID)
    turn_on_map  (int ID)
    is_map_on    (int ID)
    new_function (char* descr, void* addr, int size)
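A rough illustration of the per-byte shadowing idea, written in Java rather than against the real Permchecker API. The class name ShadowBits, the sparse HashMap storage, and the assignment of bit 0 to "readable" and bit 1 to "writable" are all assumptions made for this sketch; the slides do not say how Permchecker lays out its permission bits.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one shadow byte per address, holding permission bits. */
final class ShadowBits {
    static final int READ_BIT = 0;   // assumed meaning of bit 0
    static final int WRITE_BIT = 1;  // assumed meaning of bit 1

    // Sparse map from address to shadow byte; unmapped addresses carry no permissions.
    private final Map<Long, Byte> shadow = new HashMap<>();

    void setByte(long addr, int value) { shadow.put(addr, (byte) value); }

    int getByte(long addr) { return shadow.getOrDefault(addr, (byte) 0) & 0xFF; }

    void markBit(long addr, int offset) { setByte(addr, getByte(addr) | (1 << offset)); }

    void unmarkBit(long addr, int offset) { setByte(addr, getByte(addr) & ~(1 << offset)); }

    boolean getBit(long addr, int offset) { return ((getByte(addr) >> offset) & 1) == 1; }

    public static void main(String[] args) {
        ShadowBits dataMap = new ShadowBits();   // shadows application data
        ShadowBits metaMap = new ShadowBits();   // shadows memory-manager meta data
        long header = 0x1000L;
        long payload = 0x1004L;
        metaMap.markBit(header, READ_BIT);
        dataMap.markBit(payload, READ_BIT);
        dataMap.markBit(payload, WRITE_BIT);
        System.out.println("payload writable as data: " + dataMap.getBit(payload, WRITE_BIT));
        System.out.println("header writable as data: " + dataMap.getBit(header, WRITE_BIT));
    }
}
```

Keeping two independent instances mirrors the "data in one map, meta data in another" bullet: a checker can then ask each map separately whether an access is permitted.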

slide-28
SLIDE 28

Shadow Memory Implementation

◮ Unoptimized re-implementation of Memcheck
◮ Can map entire 32-bit address space
◮ O(1) shadow map lookup and update

xkcd.com/1369
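One common way to cover a full 32-bit address space with O(1) lookup and update is a two-level table, similar in spirit to the primary/secondary maps used by Memcheck-style tools. Whether Permchecker uses exactly this layout is an assumption; the class name TwoLevelShadow and the 16/16 bit split are chosen only for illustration.

```java
/** Hypothetical two-level shadow table covering a 32-bit address space. */
final class TwoLevelShadow {
    private static final int CHUNK_BITS = 16;
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;   // 64 KiB per secondary table

    // Primary table indexed by the high 16 address bits; secondaries are allocated on first write.
    private final byte[][] primary = new byte[1 << (32 - CHUNK_BITS)][];

    /** Returns the shadow byte for a 32-bit address (passed as an int bit pattern). */
    byte get(int addr) {
        byte[] secondary = primary[addr >>> CHUNK_BITS];
        return secondary == null ? 0 : secondary[addr & (CHUNK_SIZE - 1)];
    }

    /** Sets the shadow byte for a 32-bit address using two array indexings. */
    void set(int addr, byte value) {
        int hi = addr >>> CHUNK_BITS;
        if (primary[hi] == null) {
            primary[hi] = new byte[CHUNK_SIZE];
        }
        primary[hi][addr & (CHUNK_SIZE - 1)] = value;
    }

    public static void main(String[] args) {
        TwoLevelShadow shadow = new TwoLevelShadow();
        shadow.set(0x5154981c, (byte) 0b11);
        System.out.println(shadow.get(0x5154981c));
        System.out.println(shadow.get(0xFFFFFFFF));
    }
}
```

Lazy allocation of the secondary tables keeps the memory cost proportional to the regions actually shadowed, while every lookup still costs a fixed number of steps.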

slide-29
SLIDE 29

Target JVM - The Jikes RVM

◮ Non-trivial use case:
  ◮ Self-bootstrapped JVM
  ◮ Multiple stacks, differing layouts
  ◮ Garbage collection
  ◮ JIT compiler
◮ Active MM community
◮ Modularity

slide-30
SLIDE 30

Example #1: Use After Free Bug

◮ Bug where the garbage collector incorrectly frees an object
◮ Can encode permissions in Permchecker by:
  ◮ Marking allocated objects as readable
  ◮ Marking freed objects unreadable
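The readable / unreadable encoding can be sketched on a toy heap. This is not Jikes RVM or Permchecker code: the class UseAfterFreeDemo, its bump allocator, and the exception-based report are hypothetical stand-ins for a collector and an instrumented load.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical toy heap whose shadow bits record which bytes are readable. */
final class UseAfterFreeDemo {
    private final byte[] heap = new byte[64];
    private final BitSet readable = new BitSet(64);   // one permission bit per heap byte
    private int bump = 0;                             // next free offset (bump allocation)

    /** Allocates size bytes and marks them readable. */
    int allocate(int size) {
        int addr = bump;
        bump += size;
        readable.set(addr, addr + size);
        return addr;
    }

    /** Frees an object: zeroes its memory and marks it unreadable. */
    void free(int addr, int size) {
        Arrays.fill(heap, addr, addr + size, (byte) 0);
        readable.clear(addr, addr + size);
    }

    /** Checked load: rejects reads of bytes that are not marked readable. */
    byte read(int addr) {
        if (!readable.get(addr)) {
            throw new IllegalStateException("Invalid read on offset " + addr);
        }
        return heap[addr];
    }

    public static void main(String[] args) {
        UseAfterFreeDemo vm = new UseAfterFreeDemo();
        int obj = vm.allocate(4);
        vm.read(obj);            // permitted: the object is live
        vm.free(obj, 4);         // simulates the collector freeing it by mistake
        try {
            vm.read(obj);        // the application still holds the reference
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Without the shadow bit, the second read would silently return zeroed memory; with it, the stale access is reported at the point of use.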

slide-35
SLIDE 35

Example #1: Use After Free Bug

◮ NoTraceObject instances are ‘accidentally’ freed by the GC
◮ Sequence of events:
  ◮ Mark-sweep fails to trace these objects
  ◮ Object gets freed (memory zeroed out)
  ◮ Application still has reference - use after free occurs

    public class NoTrace {
      public static void main(String[] args) {
        NoTraceObject ntr = new NoTraceObject(1);
        ntr.next = new NoTraceObject(2);
        System.gc();                       // buggy collector frees ntr.next here
        System.out.println(ntr.next.x);    // read of freed memory
      }
    }

    class NoTraceObject {
      int x;
      NoTraceObject next;
      NoTraceObject(int x) { this.x = x; }
    }

Permchecker report:

    Invalid read on 0x5154981c (size = 4, ID = 0):
      at LNoTrace;>.main
      at Lorg/jikesrvm/ia32/OutOfLineMachineCode;>.<init>()V
      at Lorg/jikesrvm/runtime/Reflection;>.outOfLineInvoke(Lorg/jikesrvm/classloader/RVMMethod;Ljava/lang/Object;[Ljava/lang/Object;Z)Ljava/lang/Object;
      at Lorg/jikesrvm/runtime/Reflection;>.invoke(Lorg/jikesrvm/classloader/RVMMethod;Lorg/jikesrvm/runtime/ReflectionBase;Ljava/lang/Object;[Ljava/lang/Object;Z)Ljava/lang/Object;
      at Lorg/jikesrvm/scheduler/MainThread;>.run()V
      at Lorg/jikesrvm/scheduler/RVMThread;>.run()V
      at Lorg/jikesrvm/scheduler/RVMThread;>.startoff()V

slide-44
SLIDE 44

Example #2: Meta Data Bug

◮ Internal JVM (meta) data structure gets corrupted
◮ ‘Order of operations’ violation
◮ Some element in array gets initialized
◮ Element gets overwritten - bad!
◮ Solution: disallow write on initialized elements
◮ VM boots, grey = readable but not writeable
◮ New spaces allocated - more entries added to map
◮ map[m] uninitialized - enable write permissions
◮ Not allowed to overwrite initialized element - corruption!

    Invalid write on 0x4101e474 (size = 4)
      at Lorg/mmtk/utility/heap/Map;>.insert(Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Extent;ILorg/mmtk/policy/Space;)V
      at Lorg/mmtk/utility/heap/Map;>.allocateContiguousChunks(ILorg/mmtk/policy/Space;ILorg/vmmagic/unboxed/Address;)Lorg/vmmagic/unboxed/Address
      ...
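The "disallow write on initialized elements" policy amounts to write-once permissions. The sketch below is a hypothetical stand-in (class WriteOnceTable, methods enableWrite and store), not MMTk's Map class: write permission is granted while an entry is uninitialized and revoked by the first store.

```java
/** Hypothetical write-once table: a slot is writable only until it is initialized. */
final class WriteOnceTable {
    private final int[] slots;
    private final boolean[] writable;   // shadow write permission, one flag per slot

    WriteOnceTable(int size) {
        slots = new int[size];
        writable = new boolean[size];
    }

    /** Called when a fresh, uninitialized entry is handed out: enable writes. */
    void enableWrite(int index) {
        writable[index] = true;
    }

    /** Checked store: the first write succeeds and then revokes write permission. */
    void store(int index, int value) {
        if (!writable[index]) {
            throw new IllegalStateException("Invalid write on slot " + index);
        }
        slots[index] = value;
        writable[index] = false;   // initialized entries become read-only
    }

    int load(int index) {
        return slots[index];
    }

    public static void main(String[] args) {
        WriteOnceTable map = new WriteOnceTable(8);
        map.enableWrite(3);
        map.store(3, 42);              // first write initializes the entry
        try {
            map.store(3, 7);           // second write would corrupt it
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The rejected second store leaves the original value in place, so the ordering violation is caught before the entry is corrupted.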

slide-45
SLIDE 45

Permchecker Performance

◮ Main overhead - Valgrind instrumentation
◮ Slowdown:
  ◮ ≈ 90x in Memcheck (measured¹)
  ◮ ≈ 50x in Permchecker (measured¹)
  ◮ ≈ 1.1x with Mondrian memory (reported²)

¹ This work
² Witchel & Asanovic 2002 - Mondrian Memory Protection

slide-49
SLIDE 49

Research Opportunity: Declarative Permissions

◮ Java annotations and macros for Java-based managers
◮ Codifying the Data - Meta Data distinction
◮ Studying invariants in structure of memory managed systems:
  ◮ Memory layout - how do we safely operate on implicit memory layouts?
  ◮ Code structure - which methods operate on which regions of memory?

    class MyAllocator {
      // Allocation
      Address allocate(int size) {
        Address result;
        /* ... Allocation code ... */
        // Mark object portion of allocation as data:
        @ShadowData(result + 4, size)
        // Mark header portion of allocation as metadata:
        @ShadowMetadata(result, 4)
        return (result + 4);
      }

      void free(Address a, int size) {
        @ShadowFree(a - 4, size + 4)
        /* ... Freeing code ... */
      }

      // Header access
      @Inline @readMeta
      Header readHeader(Address a) { return a.load(); }
    }
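The annotation syntax above is a proposal rather than valid Java. As a rough guess at what such annotations might expand to, the following runnable sketch tags each byte as DATA or META at allocation time and lets an access check name the tag it is allowed to touch. The class TaggedAllocator, the Tag enum, and mayAccess are hypothetical names; only the 4-byte header and the address arithmetic come from the slide.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical allocator that tags each byte as DATA or META in a shadow map. */
final class TaggedAllocator {
    enum Tag { DATA, META }

    static final int HEADER_SIZE = 4;   // mirrors the 4-byte header in the slide

    private final Map<Integer, Tag> shadow = new HashMap<>();
    private int next = 0;

    /** Returns the payload address; the header sits just below it. */
    int allocate(int size) {
        int result = next;
        next += HEADER_SIZE + size;
        for (int i = 0; i < HEADER_SIZE; i++) {
            shadow.put(result + i, Tag.META);                 // header bytes
        }
        for (int i = 0; i < size; i++) {
            shadow.put(result + HEADER_SIZE + i, Tag.DATA);   // object bytes
        }
        return result + HEADER_SIZE;
    }

    /** Removes the shadow tags for both the header and the payload. */
    void free(int payload, int size) {
        for (int a = payload - HEADER_SIZE; a < payload + size; a++) {
            shadow.remove(a);
        }
    }

    /** An access is valid only if the byte carries the tag the accessor may touch. */
    boolean mayAccess(int addr, Tag allowed) {
        return shadow.get(addr) == allowed;
    }

    public static void main(String[] args) {
        TaggedAllocator alloc = new TaggedAllocator();
        int obj = alloc.allocate(8);
        System.out.println("application reads payload: " + alloc.mayAccess(obj, Tag.DATA));
        System.out.println("application reads header: " + alloc.mayAccess(obj - 4, Tag.DATA));
        System.out.println("readHeader reads header: " + alloc.mayAccess(obj - 4, Tag.META));
    }
}
```

In this reading, a method marked like readHeader would be checked against META while ordinary application code is checked against DATA, which is one way to express "which methods operate on which regions of memory".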

slide-51
SLIDE 51

Conclusions

◮ We prototyped a promising new debugging tool
◮ Exciting research opportunities in systems debugging & verification
◮ Acknowledgements:
  ◮ Sam for research ideas and discussions
  ◮ Nathan and Diogenes for help with Jikes
  ◮ Mike, Raoul, and Moses for research discussions and feedback

Questions?

Hovertext: MY RESULTS ARE A SIGNIFICANT IMPROVEMENT ON THE STATE OF THE AAAAAAAAAAAART (xkcd.com/1403)