Debugging a Memory Manager
Karl Cronburg (karl@cs.tufts.edu), Tufts University
The Problem
How do we guarantee correctness of a memory management system? Difficulties include:
◮ Complex garbage collection (GC) algorithms
◮ Static analysis computationally infeasible
◮ Loss of type information
◮ Implicit memory layouts (only described in code comments)
◮ Pointer safety
xkcd.com/138
Motivation
◮ Growing popularity of memory-safe systems
◮ Someone has to implement and debug these systems
◮ Ensuring that the memory manager
  ◮ respects application-system boundaries
  ◮ handles its own memory appropriately
◮ It matters which code is touching which parts of memory, and when
Background: Existing Debugging Techniques
◮ Printf / log-based
◮ Sanity checking / assertions
Why are these techniques unsatisfactory?
◮ Time-consuming & tedious - custom analysis of log files
◮ Program slow-down
◮ Disk space - log files
◮ Ad hoc - correctness of debugging assertions
◮ Incomplete - weak guarantee of program correctness
◮ Lack of isolation
Background: Existing Debugging Tools
◮ General purpose:
  ◮ Valgrind / Memcheck - dynamic binary instrumentation
  ◮ GDB - breakpoints, instruction stepping, memory inspection
◮ Memory manager specific tools:
  ◮ RDB¹ - GDB-like JVM debugger
  ◮ Elephant Tracks² - log-based JVM inspection tool
◮ Other system & language specific tools:
  ◮ Printf Debugger (for C)
  ◮ Various IDE plugins (e.g. for Eclipse)
So why are these tools not always sufficient?
◮ Source vs binary level information
◮ Inspection vs bug detection
◮ Language compatibility
¹ Makarov & Hauswirth 2013
² Ricci, Guyer, Moss 2013
Our Focus: Distinguishing Data and Meta Data
◮ Want to codify memory layout - which addresses correspond to:
  ◮ meta data - object header bits, free list, etc.
  ◮ data - allocated objects
◮ Which methods can operate on a specific piece of memory
◮ Memcheck¹ is close to what we want, however:
  ◮ Distinguishes allocated & unallocated data
  ◮ Doesn’t distinguish data and meta data
Figures (not reproduced here), in slide order:
◮ Normal reads and writes
◮ Code with bug(s) distinguishing data / meta data
◮ Subtleties - e.g. some application code can access meta data
◮ Solution - mediate special cases with read / write barriers
¹ Detection tool for memory related bugs (Seward & Nethercote 2005)
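The data / meta data split and the barrier idea can be made concrete with a toy model. The sketch below is not Permchecker: `TaggedHeap`, its word-granularity tags, the one-word header, and `readHeaderBarrier` are all hypothetical names and choices made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy heap in which every word carries a tag (hypothetical model, not Permchecker). */
final class TaggedHeap {
    enum Tag { UNALLOCATED, DATA, META }

    private final int[] words;
    private final Tag[] tags;
    private boolean inBarrier = false;   // true only while a sanctioned barrier runs
    final List<String> violations = new ArrayList<>();

    TaggedHeap(int size) {
        words = new int[size];
        tags = new Tag[size];
        Arrays.fill(tags, Tag.UNALLOCATED);
    }

    /** Memory-manager side: one header word at base, then size data words (assumed layout). */
    int allocate(int base, int size) {
        tags[base] = Tag.META;
        Arrays.fill(tags, base + 1, base + 1 + size, Tag.DATA);
        return base + 1;                 // the application only sees the data portion
    }

    /** Application load: data is always fine, meta data only inside a barrier. */
    int appLoad(int addr) {
        Tag t = tags[addr];
        boolean allowed = t == Tag.DATA || (t == Tag.META && inBarrier);
        if (!allowed) {
            violations.add("Invalid read on word " + addr + " (" + t + ")");
        }
        return words[addr];
    }

    /** Read barrier: the single sanctioned way for application code to read a header. */
    int readHeaderBarrier(int objectAddr) {
        inBarrier = true;
        try {
            return appLoad(objectAddr - 1);
        } finally {
            inBarrier = false;
        }
    }
}
```

In this model a plain `appLoad` of a header word is recorded as a violation, while the same load routed through `readHeaderBarrier` is not. An allocated / unallocated checker alone could not tell the two apart, since both touch allocated memory.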
Memory Management Bugs
◮ Causes of some memory related bugs:
  ◮ Explicit memory handling (pointers)
  ◮ Implicit memory layout (e.g. implicit headers)
  ◮ GC algorithm complexity
◮ GC correctness bug symptoms include:
  ◮ Use after free - object incorrectly freed
  ◮ Memory leak - object incorrectly retained
  ◮ Memory corruption - overwriting memory
  ◮ Altered control flow - incorrect code executing
Figures (not reproduced here), in slide order:
◮ Example heap
◮ Heap with possible use after free
◮ Heap with memory leak
◮ Corrupted heap
◮ Altered control flow - heap implications
Our Approach: Permchecker - A Memory Permissions API
◮ Shadow each byte of memory with r/w permissions
◮ Dynamic verification of every load/store/modify machine instruction
◮ Small handful of API calls inserted into your memory management system
◮ Entire system gets verified!
◮ Shadow data in one map, meta data in another
    init_map     (int ID)
    destroy_map  (int ID)
    set_byte     (int ID, void* addr, U8 value)
    get_byte     (int ID, void* addr)
    get_bit      (int ID, void* addr, int offset)
    unmark_bit   (int ID, void* addr, int offset)
    mark_bit     (int ID, void* addr, int offset)
    turn_off_map (int ID)
    turn_on_map  (int ID)
    is_map_on    (int ID)
    new_function (char* descr, void* addr, int size)
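To give a feel for how such calls might compose, here is a small in-process model in Java. It only mirrors the call names from the listing above; the semantics (one shadow byte per address, bits addressed by offset, untouched addresses reading as zero) are assumptions inferred from those names and not the tool's documented behaviour. `new_function` is left out because the slide does not say what it does.

```java
import java.util.HashMap;
import java.util.Map;

/** In-process model of the shadow-map calls (semantics assumed from the call names). */
final class ShadowMaps {
    private static final class ShadowMap {
        final Map<Long, Byte> bytes = new HashMap<>();  // shadow byte per address, default 0
        boolean on = true;
    }

    private final Map<Integer, ShadowMap> maps = new HashMap<>();

    void initMap(int id)    { maps.put(id, new ShadowMap()); }
    void destroyMap(int id) { maps.remove(id); }

    void setByte(int id, long addr, int value) {
        map(id).bytes.put(addr, (byte) value);
    }

    int getByte(int id, long addr) {
        return map(id).bytes.getOrDefault(addr, (byte) 0) & 0xFF;
    }

    boolean getBit(int id, long addr, int offset) {
        return ((getByte(id, addr) >> offset) & 1) == 1;
    }

    void markBit(int id, long addr, int offset) {
        setByte(id, addr, getByte(id, addr) | (1 << offset));
    }

    void unmarkBit(int id, long addr, int offset) {
        setByte(id, addr, getByte(id, addr) & ~(1 << offset));
    }

    void turnOffMap(int id) { map(id).on = false; }
    void turnOnMap(int id)  { map(id).on = true; }
    boolean isMapOn(int id) { return map(id).on; }

    private ShadowMap map(int id) {
        ShadowMap m = maps.get(id);
        if (m == null) {
            throw new IllegalArgumentException("no shadow map with ID " + id);
        }
        return m;
    }
}
```

With one map for data (say ID 0) and another for meta data (ID 1), marking a bit in one map leaves the other untouched, which is what allows the two kinds of memory to carry separate permissions.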
Shadow Memory Implementation
◮ Unoptimized re-implementation of Memcheck
◮ Can map entire 32-bit address space
◮ O(1) shadow map lookup and update
xkcd.com/1369
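One common way to get O(1) lookup over a full 32-bit address space is a two-level table: the top 16 bits of the address pick a chunk, the low 16 bits index into it, and untouched chunks share a single default. The sketch below shows that general technique; whether Permchecker is laid out this way is an assumption, and `TwoLevelShadow` and its chunk size are illustrative choices.

```java
import java.util.Arrays;

/** Two-level shadow table covering a 32-bit address space (illustrative layout). */
final class TwoLevelShadow {
    private static final int CHUNK_BITS = 16;
    private static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    // Shared all-zero chunk for regions nobody has written to; never modified.
    private static final byte[] DEFAULT_CHUNK = new byte[CHUNK_SIZE];

    private final byte[][] primary = new byte[1 << (32 - CHUNK_BITS)][];

    TwoLevelShadow() {
        Arrays.fill(primary, DEFAULT_CHUNK);
    }

    /** O(1): one shift, one mask, two array loads. */
    byte get(long addr) {
        int a = checked(addr);
        return primary[a >>> CHUNK_BITS][a & (CHUNK_SIZE - 1)];
    }

    /** O(1) apart from the first write to a chunk, which allocates it. */
    void set(long addr, byte value) {
        int a = checked(addr);
        int hi = a >>> CHUNK_BITS;
        if (primary[hi] == DEFAULT_CHUNK) {
            primary[hi] = new byte[CHUNK_SIZE];   // copy-on-write of the all-zero default
        }
        primary[hi][a & (CHUNK_SIZE - 1)] = value;
    }

    /** Number of chunks that have actually been materialised. */
    int allocatedChunks() {
        int n = 0;
        for (byte[] chunk : primary) {
            if (chunk != DEFAULT_CHUNK) n++;
        }
        return n;
    }

    private static int checked(long addr) {
        if (addr < 0 || addr > 0xFFFFFFFFL) {
            throw new IllegalArgumentException("not a 32-bit address: " + addr);
        }
        return (int) addr;   // high addresses become negative ints; >>> treats them as unsigned
    }
}
```

The shared default chunk is what keeps the table affordable: shadowing all 4 GiB eagerly would need 4 GiB of shadow bytes, whereas here memory grows only with the regions that are actually written.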
Target JVM - The Jikes RVM
◮ Non-trivial use case:
  ◮ Self-bootstrapped JVM
  ◮ Multiple stacks, differing layouts
  ◮ Garbage collection
  ◮ JIT compiler
◮ Active MM community
◮ Modularity
Example #1: Use After Free Bug
◮ Bug where garbage collector incorrectly frees an object
◮ Can encode permissions in Permchecker by:
  ◮ Marking allocated objects as readable
  ◮ Marking freed objects unreadable
◮ References to NoTraceObject are ‘accidentally’ freed by GC
◮ Sequence of events:
  ◮ Mark-sweep fails to trace these objects
  ◮ Object gets freed (memory zeroed out)
  ◮ Application still has reference - use after free occurs
    public class NoTrace {
      public static void main(String[] args) {
        NoTraceObject ntr = new NoTraceObject(1);
        ntr.next = new NoTraceObject(2);
        System.gc();
        System.out.println(ntr.next.x);
      }
    }
    class NoTraceObject {
      int x;
      NoTraceObject next;
      NoTraceObject(int x) { this.x = x; }
    }
    Invalid read on 0x5154981c (size = 4, ID = 0):
      at LNoTrace;>.main
      at Lorg/jikesrvm/ia32/OutOfLineMachineCode;>.<init>()V
      at Lorg/jikesrvm/runtime/Reflection;>.outOfLineInvoke(Lorg/jikesrvm/classloader/RVMMethod;Ljava/lang/Object;[Ljava/lang/Object;Z)Ljava/lang/Object;
      at Lorg/jikesrvm/runtime/Reflection;>.invoke(Lorg/jikesrvm/classloader/RVMMethod;Lorg/jikesrvm/runtime/ReflectionBase;Ljava/lang/Object;[Ljava/lang/Object;Z)Ljava/lang/Object;
      at Lorg/jikesrvm/scheduler/MainThread;>.run()V
      at Lorg/jikesrvm/scheduler/RVMThread;>.run()V
      at Lorg/jikesrvm/scheduler/RVMThread;>.startoff()V
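The sequence of events above can be replayed in miniature. The following is a self-contained simulation, not Jikes RVM or Permchecker code: `UseAfterFreeDemo`, the `followNext` switch that models the tracing bug, and the integer "addresses" are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Toy mark-sweep heap with a readable-address shadow set (hypothetical model). */
final class UseAfterFreeDemo {
    static final class Obj {
        final int addr;
        Obj next;
        boolean marked;
        Obj(int addr) { this.addr = addr; }
    }

    final Set<Integer> readable = new HashSet<>();  // shadow state: addresses with read permission
    final List<Obj> heap = new ArrayList<>();
    final List<String> reports = new ArrayList<>();

    /** Allocation marks the object readable. */
    Obj allocate(int addr) {
        Obj o = new Obj(addr);
        heap.add(o);
        readable.add(addr);
        return o;
    }

    /** Mark phase. followNext == false models the bug: children are never traced. */
    void mark(Obj root, boolean followNext) {
        for (Obj o = root; o != null && !o.marked; o = followNext ? o.next : null) {
            o.marked = true;
        }
    }

    /** Sweep phase: unmarked objects are freed and lose read permission. */
    void sweep() {
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.marked) {
                readable.remove(o.addr);
                it.remove();
            }
            o.marked = false;   // reset for the next collection
        }
    }

    /** Checked load: reports instead of silently reading freed memory. */
    void checkedRead(Obj o) {
        if (!readable.contains(o.addr)) {
            reports.add("Invalid read on 0x" + Integer.toHexString(o.addr));
        }
    }
}
```

Running `mark(root, false)` followed by `sweep()` and a `checkedRead` of `root.next` produces one report, while the same steps with `followNext` set to `true` produce none. The application-level symptom without the check would only be a stale or zeroed value, which is much harder to trace back to the collector.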
Example #2: Meta Data Bug
◮ Internal JVM (meta) data structure gets corrupted
◮ ‘Order of operations’ violation
  ◮ Some element in array gets initialized
  ◮ Element gets overwritten - bad!
◮ Solution: disallow write on initialized elements
◮ VM boots, grey = readable but not writeable
◮ New spaces allocated - more entries added to map
◮ map[m] uninitialized - enable write permissions
◮ Not allowed to overwrite initialized element - corruption!
    Invalid write on 0x4101e474 (size = 4)
      at Lorg/mmtk/utility/heap/Map;>.insert(Lorg/vmmagic/unboxed/Address;Lorg/vmmagic/unboxed/Extent;ILorg/mmtk/policy/Space;)V
      at Lorg/mmtk/utility/heap/Map;>.allocateContiguousChunks(ILorg/mmtk/policy/Space;ILorg/vmmagic/unboxed/Address;)Lorg/vmmagic/unboxed/Address ...
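The "disallow write on initialized elements" rule amounts to a write-once permission per slot. A minimal sketch of that policy follows, assuming a plain integer table; `WriteOnceTable` is a hypothetical stand-in and does not reproduce the MMTk `Map` class named in the trace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Table whose slots lose write permission after their first store (hypothetical model). */
final class WriteOnceTable {
    private final int[] slots;
    private final boolean[] writable;   // shadow permission bit per slot
    final List<String> reports = new ArrayList<>();

    WriteOnceTable(int size) {
        slots = new int[size];
        writable = new boolean[size];
        Arrays.fill(writable, true);    // uninitialized slots start out writable
    }

    /** Checked store: the first write initializes the slot, later writes are reported. */
    void store(int index, int value) {
        if (!writable[index]) {
            reports.add("Invalid write on slot " + index);
        }
        slots[index] = value;           // like a dynamic checker, report but do not block
        writable[index] = false;        // initialized: readable but no longer writable
    }

    int load(int index) {
        return slots[index];
    }
}
```

The sketch reports the violation and lets the store proceed, on the assumption that the checker observes execution and does not alter it. Blocking the store would be a different design with different debugging trade-offs.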
Permchecker Performance
◮ Main overhead - Valgrind instrumentation
◮ Slowdown:
  ◮ ≈ 90x in Memcheck (measured¹)
  ◮ ≈ 50x in Permchecker (measured¹)
  ◮ ≈ 1.1x with Mondrian memory (reported²)
¹ This work
² Witchel & Asanovic 2002 - Mondrian Memory Protection
Research Opportunity: Declarative Permissions
◮ Java annotations and macros for Java-based managers
◮ Codifying the Data - Meta Data distinction
◮ Studying invariants in structure of memory managed systems:
  ◮ Memory layout - how do we safely operate on implicit memory layouts?
  ◮ Code structure - which methods operate on which regions of memory?
    class MyAllocator {
      // Allocation
      Address allocate(int size) {
        Address result;
        /* ... Allocation code ... */
        // Mark object portion of allocation as data:
        @ShadowData (result + 4, size)
        // Mark header portion of allocation as metadata:
        @ShadowMetadata(result, 4)
        return (result + 4);
      }
      void free(Address a, int size) {
        @ShadowFree(a - 4, size + 4)
        /* ... Freeing code ... */
      }
      // Header access
      @Inline @readMeta
      Header readHeader(Address a) { return a.load(); }
    }
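Part of this proposal can be expressed in standard Java today. Statement-level forms such as `@ShadowData(result + 4, size)` cannot, because Java annotations attach to declarations and their arguments must be compile-time constants, but a method-level marker like `@readMeta` can. The sketch below covers only that method-level case: `ReadMeta`, `HeaderAccess`, and `MetaPolicy` are hypothetical names, and the heap is modelled as an `int[]` with a one-word header.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

/** Marks a method as permitted to read meta data (hypothetical annotation). */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ReadMeta {}

/** Example accessors over a toy heap with a one-word header before each object. */
final class HeaderAccess {
    @ReadMeta
    static int readHeader(int[] heap, int objectAddr) {
        return heap[objectAddr - 1];
    }

    static int readField(int[] heap, int objectAddr, int index) {
        return heap[objectAddr + index];
    }
}

/** Derives the "which methods may touch meta data" policy from the annotations. */
final class MetaPolicy {
    static Set<String> methodsAllowedToReadMeta(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (m.isAnnotationPresent(ReadMeta.class)) {
                names.add(m.getName());
            }
        }
        return names;
    }
}
```

A checker could feed a set like this into its permission tables at start-up, so that the code-structure invariant is declared next to the method it governs and not scattered across hand-written API calls.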
Conclusions
◮ We prototyped a promising new debugging tool
◮ Exciting research opportunities in systems debugging & verification
◮ Acknowledgements:
  ◮ Sam for research ideas and discussions
  ◮ Nathan and Diogenes for help with Jikes
  ◮ Mike, Raoul, and Moses for research discussions and feedback
Questions?
Hovertext: MY RESULTS ARE A SIGNIFICANT IMPROVEMENT ON THE STATE OF THE AAAAAAAAAAAART (xkcd.com/1403)