Allocating memory in a lock-free manner
Anders Gidenstam, Marina Papatriantafilou and Philippas Tsigas
Distributed Computing and Systems group, Department of Computer Science and Engineering, Chalmers University of Technology
2005 Anders Gidenstam, Distributed Computing and Systems, Chalmers
Outline
Introduction
Lock-free synchronization
Memory allocators
NBmalloc
Architecture
Data structures
Experiments
Conclusions
Lock-free and wait-free synchronization
Concurrent operations without enforcing mutual exclusion Avoids:
Lock-free
Wait-free
Synchronization primitives
Built into CPU and memory system
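As a minimal sketch of what building on such primitives can look like, the C++ fragment below implements two read-modify-write operations using only compare-and-swap (CAS). The function names `cas_add` and `cas_max` are hypothetical and are not taken from the presentation or from NBmalloc. Each loop is lock-free in the usual sense: a CAS fails only because some other thread's update succeeded, so some thread always makes progress, although an individual thread may have to retry.

```cpp
#include <atomic>
#include <cassert>

// Adds `delta` to `target` using only CAS; returns the new value.
inline long cas_add(std::atomic<long>& target, long delta) {
    long seen = target.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak stores the current value into `seen`,
    // so each retry works with fresh data.
    while (!target.compare_exchange_weak(seen, seen + delta,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
        // Another thread updated `target` first (or the weak CAS failed
        // spuriously); try again.
    }
    return seen + delta;
}

// Raises `target` to `candidate` if `candidate` is larger.
// Returns true only if this call changed the stored value.
inline bool cas_max(std::atomic<long>& target, long candidate) {
    long seen = target.load(std::memory_order_relaxed);
    while (seen < candidate) {
        if (target.compare_exchange_weak(seen, candidate,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
            return true;
        }
        // `seen` now holds the newer value; the loop condition re-checks it.
    }
    return false;
}
```

Both functions can be called from any number of threads on the same `std::atomic<long>` without a mutex. Neither is wait-free, since nothing bounds the number of retries a single thread may need.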
Desired semantics of a shared data object
Linearizability [Herlihy & Wing, 1990]
[Figure: operations O1, O2 and O3 in two orderings]
Concurrent memory management Concurrent applications
Why lock-free?
Provide dynamic memory to the application
  Allocate / Deallocate interface
  Maintains a pool of memory (a.k.a. heap)
  Online problem: requests are handled in order
Performance
  Fragmentation
  Runtime overhead
[Figure axis: Memory address]
Goals Scalability Avoiding
Cache line CPUs
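The "Cache line" and "CPUs" labels, together with the Active-false/Passive-false benchmarks named later, suggest that false sharing is one of the things to avoid here. The sketch below shows one common way to keep per-CPU data on separate cache lines in C++. The 64-byte line size and the names `kCacheLine`, `PlainCounter`, `PaddedCounter` and `same_cache_line` are assumptions for illustration only; they are not taken from NBmalloc.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLine = 64;

// Unpadded: neighbouring array elements may share one cache line, so writes
// by different CPUs can invalidate each other's cached copy.
struct PlainCounter {
    std::atomic<long> value{0};
};

// Padded: alignas rounds both the alignment and the size of the struct up to
// a full cache line, so every array element sits on a line of its own.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<long> value{0};
};

static_assert(sizeof(PaddedCounter) == kCacheLine,
              "each padded counter should fill exactly one assumed cache line");

// True when two addresses fall into the same cache-line-sized block.
inline bool same_cache_line(const void* a, const void* b) {
    const auto ua = reinterpret_cast<std::uintptr_t>(a);
    const auto ub = reinterpret_cast<std::uintptr_t>(b);
    return ua / kCacheLine == ub / kCacheLine;
}
```

With an array such as `PaddedCounter per_cpu[4]`, each CPU can update its own element without touching a line that another CPU writes to. The cost is memory: every counter now occupies a whole line.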
[Figure: processor heaps arranged by size-classes, each holding superblocks (SB); each superblock has an SB header]
Per-processor heaps
from different places
Fixed set of size classes/allocatable sizes
Superblocks
memory, prevents heap blowup
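To make the roles of size classes, per-processor heaps and superblocks concrete, here is a small C++ sketch of the bookkeeping such a layout implies. Everything in it is an assumption for illustration: the size-class table, the modulo mapping from thread to heap, and the superblock and header sizes used in the usage note are hypothetical and do not describe NBmalloc's actual parameters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed set of allocatable sizes, in bytes.
constexpr std::size_t kSizeClasses[] = {16, 32, 64, 128, 256, 512, 1024, 2048};
constexpr std::size_t kNumClasses = sizeof(kSizeClasses) / sizeof(kSizeClasses[0]);

// Returned when a request is larger than every size class.
constexpr std::size_t kNoClass = kNumClasses;

// Index of the smallest size class that can hold `bytes`.
inline std::size_t size_class_for(std::size_t bytes) {
    for (std::size_t i = 0; i < kNumClasses; ++i) {
        if (bytes <= kSizeClasses[i]) return i;
    }
    return kNoClass;
}

// Hypothetical mapping from a thread id to one of `heaps` per-processor
// heaps, so that different threads tend to allocate from different places.
inline std::size_t heap_for(std::size_t thread_id, std::size_t heaps) {
    return thread_id % heaps;
}

// Number of blocks of size class `cls` that fit in one superblock once the
// superblock header has been set aside.
inline std::size_t blocks_per_superblock(std::size_t superblock_bytes,
                                         std::size_t header_bytes,
                                         std::size_t cls) {
    return (superblock_bytes - header_bytes) / kSizeClasses[cls];
}
```

For example, a 20-byte request maps to the 32-byte class (index 1), and an assumed 64 KiB superblock with a 64-byte header holds 4092 blocks of the 16-byte class. Rounding requests up to a fixed class trades some internal fragmentation for simple, uniform superblocks.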
Properties
  Items can be moved from one flat-set to another
  An item can only be in one flat-set at a time
Operations
  Insert
  Get_any
Insert atomically removes the item from its old location
Unless “Remove + Insert” appears atomic an item may get stuck in “limbo”.
[Figure: an item moved between two lock-free sets by Remove followed by Insert; labels: Current, Flat-set, Superblock, SB header]
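The interface above can be sketched in C++ as a fixed array of atomic pointer slots. This is a deliberately simplified stand-in, not the flat-set from the presentation: the class name `SlotSet` is hypothetical, and the sketch does not provide the atomic move in which Insert removes the item from its old location.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Fixed-capacity set of pointers with lock-free insert and get_any.
// Simplification: items are not moved atomically between two sets.
template <typename T, std::size_t N>
class SlotSet {
public:
    SlotSet() {
        for (auto& slot : slots_) slot.store(nullptr, std::memory_order_relaxed);
    }

    // Claims the first empty slot with a CAS; returns false if the set is full.
    bool insert(T* item) {
        for (auto& slot : slots_) {
            T* expected = nullptr;
            if (slot.compare_exchange_strong(expected, item,
                                             std::memory_order_acq_rel)) {
                return true;
            }
        }
        return false;
    }

    // Removes and returns some item, or nullptr if the scan found none.
    T* get_any() {
        for (auto& slot : slots_) {
            T* seen = slot.load(std::memory_order_acquire);
            if (seen != nullptr &&
                slot.compare_exchange_strong(seen, nullptr,
                                             std::memory_order_acq_rel)) {
                return seen;
            }
        }
        return nullptr;
    }

private:
    std::array<std::atomic<T*>, N> slots_;
};
```

Calling `get_any` on one `SlotSet` and then `insert` on another is exactly the non-atomic "Remove + Insert" pair: between the two calls the item is in neither set, and if the moving thread is delayed there the item stays in limbo.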
Goal: Move a pointer value between two shared pointer locations
Requirements
  The pointer target must stay accessible
  The same number of shared pointers to the target after the move as before
  Lock-free behaviour
Issues
  One atomic CAS is not enough! We’ll need several steps.
  Interfering threads need to help unfinished operations
[Figure: steps of moving a pointer between the locations From and To, with Old_pos and New_pos marking the old and new positions]
Note that some extra details are needed to prevent ABA problems.
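The presentation does not say which extra details are used. One widely used technique against ABA, shown here only as an illustrative sketch, is to pair the value with a version counter so that a single CAS checks both. The names `Tagged`, `pack`, `unpack` and `tagged_cas`, and the choice of a 32-bit value with a 32-bit version, are assumptions and not NBmalloc's design.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// A 32-bit value paired with a 32-bit version counter. Both live in one
// 64-bit word so that one CAS covers the pair.
struct Tagged {
    std::uint32_t value;
    std::uint32_t version;
};

inline std::uint64_t pack(Tagged t) {
    return (std::uint64_t{t.version} << 32) | t.value;
}

inline Tagged unpack(std::uint64_t word) {
    return {static_cast<std::uint32_t>(word),
            static_cast<std::uint32_t>(word >> 32)};
}

// Stores `new_value` only if `word` still holds exactly `expected`
// (same value AND same version). The version is incremented on success,
// so a later writer holding the old snapshot will fail.
inline bool tagged_cas(std::atomic<std::uint64_t>& word, Tagged expected,
                       std::uint32_t new_value) {
    std::uint64_t old_word = pack(expected);
    const std::uint64_t new_word = pack({new_value, expected.version + 1});
    return word.compare_exchange_strong(old_word, new_word);
}
```

If a location changes from A to B and back to A, a thread that read the first A still holds the old version and its CAS fails, even though the value matches again. The counter can wrap around after 2^32 updates, so this makes ABA very unlikely rather than strictly impossible.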
Benchmark applications Larson
Active-false/Passive-false
Larson benchmark. Sun 4xUltraSPARC III
[Figure panels: Speed-up; Memory usage]
Larson benchmark. SGI Origin 3800 32(/128)xMIPS
[Figure panels: Speed-up; Memory usage]
Lock-free memory allocator
  Scalable
  Behaves well on both UMA and NUMA architectures
Lock-free flat-sets
  New lock-free data structure
  Allows lock-free inter-object operations
Implementation
  Freely available (GPL)
Further development of the memory
Reclaiming superblocks for reuse in a
Improve search strategies for flat-sets
Evaluate the memory allocator with real
How to make lock-free composite objects
Contact Information:
Address: Anders Gidenstam, Computer Science & Engineering, Chalmers University of Technology, SE-412 96 Göteborg, Sweden
Email: andersg @ cs.chalmers.se
Web: http://www.cs.chalmers.se/~dcs http://www.cs.chalmers.se/~andersg
Implementation: http://www.cs.chalmers.se/~dcs/nbmalloc.html
[Figure: application classes plotted by #CPUs and #Threads: traditional desktop applications; traditional multi-threaded desktop applications; multi-threaded applications on new multicore CPU(s); high performance multi-threaded applications on multiprocessors]