A Parallel Compact Hash Table
Alfons Laarman & Steven van der Vegt
Overview
◮ Research Motivation
◮ Background
◮ Contribution
A Parallel Compact Hash Table October 3, 2011 2 / 19
Introduction
◮ Hash tables are fundamental data structures
◮ Compact hash tables: memory-efficient hash tables
◮ Useful in, e.g., model checking, planning, BDDs, tree tables
◮ Problem: no concurrent implementation of compact hash tables
◮ Our contribution: a scalable lockless algorithm for compact hashing
Goals
◮ Parallel compact hash table
◮ Scalable
◮ Fast: lockless
◮ Memory efficient: no pointers (otherwise we lose the benefits of compact hashing)
◮ Focus on findOrPut
◮ Already sufficient for model checking (monotonically growing data set)
◮ Subsumes the individual find and put operations
Hashing Revisited
◮ A hash table stores a subset of a key universe U in a table T of buckets; typically |U| ≫ |T|
◮ Multiple keys can map to the same bucket
◮ The full key is stored in T to resolve collisions
◮ Several collision resolution algorithms are possible, e.g. linear probing
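The points above can be illustrated with a minimal open-addressing sketch. It is only an illustration, not the implementation from the talk: `toy_hash` and `find_or_put` are hypothetical names, and the byte-sum hash is chosen purely because it makes collisions easy to provoke.

```python
EMPTY = None

def toy_hash(key, table_size):
    # Hypothetical toy hash: sum of the UTF-8 bytes, reduced modulo the table size.
    return sum(key.encode("utf-8")) % table_size

def find_or_put(table, key):
    """Return True if key was already present, False if it was inserted."""
    n = len(table)
    i = toy_hash(key, n)
    for _ in range(n):
        if table[i] is EMPTY:
            table[i] = key          # the full key is stored in the bucket
            return False
        if table[i] == key:         # comparing full keys resolves collisions
            return True
        i = (i + 1) % n             # linear probing: try the next bucket
    raise OverflowError("table is full")
```

Because "ab" and "ba" have the same byte sum, inserting both forces the second key into the bucket next to the first, and only the stored full key tells them apart.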
Hashing Revisited - Example
Keys: John Smith, Lisa Smith, Sam Doe, Sandra Dee, Ted Baker

Bucket  Entry
000     (empty)
001     Lisa Smith   521-8976
002     (empty)
:       :
151     (empty)
152     John Smith   521-1234
153     Sandra Dee   521-9655
154     Ted Baker    418-4165
155     (empty)
:       :
253     (empty)
254     Sam Doe      521-5030
255     (empty)

Figure: Example of an open addressing hash table.
Introduction Into Compact Hash Tables
◮ If, however, |U| ≤ |T|, we only need a bit array (and a perfect hash function)!
◮ What if |U| is just slightly bigger than |T|? Cleary tables:
- 1. Maintain order in T
- 2. Add three bits to the buckets in T
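The bit-array case can be sketched as follows, assuming keys are integers in the range 0 to |U| − 1 so that the identity function serves as the perfect hash. `BitArraySet` is a hypothetical name used only for this illustration.

```python
class BitArraySet:
    """Set over the universe {0, ..., universe_size - 1}, one bit per possible key."""

    def __init__(self, universe_size):
        # Round up to whole bytes.
        self.bits = bytearray((universe_size + 7) // 8)

    def find_or_put(self, key):
        """Return True if key was already present, False if it was newly added."""
        byte, mask = key >> 3, 1 << (key & 7)
        found = bool(self.bits[byte] & mask)
        self.bits[byte] |= mask
        return found
```

No key is stored at all: the position of the bit is the key, which is the extreme case of the memory saving that compact hashing aims for.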
Introduction Into BLP
Let K be the set of possible keys and h the hash function that computes the indices, h : K → {0, …, M − 1}, with the property that for all K1, K2 ∈ K: K1 ≤ K2 implies h(K1) ≤ h(K2).
◮ All keys are stored in ascending order.
◮ There cannot be empty locations between a key's original hash location and its actual storage position.
◮ All keys sharing the same initial hash location form one continuous group.
◮ Groups can grow together, forming clusters of groups.
◮ Bidirectional linear probing algorithm (probing is possible in both directions)
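One simple hash function with the order-preserving property above scales the key down to the table size. This is a sketch under the assumption that keys are integers in 0 to |U| − 1; the names `monotone_hash` and `is_monotone` are illustrative and do not come from the talk.

```python
def monotone_hash(key, universe_size, table_size):
    """Map a key in [0, universe_size) to a bucket in [0, table_size), preserving order."""
    return key * table_size // universe_size

def is_monotone(universe_size, table_size):
    """Brute-force check that k1 <= k2 implies h(k1) <= h(k2) over the whole universe."""
    hashes = [monotone_hash(k, universe_size, table_size) for k in range(universe_size)]
    return all(a <= b for a, b in zip(hashes, hashes[1:]))
```

Several consecutive keys share a bucket index, which is why the property is stated as an implication in one direction only.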
Introduction Into BLP - Insert Example
Inserting k into table T in 5 steps:
- 1. Determine the index: i ← h(k)
- 2. Determine the probing direction: right if T[h(k)] > k, otherwise left
- 3. Search for an empty bucket
- 4. Insert k into the empty bucket
- 5. Swap k into its correct place to restore the order
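The five steps can be sketched sequentially as below. This is a hedged reading of the slide, not the authors' code: the scaling hash `h`, the helper `find_empty`, the fallback to the opposite direction at the table edge, and the preceding lookup are all assumptions added to make the example self-contained.

```python
EMPTY = None

def h(key, universe_size, table_size):
    # Assumed order-preserving hash for integer keys in [0, universe_size).
    return key * table_size // universe_size

def find_empty(table, start, step):
    """Index of the first empty bucket from start in direction step, or None."""
    i = start
    while 0 <= i < len(table):
        if table[i] is EMPTY:
            return i
        i += step
    return None

def blp_find_or_put(table, key, universe_size):
    """Return True if key was present, False if it was inserted in sorted position."""
    i = h(key, universe_size, len(table))            # step 1
    if table[i] is EMPTY:
        table[i] = key
        return False
    # Lookup: a present key lies in the same cluster, on the sorted side of T[i].
    step = -1 if table[i] > key else 1
    j = i
    while 0 <= j < len(table) and table[j] is not EMPTY:
        if table[j] == key:
            return True
        passed = table[j] < key if step == -1 else table[j] > key
        if passed:
            break
        j += step
    direction = 1 if table[i] > key else -1          # step 2, as on the slide
    e = find_empty(table, i, direction)              # step 3
    if e is None:                                    # assumption: fall back at the edge
        e = find_empty(table, i, -direction)
    if e is None:
        raise OverflowError("table is full")
    table[e] = key                                   # step 4
    p = e                                            # step 5: swap into place
    while p > 0 and table[p - 1] is not EMPTY and table[p - 1] > key:
        table[p - 1], table[p] = table[p], table[p - 1]
        p -= 1
    while p < len(table) - 1 and table[p + 1] is not EMPTY and table[p + 1] < key:
        table[p + 1], table[p] = table[p], table[p + 1]
        p += 1
    return False
```

With a table of 10 buckets and a universe of 100 keys, inserting 42, 45, 41 and 47 (all hashing to bucket 4) leaves them stored contiguously and in ascending order around bucket 4.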
Cleary Table
Cleary administration bits:
◮ Virgin: set on a bucket if its location is the initial hash location of some key in the table
◮ Change: set at the beginning of a group of keys sharing the same initial hash location
◮ Occupied: set if the bucket contains a key
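A bucket carrying the three administration bits might be laid out as in the sketch below. The bit positions, the `payload` field and the names `Admin`, `pack` and `unpack` are assumptions made for illustration; the slides do not specify the layout.

```python
from enum import IntFlag

class Admin(IntFlag):
    OCCUPIED = 1   # the bucket contains a key
    VIRGIN = 2     # some key in the table has this bucket as its initial hash location
    CHANGE = 4     # the bucket starts a group with a common initial hash location

ADMIN_BITS = 3

def pack(payload, flags):
    """Combine the stored payload and the three administration bits into one word."""
    return (payload << ADMIN_BITS) | int(flags)

def unpack(word):
    """Split a bucket word back into (payload, flags)."""
    return word >> ADMIN_BITS, Admin(word & ((1 << ADMIN_BITS) - 1))
```

Keeping the bits inside the bucket word, rather than in a separate structure, matches the goal of avoiding pointers and extra memory.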
Cleary Table - Example
Figure: Example of a partially filled Cleary table with 4 groups.
Requirements for Parallelizing
We need a write-exclusive locking mechanism that
◮ Scales well
◮ Is memory efficient
Locking Mechanism
Properties:
◮ 1 bit per bucket
◮ CAS(a,b,c): atomic compare-and-swap (if a == b then a ← c)
Locking steps:
- 1. Search for the left and right end buckets of the cluster
- 2. Lock these buckets
- 3. If one of these locks fails → unlock and start over
- 4. Perform exclusive actions (read, write)
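A one-bit lock built on CAS can be sketched as follows. Python has no hardware compare-and-swap, so the `cas` helper below emulates its atomicity with a mutex purely for illustration; the choice of the lowest bit as the lock bit and all function names are assumptions.

```python
import threading

LOCK_BIT = 1                      # assumed: lowest bit of each bucket word
_atomic = threading.Lock()        # stands in for the atomicity of a hardware CAS

def cas(cells, index, expected, new):
    """Emulated CAS: if cells[index] == expected, store new and return True."""
    with _atomic:
        if cells[index] == expected:
            cells[index] = new
            return True
        return False

def try_lock(cells, index):
    """Try to set the lock bit of one bucket; never blocks."""
    old = cells[index]
    if old & LOCK_BIT:
        return False              # already locked by someone else
    return cas(cells, index, old, old | LOCK_BIT)

def unlock(cells, index):
    """Clear the lock bit; only the lock holder may call this."""
    cells[index] &= ~LOCK_BIT
```

Because `try_lock` fails instead of waiting, a caller that cannot get both locks simply releases what it holds and starts over, as in step 3.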
Dynamic Region Based Locking
left ← CL-LEFT(h)
right ← CL-RIGHT(h)
if ¬TRY-LOCK(T[left]) then
    RESTART
if ¬TRY-LOCK(T[right]) then
    UNLOCK(T[left])
    RESTART
if FIND(k) then                    ⊲ exclusive read
    UNLOCK(T[left], T[right])
    return FOUND
PUT(k)                             ⊲ exclusive write
UNLOCK(T[left], T[right])
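The pseudocode above can be turned into a small runnable sketch under several assumptions that are not in the slides: the cluster ends are taken to be the empty buckets flanking the cluster, the table wraps around and is never full, the boundaries are re-validated after locking, and PUT is simplified to plain linear probing (filling the right boundary) instead of an ordered insert. CAS is again emulated with a mutex, and `RegionLockedTable` is a hypothetical name.

```python
import threading

EMPTY = None

class RegionLockedTable:
    def __init__(self, size):
        self.keys = [EMPTY] * size
        self.lock_bits = [0] * size        # 1 lock bit per bucket
        self._atomic = threading.Lock()    # emulates the atomicity of hardware CAS

    def try_lock(self, i):
        with self._atomic:                 # CAS(lock_bits[i], 0, 1)
            if self.lock_bits[i] == 0:
                self.lock_bits[i] = 1
                return True
            return False

    def unlock(self, i):
        self.lock_bits[i] = 0

    def boundary(self, start, step):
        """First empty bucket from start in direction step (table assumed not full)."""
        i = start
        while self.keys[i] is not EMPTY:
            i = (i + step) % len(self.keys)
        return i

    def find_or_put(self, key):
        n = len(self.keys)
        home = key % n                     # assumed hash for non-negative integer keys
        while True:                        # RESTART loop
            left = self.boundary(home, -1)
            right = self.boundary(home, +1)
            if not self.try_lock(left):
                continue
            if right != left and not self.try_lock(right):
                self.unlock(left)
                continue
            try:
                # Assumption: restart if a boundary was filled before we locked it.
                if self.keys[left] is not EMPTY or self.keys[right] is not EMPTY:
                    continue
                i = home
                while i != right:          # exclusive read
                    if self.keys[i] == key:
                        return True
                    i = (i + 1) % n
                self.keys[right] = key     # exclusive write (simplified PUT)
                return False
            finally:
                self.unlock(left)
                if right != left:
                    self.unlock(right)
```

In the ordered table described earlier, an insert may move keys anywhere inside the cluster, which is presumably why both ends are locked; with the simplified PUT used here the left lock mainly mirrors the structure of the pseudocode.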
Benchmarks - Speedup
Figure: Speedups of BLP, RBL, LHT and PCT with r/w ratios 0:1, 3:1 and 9:1 (x-axis: cores, 1 to 16; y-axis: speedup; ideal speedup shown for reference).
Benchmarks - Runtime
Figure: 16-core runtimes of BLP, RBL, LHT and PCT with r/w ratios 0:1, 3:1 and 9:1 (x-axis: load factor, 10% to 100%; y-axis: normalized runtime).
Results
◮ PCT performs very well with inserts only
◮ PCT's performance drops once the load factor rises above 85%
◮ With a high proportion of reads (> 9:1), BLP eventually becomes faster than LHT
◮ Region-based locking with OS locks is very slow, as can be seen in RBL
◮ The scalability of both PCT and BLP is good
◮ r/w ratio: r/w exclusion on clusters takes a toll; there is room for improvement at higher load factors (when clusters are large)
Conclusion
◮ We have realized a parallel Cleary table with high performance and scalability up to load factors of 90%. Since the compression ratio of compact hash tables can be high, this is acceptable
◮ Future work: allow concurrent reads in the Cleary table to improve its scalability even further