Hash Tables LAST TODAY NEXT Hashing Unbounded arrays - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Hash Tables

slide-2
SLIDE 2

LAST

  • Unbounded arrays
  • Amortized analysis

TODAY

  • Hashing
  • Genericity

NEXT

  • Implementing hash tables

slide-3
SLIDE 3

Introduction to C1 (genericity)

slide-4
SLIDE 4

Implicit contract for casting

  • (void*) x, where x has type tp*

//@ensures \hastag(tp*, x)

  • (tp*) y, where y has type void*

//@requires \hastag(tp*, y)
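Plain C has no \hastag, so the implicit contract can only be imitated. The sketch below assumes an explicit tag stored next to the pointer; the names tag_t, tagged_ptr, tag_int, has_tag_int and untag_int are hypothetical and are not part of C1.

```c
#include <assert.h>

/* Hypothetical runtime tags standing in for the type tags C1 tracks. */
typedef enum { TAG_INT, TAG_CHAR } tag_t;

/* A void* bundled with the tag recorded when it was cast. */
typedef struct {
    tag_t tag;
    void *ptr;
} tagged_ptr;

/* Models (void*) x where x has type int*: the result carries the int* tag. */
tagged_ptr tag_int(int *x) {
    tagged_ptr t = { TAG_INT, x };
    return t;
}

/* Models \hastag(int*, y) for this sketch. */
int has_tag_int(tagged_ptr y) {
    return y.tag == TAG_INT;
}

/* Models (int*) y where y has type void*: the cast requires the int* tag. */
int *untag_int(tagged_ptr y) {
    assert(has_tag_int(y));   /* plays the role of //@requires \hastag(int*, y) */
    return (int *)y.ptr;
}
```

Casting an int* with tag_int and then recovering it with untag_int succeeds; a tagged_ptr built with TAG_CHAR would fail the assertion, which is the behavior the contract describes.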

slide-5
SLIDE 5

The only operations allowed on p of type void*

  • Cast to another type: (int*) p
  • Compare to another void* value: p == q, where q is of type void*
  • Compare to NULL: p == NULL
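The three operations can be seen together in a small C sketch. C itself permits more on void* than C1 does, so this code simply restricts itself to the operations listed; contains_ptr and int_or_default are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: does the generic array A[0..n) contain the pointer p?
   It only compares void* values with each other. */
bool contains_ptr(void **A, size_t n, void *p) {
    assert(A != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (A[i] == p) return true;   /* compare to another void* value */
    }
    return false;
}

/* Hypothetical helper: read an int through a void*, or return dflt for NULL. */
int int_or_default(void *p, int dflt) {
    if (p == NULL) return dflt;       /* compare to NULL */
    int *ip = (int *)p;               /* cast to another type */
    return *ip;                       /* dereference only the typed pointer */
}
```

Neither helper dereferences a void* directly; the value is reached only after casting back to int*.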
slide-6
SLIDE 6

Hashing

slide-7
SLIDE 7

Reflecting on arrays

  • As a way to keep a collection of elements of the same type, like a set
  • As a mapping from indices to values, like a dictionary
  • Operations: insert, lookup (goal: make these operations efficient)

slide-8
SLIDE 8

Dictionaries (also known as maps, associative arrays)

  • An array is a mapping from indices to elements, where A[i] = e.
  • Dictionary: a mapping from keys to entries, where a key can be any kind of information
  • zipcode (key) to neighborhood name (entry)
  • Andrew id (key) to home address (entry)
  • SSN (key) to tax id (entry)

slide-9
SLIDE 9

Implementing dictionaries

                                      lookup      insert
unsorted (key, entry) array           O(n)        O(1) amortized
(key, entry) array sorted by key      O(log n)    O(n)
linked list with (key, entry) data    O(n)        O(1)

Can we implement dictionaries such that both lookup and insert are about O(1)?

slide-10
SLIDE 10

Example: Storing zipcodes using an array with length 5

Some fun zip codes:

  • 90210 Beverly Hills
  • 10101 New York
  • 20500 White House
  • 44444 Newton Falls, OH
  • 94043 Googleplex
  • 15213 CMU
  • 15217 Squirrel Hill
  • 15122 Kennywood

[Figure: an empty array of length 5 with key and value columns]

slide-11
SLIDE 11

Example: Storing zipcodes using an array with length 5

Some fun zip codes:

  • 90210 Beverly Hills
  • 10101 New York
  • 20500 White House
  • 44444 Newton Falls, OH
  • 94043 Googleplex
  • 15213 CMU
  • 15217 Squirrel Hill
  • 15122 Kennywood

key: zipcode
hash value: zipcode % 5
index: zipcode % 5

[Figure: array of length 5 with key and value columns]
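The mapping from key to index can be written as a one-line C function. This is a minimal sketch of the slide's example only; the names TABLE_LEN and zip_index are hypothetical, and non-negative zip codes are assumed.

```c
#include <assert.h>

/* Table length used in the slide's example. */
#define TABLE_LEN 5

/* In this example the hash value and the array index coincide: zipcode % 5.
   Assumes zipcode is non-negative. */
int zip_index(int zipcode) {
    assert(zipcode >= 0);
    return zipcode % TABLE_LEN;
}
```

With this function, 90210 and 20500 both map to index 0, 15217 and 15122 both map to index 2, and 94043 and 15213 both map to index 3, so eight keys in five slots already produce collisions.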

slide-12
SLIDE 12

Design choices for handling collisions

  • Open addressing (e.g. linear probing)
  • Separate chaining
slide-13
SLIDE 13

Example: linear probing

Look for an empty slot somewhere predictable: next position, then next-next, …

Keys inserted:

  • 15217 Squirrel Hill
  • 20500 White House
  • 90210 Beverly Hills
  • 10101 New York

[Figure: array of length 5 holding “Squirrel Hill”, “White House”, “Beverly Hills”, “New York”]
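Insertion with linear probing might look like the following C sketch. It stores keys only, assumes the hash zipcode % 5 from the earlier slide, assumes probing wraps around at the end of the array, and uses -1 as a hypothetical marker for an empty slot; lp_init and lp_insert are illustrative names.

```c
#include <assert.h>

#define TABLE_LEN 5
#define EMPTY (-1)   /* assumption: no real key is negative */

/* Mark every slot as empty. */
void lp_init(int table[TABLE_LEN]) {
    for (int i = 0; i < TABLE_LEN; i++) table[i] = EMPTY;
}

/* Insert key by linear probing.
   Returns the slot used, or -1 if every slot is taken by other keys. */
int lp_insert(int table[TABLE_LEN], int key) {
    assert(key >= 0);
    int start = key % TABLE_LEN;
    for (int step = 0; step < TABLE_LEN; step++) {
        int i = (start + step) % TABLE_LEN;   /* next position, wrapping around */
        if (table[i] == EMPTY || table[i] == key) {
            table[i] = key;
            return i;
        }
    }
    return -1;
}
```

Under these assumptions, inserting the four keys in the order listed places 15217 in slot 2, 20500 in slot 0, 90210 in slot 1 (slot 0 is taken) and 10101 in slot 3 (slots 1 and 2 are taken).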

slide-14
SLIDE 14

Example: linear probing

How do you know something is not in the table?

Keys inserted:

  • 15217 Squirrel Hill
  • 20500 White House
  • 90210 Beverly Hills
  • 10101 New York

[Figure: array of length 5 holding “Squirrel Hill”, “White House”, “Beverly Hills”, “New York”]
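One possible answer, sketched in C: probe from the key's home slot and stop at the first empty slot, since an insertion of that key would have stopped there too. The sketch assumes the hash zipcode % 5, wrap-around probing, no deletions, and -1 as a hypothetical empty marker; lp_contains is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_LEN 5
#define EMPTY (-1)   /* assumption: no real key is negative */

/* Is key stored in the linear-probing table? Assumes keys are never deleted. */
bool lp_contains(const int table[TABLE_LEN], int key) {
    assert(key >= 0);
    int start = key % TABLE_LEN;
    for (int step = 0; step < TABLE_LEN; step++) {
        int i = (start + step) % TABLE_LEN;
        if (table[i] == key) return true;
        if (table[i] == EMPTY) return false;   /* an insert would have stopped here */
    }
    return false;   /* probed every slot without finding the key */
}
```

For a table laid out as {20500, 90210, 15217, 10101, EMPTY}, looking up 15122 starts at slot 2, passes slots 2 and 3, and reports a miss on reaching the empty slot 4.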

slide-15
SLIDE 15

Example: separate chaining

[Figure: array of length 5 in which each slot holds a chain of entries]
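Separate chaining can be sketched in C as an array of linked lists. This is an illustration under assumptions, not the course's implementation: it reuses the hash zipcode % 5, and the names chain_node, chain_table, ct_init, ct_lookup and ct_insert are hypothetical. Freeing the nodes is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_LEN 5

/* One (key, entry) pair in a chain. */
typedef struct chain_node {
    int key;
    const char *entry;
    struct chain_node *next;
} chain_node;

/* Each slot holds the head of a chain, or NULL if the chain is empty. */
typedef struct {
    chain_node *slots[TABLE_LEN];
} chain_table;

void ct_init(chain_table *T) {
    for (int i = 0; i < TABLE_LEN; i++) T->slots[i] = NULL;
}

/* Returns the entry stored for key, or NULL if key is absent. */
const char *ct_lookup(const chain_table *T, int key) {
    assert(key >= 0);
    for (chain_node *p = T->slots[key % TABLE_LEN]; p != NULL; p = p->next) {
        if (p->key == key) return p->entry;
    }
    return NULL;
}

/* Inserts (key, entry); an existing key has its entry overwritten. */
void ct_insert(chain_table *T, int key, const char *entry) {
    assert(key >= 0);
    int i = key % TABLE_LEN;
    for (chain_node *p = T->slots[i]; p != NULL; p = p->next) {
        if (p->key == key) { p->entry = entry; return; }
    }
    chain_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->entry = entry;
    n->next = T->slots[i];   /* new node goes at the front of the chain */
    T->slots[i] = n;
}
```

Colliding keys such as 90210 and 20500 simply share the chain at slot 0, so no probing is needed.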

slide-16
SLIDE 16

Cost analysis of separate chaining

If we have an array of size m and a total of n entries, how long does it take to look up an entry?

slide-17
SLIDE 17

Worst possible layout

[Figure: array with m slots; all n entries sit in a single chain of length n, so lookup is O(n)]

slide-18
SLIDE 18

Best possible layout

[Figure: array with m slots; the n entries are spread evenly, so every chain has length n/m and lookup is O(n/m)]

slide-19
SLIDE 19

Cost analysis of separate chaining

Can we arrange so that n/m is constant?

Use resizing, as we did in unbounded arrays.
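The effect of resizing on n/m can be checked with a small C sketch that tracks only the two counts. The doubling policy and the threshold n/m <= 1 are assumptions borrowed from the unbounded-array idea; table_size and record_insert are hypothetical names, and the rehashing of entries that a real table would perform is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for a chained table. */
typedef struct {
    size_t m;         /* number of slots */
    size_t n;         /* number of entries */
    size_t resizes;   /* how many times the table has doubled */
} table_size;

/* Record one insertion; double m whenever n/m would exceed 1. */
void record_insert(table_size *s) {
    assert(s->m > 0);
    s->n++;
    if (s->n > s->m) {   /* load factor above the assumed threshold */
        s->m *= 2;       /* a real table would rehash every entry here */
        s->resizes++;
    }
}
```

Starting from m = 5, one thousand insertions trigger only 8 doublings and leave m = 1280, so n/m never exceeds 1 after an insertion completes.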

slide-20
SLIDE 20

Implementing dictionaries

                                      lookup            insert
unsorted (key, value) array           O(n)              O(1) amortized
(key, value) array sorted by key      O(log n)          O(n)
linked list with (key, value) data    O(n)              O(1)
Hash tables                           O(n/m) average;   O(n/m) average;
                                      O(1) average      O(1) average
                                      and amortized     and amortized