4 August 2020

Hashing with Chaining

by mo


Direct Access table

  ----------
0 |        |
  ----------
1 |        |
  ----------
2 |        |
  ----------
3 |        |
  ----------
4 |        |
  ----------
  | ...    |
  ----------
  | ...    |
  ----------
  | ...    |
  ----------

Hash requires mapping to an integer.

Ideally, hash(x) == hash(y) only occurs when x == y.

badness:

  1. keys may not be non negative integers
  2. gigantic memory hog
  • reduce universe of all possible keys and reduce them down to some reasonable set of integers.
  • idea: m = theta(n)
    • size of table proportial to size of # of things.
    -----------
0   | | | | | |
    ----------
1   |         |
    ----------
2   |         |
    ----------
3   |         |
    ----------
4   |         |
    ----------
    | ...     |
    ----------
    | ...     |
    ----------
m-1 | ...     |
    ----------

Collision: Multiple keys map to same slot in hash table.

  • h(ki) == h(kj) but ki <> kj
  • we can use chaining

How to define a hash function?

  • function h(k)
  • all keys map to h = { 0, ..., m-1 }

Two ways to deal with collisions:

  • Chaining: store collisions as a list. (i.e. linked list)
  • Open addressing

Chaining

    ----------
0   |         | --> (item1) --> (item2) --> (nil)
    ----------
1   |         |
    ----------
2   |         |
    ----------
3   |         |
    ----------
4   |         |
    ----------
    | ...     |
    ----------
    | ...     |
    ----------
m-1 | ...     |
    ----------

Worst case:

  • theta(n) (i.e. all items have a collision)

Simple uniform hashing:

  • Convenient for analysis.
  • each key is equally likely to be hashed to any slot of the table, independent of where other keys hashing.

Analysis:

  • expected length of a chain for n keys, m slots.
    • n/m == alpha == load factor
    • O(1 + alpha)

Hash functions:

  1. Division method
  2. Multiplication method
  3. Universal hasing

Example of Hash chaining using the division method for the hash function.

class Node
  attr_reader :next, :data

  def initialize(data)
    @data = data
  end

  def add(data)
    if self.next
      self.next.add(data)
    else
      @next = self.class.new(data)
    end
  end

  def search(&block)
    return self if block.call(data)

    @next&.search(&block)
  end
end

class HashTable
  def initialize(m: 13)
    @m, @table = m, Array.new(m)
  end

  def [](k)
    node = @table[h(k)]
    return if node.nil?

    found = node.search { |(key, value)| key == k }
    return found.data[-1] if found
  end

  def []=(k, value)
    hash = h(k)
    if @table[hash]
      @table[hash].add([k, value])
    else
      @table[hash] = Node.new([k, value])
    end
  end

  private

  def h(k)
    k % @m
  end
end

h = HashTable.new
h[8] = 'hello'
h[21] = 'jello'

puts [8, h[8]].inspect
puts [21, h[21]].inspect

For another example check out my COMP-272 assignment.

algorithms 💎