Python MyHashTable class: search method with linear probing

Newbie picture Newbie · Oct 11, 2015 · Viewed 7.6k times · Source

I need help implementing a method for my "MyHashTable" class:

def search(self, search_key):

The method is supposed to use linear probing to handle collision resolution. If the search_key is in the hash table then the method returns the slot number of the slot containing that search_key. If the search_key is not in the hash table, the method returns -1

My class looks like this:

class MyHashTable:

def __init__(self, capacity):
    self.capacity = capacity
    self.slots = [None] * self.capacity

def __str__(self):
    return str(self.slots )

def __len__(self):
    count = 0
    for i in self.slots:
        if i != None:
            count += 1
    return count

def hash_function(self, key):
    i = key % self.capacity
    return i

def insert(self, key):
    slot = self.hash_function(key)
    orig = slot
    while True:
        if self.slots[slot] is None:
            self.slots[slot] = key
            return slot

        if self.slots[slot] == key:
            return -2

        slot = (slot + 1) % self.capacity
        if slot == orig:
            return -1

def search(self, search_key):

Any help or tutorial links would be awesome. Thanks

Answer

Padraic Cunningham picture Padraic Cunningham · Oct 11, 2015

You are only using a single list to store all the values, if you wanted a hash table you might use a list of lists where each list was a bucket but if you just want to check if the element is in your hash table with your own code:

def search(self, search_key):
    hsh = self.hash_function(search_key)
    if self.slots[hsh] is None:
        return -1
    while hsh < self.capacity:
        if self.slots[hsh] == search_key:
            return hsh
        hsh += 1
    return -1

You also have to handle the case where you have multiple collisions so we need at worst to check every element in the hash table to find the correct value:

 def search(self, search_key):
    hsh = self.hash_function(search_key)
    if self.slots[hsh] is None:
        return -1
    for i in range(self.capacity):
        mod = (hsh + i) % self.capacity
        if self.slots[mod] == search_key:
            return mod
    return -1

The first while loop will probe one value over at a time but if we have wrapped around the list from multiple collisions it would miss elements at the start so using range and mod = (hsh + i) % self.capacity makes sure we check all entries like the example below.

m = MyHashTable(5)

m.insert(13) # 13 % 5 = 3
m.insert(73) # 83 % 5 = 3
m.insert(93) # 93 & 5 = 3
print(m.search(13)) # 3
print(m.search(73)) # 4
print(m.search(93)) # 0
print(m.search(2)) # -1

You can make your len method O(1) by keeping track of when you add a unique value to your hash table, there is also a nice wiki page on Open_addressing parts of which you can adopt into your code and it will help you create a proper mapping of keys to values and resized your hash table when needed. If you want to store more than just numbers you need to use a different hash function, I just use hash but you can use whatever you like. Also using in when your hash table is full and the key does not exist will cause an infinite loop so you will need to handle that case:

class MyHashTable:
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * self.capacity
        self.count = 0

    def __str__(self):
        return str(self.slots)

    def __contains__(self, item):
        return self.search(item) != -1

    def __len__(self):
        return self.count

    def hash_function(self, key):
        return hash(key) % self.capacity

    def find_slot(self, key):
        slot = self.hash_function(key)
        while self.slots[slot] is not None and self.slots[slot] != key:
            slot = (slot + 1) % self.capacity
        return slot

    def insert(self, key):
        slot = self.find_slot(key)
        if self.slots[slot] != key:
            self.slots[slot] = key
            self.count += 1

    def search(self, key):
        i = self.find_slot(key)
        if self.slots[i] is not None:  
            return i
        return -1

Add a __contains__ will also allow you to use in to test for membership:

m = MyHashTable(5)

m.insert("foo")
m.insert(73)
m.insert(93)
m.insert(1)
print(m.search(73))
print(m.search(93))
print(m.search(1))
print(m.search("foo"))
m.insert(73)
print(m.slots)
print(len(m))
print("foo" in m)
print(5 in m)

Output:

3
4
1
0
['foo', 1, None, 73, 93]
4
True
False