Hashing and Collision Resolution Technique

Apurva Komnak
4 min readJun 1, 2021

--

What is Hashing?

It is a method for storing and retrieving the data from the database in O(1) time (Constant Time). It is also known as a mapping technique because we are trying to map larger values into smaller values.

Terminologies in Hashing

  1. Search Keys
  2. Hash Table
  3. Hash Function

Search Key

Example — [20,35,42,54,63,70]

We need a certain ‘key’ to search for a required value in the database. For example, in the student dataset, we require a key such as a roll number to search for the information of a certain student. This key is known as a Search key.

Hash Table

Hash Table

Just like an array it has an index but if we want to search, insert or delete any data in this structure, we don’t need to scan through the structure, we just need a hash function.

Hash Function

Using a hash function we can search, insert, delete in O(1) time, that is constant time. Examples of Hash Function are k mod 10, k mod n, Mid Square, Folding method.

Using the Hash function we map the Hash key to the position in the Hash table.

For example if K=20 , k mod 10 = 20 mod 10 = 0. Then 20 will be added to the index 0 in the hash table.

Similarly, 35, 42, 54, and 63 will be added to the hash table as follows

In the case of 70, (70 mod 10) = 0 but the 0th index in the hash table is already occupied by 20.

This situation in which two or more Hash key occupies the same index is called as Collision.

There are two types of Collision resolution technique

1.Open Hashing

2.Closed Hashing

In Open Hashing, there is a method called Chaining. Chaining makes use of the link list data structure to store the newly added key to the index where the collision occurs. In this method deletion of a key as deletion is easy in link-list, but its worst-case time complexity of searching as well as deletion is O(n) when all the keys occupy the same index and also it wastes the given space as address of key are stored in link-list not directly in the hash table.

Chaining

In the Closed Hashing technique also known as open addressing, Linear Probing, Quadratic Probing, and Double Hashing are three methods.

In Linear probing the key is added to the next available(empty) index. Here no extra place is required as in chaining, but in the worst case, the searching time is O(n) which is maybe needed to search from o to n-1. There are also issues like primary clustering (all key values are cluster together).

linear Probing

In Quadratic probing the key is added to the [(K mod 10) + i²] mod 10 positions, where ‘i’ is the number of collisions that occurred.

Quadratic Probing

No extra space is occupied as keys are directly stored in the hash table in this method. Also, Primary clustering is resolved. But the worst-case time complexity is the same O(n), also secondary clustering is introduced, and sometimes there is no guarantee of finding a slot for key in the hash table.

In Double Hashing there are two hash functions, for example h1(k) = k mod 11 and h2(k) = 8-(k mod8) and result is stored in (h1(k) + ih2(k)) mod 11.For example, if keys are 20, 34, 45, 70, and 56.

Double Hashing

Due to the second hashing function, it distributes hash keys uniformly. It gives a guarantee to occupy all the slots. It does not occupy extra space and also there is no primary and secondary clustering. Average and best case time complexity is O(1) and its worst-case time complexity is O(n).

--

--

Apurva Komnak
Apurva Komnak

Written by Apurva Komnak

Instumentation and Control Engineer

Responses (1)