What is primary and secondary clustering in hash?

Rickx picture Rickx · Jan 2, 2015 · Viewed 53.4k times · Source

I am confused for the last few days in finding the difference between primary and secondary clustering in hash collision management topic in the textbook I am reading.

Answer

Yogesh Umesh Vaity picture Yogesh Umesh Vaity · Apr 10, 2016

Primary Clustering

  1. Primary clustering is the tendency for a collision resolution scheme such as linear probing to create long runs of filled slots near the hash position of keys.
  2. If the primary hash index is x, subsequent probes go to x+1, x+2, x+3 and so on, this results in Primary Clustering.
  3. Once the primary cluster forms, the bigger the cluster gets, the faster it grows. And it reduces the performance.

enter image description here


Secondary Clustering

  1. Secondary clustering is the tendency for a collision resolution scheme such as quadratic probing to create long runs of filled slots away from the hash position of keys.
  2. If the primary hash index is x, probes go to x+1, x+4, x+9, x+16, x+25 and so on, this results in Secondary Clustering.
  3. Secondary clustering is less severe in terms of performance hit than primary clustering, and is an attempt to keep clusters from forming by using Quadratic Probing. The idea is to probe more widely separated cells, instead of those adjacent to the primary hash site.

enter image description here