Reputation: 631
What is the difference between primary and secondary clustering in hash collision management?
Upvotes: 44
Views: 77378
Reputation: 11
Primary Clustering: Primary clustering, refers to the phenomenon where collided keys form clusters or chains in adjacent slots during collision resolution. When a collision occurs and linear probing is used, consecutive slots are checked until an empty slot is found. If keys frequently collide and form clusters, it can lead to longer probe sequences and increased search times. Primary clustering can result in degraded performance as the clusters become longer and more keys are packed together.
Secondary Clustering: Secondary clustering refers to the tendency for keys to form clusters in the probe sequence due to a poor choice of secondary hash function or step size in double hashing. When using double hashing, the secondary hash function is used to determine the step size for each probe. If the step size is not well-distributed or poorly chosen, it can lead to secondary clustering. Keys that collide at the same primary position may follow the same probe sequence and form clusters at different slots in the table. Secondary clustering can also impact the search performance and efficiency of hash-based data structures.
Upvotes: 1
Reputation: 48149
x
, subsequent probes go to x+1
,
x+2
, x+3
and so on, this results in Primary Clustering.x
, probes go to x+1
, x+4
, x+9
,
x+16,
x+25
and so on, this results in Secondary Clustering.Upvotes: 141
Reputation: 43738
Primary clustering means that if there is a cluster and the initial position of a new record would fall anywhere in the cluster the cluster size increases. Linear probing leads to this type of clustering.
Secondary clustering is less severe, two records do only have the same collision chain if their initial position is the same. For example quadratic probing leads to this type of clustering.
Upvotes: 29