Reputation: 12616
I'm trying to implement an efficient hash table where collisions are solved using linear probing with step. This function has to be as efficient as possible. No needless =
or ==
operations. My code is working, but not efficient. This efficiency is evaluated by an internal company system. It needs to be better.
There are two classes representing a key/value pair: CKey
and CValue
. These classes each have a standard constructor, copy constructor, and overridden operators =
and ==
. Both of them contain a getValue()
method returning value of internal private variable. There is also the method getHashLPS()
inside CKey
, which return hashed position in hash table.
int getHashLPS(int tableSize,int step, int collision) const
{
return ((value + (i*step)) % tableSize);
}
Hash table.
class CTable
{
struct CItem {
CKey key;
CValue value;
};
CItem **table;
int valueCounter;
}
Methods
// return collisions count
int insert(const CKey& key, const CValue& val)
{
int position, collision = 0;
while(true)
{
position = key.getHashLPS(tableSize, step, collision); // get position
if(table[position] == NULL) // free space
{
table[position] = new CItem; // save item
table[position]->key = CKey(key);
table[position]->value = CValue(val);
valueCounter++;
break;
}
if(table[position]->key == key) // same keys => overwrite value
{
table[position]->value = val;
break;
}
collision++; // current positions is full, try another
if(collision >= tableSize) // full table
return -1;
}
return collision;
}
// return collisions count
int remove(const CKey& key)
{
int position, collision = 0;
while(true)
{
position = key.getHashLPS(tableSize, step, collision);
if(table[position] == NULL) // free position - key isn't in table or is unreachable bacause of wrong rehashing
return -1;
if(table[position]->key == key) // found
{
table[position] = NULL; // remove it
valueCounter--;
int newPosition, collisionRehash = 0;
for(int i = 0; i < tableSize; i++, collisionRehash = 0) // rehash table
{
if(table[i] != NULL) // if there is a item, rehash it
{
while(true)
{
newPosition = table[i]->key.getHashLPS(tableSize, step, collisionRehash++);
if(newPosition == i) // same position like before
break;
if(table[newPosition] == NULL) // new position and there is a free space
{
table[newPosition] = table[i]; // copy from old, insert to new
table[i] = NULL; // remove from old
break;
}
}
}
}
break;
}
collision++; // there is some item on newPosition, let's count another
if(collision >= valueCounter) // item isn't in table
return -1;
}
return collision;
}
Both functions return collisions count (for my own purpose) and they return -1
when the searched CKey
isn't in the table or the table is full.
Tombstones are forbidden. Rehashing after removing is a must.
Upvotes: 1
Views: 4493
Reputation: 2929
The biggest change for improvement I see is in the removal function. You shouldn't need to rehash the entire table. You only need to rehash starting from the removal point until you reach an empty bucket. Also, when re-hashing, remove and store all of the items that need to be re-hashed before doing the re-hashing so that they don't get in the way when placing them back in.
Another thing. With all hashes, the quickest way to increase efficiency to to decrease the loadFactor (the ratio of elements to backing-array size). This reduces the number of collisions, which means less iterating looking for an open spot, and less rehashing on removal. In the limit, as the loadFactor approaches 0, collision probability approaches 0, and it becomes more and more like an array. Though of course memory use goes up.
Update You only need to rehash starting from the removal point and moving forward by your step size until you reach a null. The reason for this is that those are the only objects that could possibly change their location due to the removal. All other objects would wind up hasing to the exact same place, since they don't belong to the same "collision run".
Upvotes: 2
Reputation: 44250
A possible improvement would be to pre-allocate an array of CItems, that would avoid the malloc()s / news and free() deletes; and you would need the array to be changed to "CItem *table;"
But again: what you want is basically a smooth ride in a car with square wheels.
Upvotes: 0