Reputation: 183
I'm writing some C++, trying to implement a hash table using arrays.
The hashing function is taken from this Stack Overflow post.
Since the hash table is dynamically resized in runtime, the following simple strategy is implemented:
% table's total size
.# of elements inserted (= current size) >= table's total size
,new table
with double the size, and re-insert all the elements.The table's buckets each hold a value, and a boolean indicating whether said element should be inside the table.
Note that the table is being made only for non-negative integer values.
I can't get past the table construction: Memory usage gradually increases, as well as CPU's.
This should mean that there's a problem related with the condition checking on when to grow the table's size.
Table's constructor and destructor:
// I'm defining these here because they depend on the integer's bit size.
#define INVERSE 0x119de1f3
#define HASHER 0x45d9f3b
HashTable::HashTable(size_t initial_size, uint32_t *initial_data)
{
total_size = 4U * initial_size;
current_size = initial_size;
data = new table[total_size];
build(initial_data);
}
HashTable::~HashTable()
{
delete[] data;
total_size = 0U;
current_size = 0U;
}
To build, we hash and put (insert) each initial element into the table:
void HashTable::build(uint32_t *initial_data)
{
for (size_t i = 0U; i < current_size; i += 1U)
{
insert(initial_data[i]);
}
}
To insert, we also use a hash function:
void HashTable::insert(uint32_t num)
{
uint32_t index = hash(num);
try
{
data[index].num = num;
data[index].exists = true;
current_size += 1U;
}
catch (out_of_range& error)
{
; // TODO stuff after writing main.
}
}
uint32_t HashTable::hash(uint32_t num)
{
if (current_size >= total_size / 2U) {
doubleTable();
for (size_t i = 0U; i < current_size; i += 1U)
{
insert(data[i].num);
}
}
num = ((num >> 16) ^ num) * HASHER;
num = ((num >> 16) ^ num) * HASHER;
num = (num >> 16) ^ num;
return (num % total_size);
}
Lastly, here is a dummy main function:
int main(void)
{
uint32_t initial_data[3] = {1, 3, 5};
HashTable h(3U, initial_data);
return 0;
}
This is how the table is structured:
struct table
{
uint32_t num;
bool exists = false;
};
And the relevant definitions:
class HashTable
{
private:
table *data;
size_t total_size;
size_t current_size;
uint32_t hash(uint32_t num);
uint32_t unhash(uint32_t num);
void build(uint32_t *initial_data);
void doubleTable(void);
public:
HashTable(size_t initial_size, uint32_t *initial_data);
~HashTable();
void insert(uint32_t num);
After compiling with -ggdb3
and running valgrind, I get a pestering uninitialized value of size 8
error.
Here are some valgrind logs:
--1632-- Valgrind options:
--1632-- --leak-check=full
--1632-- --show-leak-kinds=all
--1632-- --track-origins=yes
--1632-- --verbose
--1632-- --log-file=valgrind-out.txt
==1632== Use of uninitialised value of size 8
==1632== at 0x1093E8: HashTable::insert(unsigned int) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x1095DC: HashTable::hash(unsigned int) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x1093BD: HashTable::insert(unsigned int) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x1096B0: HashTable::build(unsigned int*) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x10933C: HashTable::HashTable(unsigned long, unsigned int*) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x10920A: main (main.cpp:15)
==1632== Uninitialised value was created by a heap allocation
==1632== at 0x483950F: operator new[](unsigned long) (vg_replace_malloc.c:423)
==1632== by 0x1092F7: HashTable::HashTable(unsigned long, unsigned int*) (in /home/xlxs4/Git/CCDT/src/structures/hashtable/a.out)
==1632== by 0x10920A: main (main.cpp:15)
I've already gone through this post on eliminating undefined values and this SO post about uninitialized value of size 8.
What am I missing?
Let me know if you want any further classifications, or want me to further explain my thought process.
Thanks in advance :)
Upvotes: 2
Views: 174
Reputation: 3386
xlxs4:
This shouldn't be duplicated as @Frank said above. Nonetheless, I figured it out. I re-wrote the code according to the rule of three principles and I utilized initialization lists. What caused the problem I described in the post, is that i < current_size in build will never terminate, as current_size gets incremented each time insert is being called. I don't have sufficient reputation to answer my own question, as it seems. Thanks to all who helped!
Upvotes: 1