FilT
FilT

Reputation: 5

Converting Static to Dynamic Hash Table

I'be been given the assignment to change a given hash-table with static memory allocation to dynamic, so that more memory can be allocated past the limit while the program is running. I by no means am demanding a solution to the problem, I'm simply asking if anyone knows a good place to start or which aspects of the code I need to pay attention to as I'm a little lost and confused with hash tables. I know the enums and the constructor needs to be changed but I'm not sure of much else. Here is the code given, and thanks in advance for any advice:

#ifndef TABLE1_H
#define TABLE1_H
#include <cstdlib>    // Provides size_t
#include <cassert>  // Provides assert

namespace main_savitch_12A
{
template <class RecordType>
class table
{
public:

enum { CAPACITY = 30 }; 
    // CONSTRUCTOR
    table( );
    // MODIFICATION MEMBER FUNCTIONS
    void insert(const RecordType& entry);
    void remove(int key);
    // CONSTANT MEMBER FUNCTIONS
    bool is_present(int key) const;
    void find(int key, bool& found, RecordType& result) const;
    size_t size( ) const { return used; }
private:
    // MEMBER CONSTANTS -- These are used in the key field of special records.
    enum { NEVER_USED = -1 };
    enum { PREVIOUSLY_USED = -2 };
    // MEMBER VARIABLES
    RecordType data[CAPACITY];
    size_t used;
    // HELPER FUNCTIONS
    size_t hash(int key) const;
    size_t next_index(size_t index) const;
    void find_index(int key, bool& found, size_t& index) const;
    bool never_used(size_t index) const;
    bool is_vacant(size_t index) const;
};

    template <class RecordType>
table<RecordType>::table( )
{
    size_t i;

    used = 0;
    for (i = 0; i < CAPACITY; ++i)
        data[i].key = NEVER_USED;  // Indicates a spot that's never been used.
}

template <class RecordType>
void table<RecordType>::insert(const RecordType& entry)
// Library facilities used: cassert
{
    bool already_present;   // True if entry.key is already in the table
    size_t index;        // data[index] is location for the new entry

    assert(entry.key >= 0);

    // Set index so that data[index] is the spot to place the new entry.
    find_index(entry.key, already_present, index);

    // If the key wasn't already there, then find the location for the new entry.
    if (!already_present)
    {
        assert(size( ) < CAPACITY);
        index = hash(entry.key);
        while (!is_vacant(index))
            index = next_index(index);
        ++used;
    }

    data[index] = entry;
    size_t i;
    for (i=0; i<CAPACITY; i++) cout << data[i].key << ' ';
    cout << endl;
}

template <class RecordType>
void table<RecordType>::remove(int key)
// Library facilities used: cassert
{
    bool found;        // True if key occurs somewhere in the table
    size_t index;   // Spot where data[index].key == key

    assert(key >= 0);

    find_index(key, found, index);
    if (found)
    {   // The key was found, so remove this record and reduce used by 1.
        data[index].key = PREVIOUSLY_USED; // Indicates a spot that's no longer in use.
        --used;
    }
}

template <class RecordType>
bool table<RecordType>::is_present(int key) const
// Library facilities used: assert.h
{
    bool found;
    size_t index;

    assert(key >= 0);

    find_index(key, found, index);
    return found;
}

template <class RecordType>
void table<RecordType>::find(int key, bool& found, RecordType& result) const
// Library facilities used: cassert.h
{
    size_t index;

    assert(key >= 0);

    find_index(key, found, index);
    if (found)
        result = data[index];
}

template <class RecordType>
inline size_t table<RecordType>::hash(int key) const
{
    return (key % CAPACITY);
}

template <class RecordType>
inline size_t table<RecordType>::next_index(size_t index) const
// Library facilities used: cstdlib
{
    return ((index+1) % CAPACITY);
}

template <class RecordType>
void table<RecordType>::find_index(int key, bool& found, size_t& i) const
// Library facilities used: cstdlib
{
size_t count; // Number of entries that have been examined

count = 0;
i = hash(key);
while((count < CAPACITY) && (data[i].key != NEVER_USED) && (data[i].key != key))
{
    ++count;
    i = next_index(i);
}
found = (data[i].key == key);
}

template <class RecordType>
inline bool table<RecordType>::never_used(size_t index) const
{
return (data[index].key == NEVER_USED);
}

template <class RecordType>
inline bool table<RecordType>::is_vacant(size_t index) const
{
return (data[index].key == NEVER_USED);// || (data[index].key == PREVIOUSLY_USED);
}
}

#endif

Upvotes: 0

Views: 522

Answers (2)

chux
chux

Reputation: 153348

@Mark B ideas are the answer.

Wanted to add:
Recommend that your table size CAPACITY is a prime number. Modding a weak hash function hash(key) by a prime number helps the dispersion. (A good hash function needs no help.)

Your growth steps are usually exponential and could be built into a lookup table. Various authors suggests ratios between 1.5 & 4. e. g. Grow2x[] = { 0, 1, 3, 7, 13, 31, 61, ... (primes just under a power of 2 };

Say you are growing 2x each time and your growth loading is 100%. Then your shrink loading should be about the 70% (geometric mean of 100%/2x and 100%). You want to avoid growing/shrinking should your inserts/deletions hover about a critical level.

Upvotes: 0

Mark B
Mark B

Reputation: 96243

A few points to think about:

  • You can use vector instead of the C-style array to hold your elements, because it allows for dynamic resizing.
  • When you need to grow the table, you'll have to rehash all existing elements to put them in new locations in a new container (which you can then swap with the existing container when the rehashing is complete).
  • You'll want to be able to specify a load factor that decides when to grow.
  • You'll need to decide if you want to allow the container to shrink the allocated space, again at some threshold.

Upvotes: 1

Related Questions