Matthew Brzezinski
Matthew Brzezinski

Reputation: 1794

HashTable not adding strings

I'm implementing my own HashTable for an assignment, and I need to read from a text file, then add its contents to the HashTable. There are a couple of problems that I have run into,

1) Some hash values for strings are coming out as negative numbers.

2) Some words are not being added.

Here's my code for my hash function:

   public int hash(String key) {
        int hashkey = key.hashCode();

        return hashkey % arraySize;
    } // END hash()

For the word computer it returns the integer -97, to get around this I included a if statement to make it positive if the integer is negative. However, even with this the word computer, is not added into my HashTable at index 97.

Some other words that do not get added are: tree, subtree, record, single among others. Here are my functions for File I/O and adding them into the HashTable:

public static void readParagraph() {
    BufferedReader reader;
    String inputLine;

    try {
        reader = new BufferedReader(new FileReader(".\\src\\paragraph.txt"));
        while((inputLine = reader.readLine()) != null) {
            String[] strings = inputLine.split("(\\s|\\p{Punct})+");
            insertParagraph(strings);
        }
    } catch (IOException e) {
        System.out.println("Error: " + e);
    }
} // END readParagraph()

private static void insertParagraph(String[] strings) {
    Link link;
    for (String string : strings) {
        if (!string.equals("")) {
            link = new Link(string.replaceAll("[^a-zA-Z'\\s]", "").toLowerCase());
            paragraph.insert(link);
        }
    }
} // END insertParagraph()

Does anyone know what is wrong as to why some words are not being added? Or as to why I am getting negative numbers for hash values?


EDIT: More Information

public class A4Q3 {
    static HashTable dictionary = new HashTable(1009);
    static HashTable paragraph = new HashTable(164);
    public static void main(String[] args) {
        readParagraph();
        paragraph.displayTable();
        System.out.println();
        System.out.println(paragraph.wordCount);
    } // END main()

    public void insert(Link link) {
        String key = link.getKey();
        Link previous = null;
        Link current = first;

        while(current != null && key.compareTo(current.getKey()) > 0) {
            previous = current;
            current = current.getNext();
        }

        if(previous == null) {
            first = link;
        } else {
            previous.setNext(link);
            link.setNext(current);
        }
    } // END insert()

Link Class:

public class Link {
private String data;
private Link next;

public Link(String data) {
    this.data = data;
} // END Link()

public String getKey() {
    return data;
} // END getKey()

public void displayLink() {
    System.out.print(data + " ");
} // END displayLink()

public Link getNext() {
    return next;
} // END getNext()

public void setNext(Link next) {
    this.next = next;
} // END setNext()

} // END Link

Upvotes: 0

Views: 141

Answers (1)

Steve P.
Steve P.

Reputation: 14709

You're not handling negative hash codes properly:

   public int hash(String key) 
   {
        int hashkey = key.hashCode();

        return hashkey % arraySize;
   } 

As you said, if hashkey is negative, you have issues, so what you can do is change your hash function.

Hash-Code Fix:

   public int hash(String key) 
   {
        int hashkey = key.hashCode();

        return (hashkey & 0x7FFFFFFF) % arraySize;
   } 

This will perform a bitwise and with hashkey and 0x7FFFFFFF, which is hexadecimal for 31 1's, ie:

01111111 11111111 11111111 11111111

So, it will turn any input into a positive number, because the bitwise and will and every digit in hashkey with a 1, except for the most-significant digit, which it will and with a 0. Since java uses two's complement, the most-significant digit is used to indicate the sign. Since 0 & 1 is 0, this value will now always be positive.

The reason that you're getting negative values for the hashes of the String is because of the hashCode() for String.

Reason for negative hash values for String:

public int  hashCode() 
{
        int h = hash;

        if (h == 0) 
        {
            int off = offset;
            char val[] = value;
            int len = count;

            for (int i = 0; i < len; i++) 
            {
                h = 31*h + val[off++];
            }

            hash = h;
        }

        return h;
    }

Note that your source code may not be the same, but nonetheless, the reason for the negative numbers is. If h becomes greater than Integer.MAX_VALUE, it will wrap around to the Integer.MIN_VALUE, ie:

Integer.MAX_VALUE + 1 == Integer.MIN_VALUE 

Upvotes: 2

Related Questions