Reputation: 1794
I'm implementing my own HashTable for an assignment, and I need to read from a text file, then add its contents to the HashTable. There are a couple of problems that I have run into,
1) Some hash values for strings are coming out as negative numbers.
2) Some words are not being added.
Here's my code for my hash function:
public int hash(String key) {
int hashkey = key.hashCode();
return hashkey % arraySize;
} // END hash()
For the word computer
it returns the integer -97
, to get around this I included a if statement to make it positive if the integer is negative. However, even with this the word computer
, is not added into my HashTable at index 97
.
Some other words that do not get added are: tree, subtree, record, single
among others. Here are my functions for File I/O and adding them into the HashTable:
public static void readParagraph() {
BufferedReader reader;
String inputLine;
try {
reader = new BufferedReader(new FileReader(".\\src\\paragraph.txt"));
while((inputLine = reader.readLine()) != null) {
String[] strings = inputLine.split("(\\s|\\p{Punct})+");
insertParagraph(strings);
}
} catch (IOException e) {
System.out.println("Error: " + e);
}
} // END readParagraph()
private static void insertParagraph(String[] strings) {
Link link;
for (String string : strings) {
if (!string.equals("")) {
link = new Link(string.replaceAll("[^a-zA-Z'\\s]", "").toLowerCase());
paragraph.insert(link);
}
}
} // END insertParagraph()
Does anyone know what is wrong as to why some words are not being added? Or as to why I am getting negative numbers for hash values?
EDIT: More Information
public class A4Q3 {
static HashTable dictionary = new HashTable(1009);
static HashTable paragraph = new HashTable(164);
public static void main(String[] args) {
readParagraph();
paragraph.displayTable();
System.out.println();
System.out.println(paragraph.wordCount);
} // END main()
public void insert(Link link) {
String key = link.getKey();
Link previous = null;
Link current = first;
while(current != null && key.compareTo(current.getKey()) > 0) {
previous = current;
current = current.getNext();
}
if(previous == null) {
first = link;
} else {
previous.setNext(link);
link.setNext(current);
}
} // END insert()
Link Class:
public class Link {
private String data;
private Link next;
public Link(String data) {
this.data = data;
} // END Link()
public String getKey() {
return data;
} // END getKey()
public void displayLink() {
System.out.print(data + " ");
} // END displayLink()
public Link getNext() {
return next;
} // END getNext()
public void setNext(Link next) {
this.next = next;
} // END setNext()
} // END Link
Upvotes: 0
Views: 141
Reputation: 14709
You're not handling negative hash codes properly:
public int hash(String key)
{
int hashkey = key.hashCode();
return hashkey % arraySize;
}
As you said, if hashkey
is negative, you have issues, so what you can do is change your hash
function.
Hash-Code Fix:
public int hash(String key)
{
int hashkey = key.hashCode();
return (hashkey & 0x7FFFFFFF) % arraySize;
}
This will perform a bitwise and
with hashkey
and 0x7FFFFFFF
, which is hexadecimal for 31 1's, ie:
01111111 11111111 11111111 11111111
So, it will turn any input into a positive number, because the bitwise and
will and
every digit in hashkey
with a 1, except for the most-significant digit, which it will and
with a 0. Since java uses two's complement, the most-significant digit is used to indicate the sign. Since 0 & 1
is 0, this value will now always be positive.
The reason that you're getting negative values for the hashes of the String
is because of the hashCode()
for String
.
Reason for negative hash values for String:
public int hashCode()
{
int h = hash;
if (h == 0)
{
int off = offset;
char val[] = value;
int len = count;
for (int i = 0; i < len; i++)
{
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
Note that your source code may not be the same, but nonetheless, the reason for the negative numbers is. If h
becomes greater than Integer.MAX_VALUE
, it will wrap around to the Integer.MIN_VALUE
, ie:
Integer.MAX_VALUE + 1 == Integer.MIN_VALUE
Upvotes: 2