Reputation: 1692
I am having trouble implementing my hash function for my hash table.
I want to hash my words such that A = 1, B = 2, C = 3, and so on. The position of the letter in the word is irrelevant, since we will consider permutations of the word. Moreover, the case of the letter will be irrelevant in this problem as well, so the value of a = the value of A = 1.
And for strings, abc = 1 + 2 + 3 = 6, bc = 2 + 3 = 5, etc.
And for cases where ab = 3 and aaa = 3, I have already had a way to handle that situation. Right now I just want to get the hash value.
The problem I am having right now is that aaa is giving me 1, and ab is giving me 2.
Below is my code:
int hash(char *word)
{
int h = 1;
int i, j;
char *A;
char *a;
// an array of 26 slots for 26 uppercase letters in the alphabet
A = (char *)malloc(26 * sizeof(char));
// an array of 26 slots for 26 lowercase letters in the alphabet
a = (char *)malloc(26 * sizeof(char));
for (i = 0; i < 26; i++) {
A[i] = (char)(i + 65); // fill the array from A to Z
a[i] = (char)(i + 97); // fill the array from a to z
}
for (i = 0; i < strlen(word); i++) {
//printf("HIT\n");
for (j = 0; j < 26; j++) {
// upper and lower case have the same hash value
if (word[i] == A[j] || word[i] == a[j]) {
h = h + j; // get the hash value of the word
//printf("HIT 2\n");
break;
}
}
}
printf("H: %d\n", h);
return h;
}
Upvotes: 0
Views: 2531
Reputation: 16156
Multiple issues:
Why a dynamic array? You know the size, it isn't going to change. You could use
char A[26];
char a[26]; // you can also add initialisation, e.g. = {'a', 'b', ...
Why an array in the first place?
So, here is the quick fix, staying close to your code.
Taking all of the above into account, you could simplify to:
int hash(char const * string) {
int h = 0;
for (; *string; ++string) {
int index = tolower(*string) - 'a' + 1;
if ((index > 0) && (index < 27)) {
h += index;
}
}
return h;
}
When only hashing words with non special characters, you need to handle ignored words in the caller somehow.
char hash(char const * string, int * h) {
*h = 0;
for (; *string; ++string) {
int index = tolower(*string) - 'a' + 1;
if ((index > 0) && (index < 27)) {
*h += index;
} else {
return 0;
}
}
return 1;
}
That way you can use the return value to test if the word should be ignored.
Upvotes: 2
Reputation: 2785
change h=h+j
to h=h+j+1
and h=1
to h=0
.
Also you should free the allocated memory so include these lines just before return:
free(A);
free(a);
However I don't understand why so complicated code was written for such a simple task.
A much simpler code can be written:
int hash(char *word)
{
int sum=0;
while(*word != '\0')
{
if(*word >='A' && *word < 'A'+26)
sum=sum+(*word -'A' + 1);
else if(*word >='a' && *word < 'a'+26)
sum=sum+(*word -'a' + 1);
else
return -1;
word++;
}
return sum;
}
Upvotes: 2
Reputation: 20244
I think that changing
int h = 1;
to
int h = 0;
and
h = h + j;
to
h = h + j + 1;
will fix the issue.
The one other problem is that you forgot to free the malloc
ed memory. Also, there is no need to cast the result of malloc
(and family) in C.
This
for (i = 0; i < strlen(word); i++) {
will call strlen
in every iteration of the loop. This will reduce the performance of your program. Use
int len = strlen(word);
for (i = 0; i < len; i++) {
instead, which is much faster as strlen
isn't called in every iteration. Lastly, sizeof(char)
is 1. So you can omit it.
Upvotes: 5