Reputation: 39
I'm currently doing a CS course and one of the tasks was to implement a dictionary. Most functions seem to work fine but when it comes to finding the correct amount of misspelled words it fails (talking about check function).
If I include tmp[n] = '\0'
, to indicate the end of a string, the output is correct but I don't understand why exactly.
I thought by declaring the size of the array tmp[]
to strlen(word)
it wouldn't be necessary to indicate when a string ends.
Do garbage values play any role in this? And how could I improve the speed of the check function?
I hope you guys understand what I mean. If I need to include more, just tell me.
without '\0'
:
with:
// Implements a dictionary's functionality
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <string.h>
#include <strings.h>
#include <stdint.h>
#include "dictionary.h"
// Represents a node in a hash table
typedef struct node {
char word[LENGTH + 1];
struct node *next;
} node;
// Number of buckets in hash table
const unsigned int N = 11230;
// Hash table
node *table[N];
// Returns true if word is in dictionary else false
bool check(const char *word) {
char tmp[strlen(word)];
int n = strlen(word);
for (int i = 0; i < n; i++) {
tmp[i] = tolower(word[i]);
}
tmp[n] = '\0';
int index = hash(tmp);
node *head = table[index];
while (head != NULL) {
if (strcasecmp(head->word, word) == 0) {
return true;
} else {
head = head->next;
}
}
return false;
}
// Hashes word to a number
unsigned int hash(const char *word) {
// got this hash function from https://stackoverflow.com/questions/4384359/quick-way-to-implement-dictionary-in-c
// from user lifebalance
unsigned int count;
unsigned int hashValue = 0;
for (count = 0; word[count] != '\0'; count++)
hashValue = word[count] + (hashValue << 6) + (hashValue << 16) - hashValue;
return (hashValue % N);
}
char buffer[LENGTH + 1];
int count = 0;
// Loads dictionary into memory, returning true if successful else false
bool load(const char *dictionary) {
FILE *file = fopen(dictionary, "r");
if (file != NULL) {
while (fscanf(file, "%s", buffer) != EOF) {
node *newnode = malloc(sizeof(node)); // creating new node
if (newnode == NULL) {
return false;
}
strcpy(newnode->word, buffer); // copying buffer into node->word
newnode->next = NULL;
int index = hash(newnode->word); //hashing index
count++; // count words in dictinoary
if (table[index] == NULL) {
table[index] = newnode;
} else {
// solution 1
/*
node **head = &table[index];
newnode->next = *head;
*head = newnode;
*/
node *head = table[index];
newnode->next = head;
table[index] = newnode;
}
}
fclose(file);
return true;
}
return false;
}
// Returns number of words in dictionary if loaded else 0 if not yet loaded
unsigned int size(void) {
if (count > 0) {
return count;
}
return 0;
}
// Unloads dictionary from memory, returning true if successful else false
bool unload(void) {
for (int i = 0; i < N; i++) { // loop through all buckets
node *head = table[i]; // set head pointer to first item in arraz
while (head != NULL) { // until ->next points to null
node *currentnode = head; // currentnode points to what head points
head = head->next; // pointer head points to next node
free(currentnode); // delete/free currentnode which was the head before
}
}
return true;
}
Upvotes: 1
Views: 78
Reputation: 1738
I thought by declaring the size of the array tmp[] to strlen(word) it wouldn't be necessary to indicate when a string ends.
No.
char tmp[strlen(word)];
, Indicates that you are only declaring a char
array named tmp
with the size of strlen(word)
, the contents of tmp
are not predictable and behavior is undefined if you use that in any operation without storing some characters in it with including a \0
at the end.
Do garbage values play any role in this?
if your tmp
does not ended with \0
, you cannot be sure of where it ends, so you may get strange results.
Upvotes: 0
Reputation: 66
Firstly, strings in C must always end with '\0', otherwise string-processing functions won't know where string ends. You too rely on that in implementation of function hash
:
for(count = 0; word[count] != '\0'; count++)
Secondly, "garbage value" is not garbage, strictly speaking, it's uninitialized. Reading uninitialized values is Undefined Behavior, meaning it may break your code in unpredictable way
P.S. Missed that first time, but you do need allocate extra byte for '\0' in tmp
array:
char tmp[strlen(word) + 1)]
Upvotes: 2