Reputation: 99
I have a dictionary of words in a text file, and I need to find certain words in it, such as words that are made up of the letters {q, a, z, w, s, x, e, d, c, r, f, v, t, g, b} or words that end in {d, o, u, s}. I am looking for a way to do this. Would it be easiest to put all the words into an array, or should I keep everything in the text file? I've tried the text file approach but am stuck. Here's what I have. Many thanks!
int size, count;
char *p;
char *words[];
FILE * dict_file;
dict_file = fopen("MyDictionary.txt", "r");
fseek(dict_file, 0, SEEK_END); // seek to end of file
size = ftell(dict_file); // get current file pointer
fseek(dict_file, 0, SEEK_SET); // seek back to beginning of file
// proceed with allocating memory and reading the file
p = dictionary;
while (p = fgets(p, size, dict_file))
{
p += strlen(p);
words[count] = p;
count++;
}
Upvotes: 0
Views: 6958
Reputation: 10377
If you are working on a POSIX-compliant system, you might want to take a look at <regex.h>.
That way you could search for your words with regular expressions. I guess something like:
"([qazwsxedcrfvtgb]+)[^[:alpha:]]"
and "([[:alpha:]]*[dous])[^[:alpha:]]"
in your case, but be sure to adapt them to your specific needs.
int regcomp(regex_t *preg, const char *regex, int cflags);
int regexec(const regex_t *preg, const char *string, size_t nmatch,
regmatch_t pmatch[], int eflags);
void regfree(regex_t *preg);
These are the functions to look at.
You could go with something like:
regex_t regex;
regmatch_t *match;
char *pos = p;
int n_matches;
regcomp (&regex, "your-regular-expression", REG_EXTENDED);
n_matches = regex.re_nsub + 1; /* whole match plus one entry per subpattern */
match = malloc (n_matches * sizeof (regmatch_t));
while (!regexec (&regex, pos, n_matches, match, 0)) {
/* extract key and value from subpatterns
available in match[i] for i-th submatch
... */
pos += match[0].rm_eo; /* continue after the end of this match */
}
regfree (&regex);
free (match);
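If the words are already split up (one word per string, newline stripped), a simpler variant is to test each word as a whole instead of scanning a long buffer. The following is only a sketch: word_matches is a hypothetical helper name, and the anchored patterns "^[qazwsxedcrfvtgb]+$" and "[dous]$" are my guess at what the question wants.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if `word` matches the POSIX extended
   regular expression `pattern`. Returns false if the pattern does not
   compile. Anchor the pattern with ^ and $ to require a whole-word match. */
bool word_matches(const char *word, const char *pattern)
{
    regex_t regex;

    /* REG_NOSUB: we only need a yes/no answer, not match offsets. */
    if (regcomp(&regex, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;

    bool ok = (regexec(&regex, word, 0, NULL, 0) == 0);
    regfree(&regex);
    return ok;
}
```

For example, word_matches(w, "^[qazwsxedcrfvtgb]+$") checks the letter set and word_matches(w, "[dous]$") checks the ending. Compiling the pattern on every call is wasteful for a large dictionary; in real code you would probably call regcomp once and reuse the regex_t.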
Upvotes: 0
Reputation: 129474
Clearly, this is wrong:
FILE * dict_file;
fseek(dict_file, 0, SEEK_END); // seek to end of file
size = ftell(dict_file); // get current file pointer
fseek(dict_file, 0, SEEK_SET); // seek back to beginning of file
// proceed with allocating memory and reading the file
dict_file = fopen("MyDictionary.txt", "r");
You can't (correctly) use a file until you have opened it, so the middle three lines will definitely produce some unpredictable result. Most likely size becomes a negative number or zero, either of which will probably upset the following fgets calls.
This is not shown in your code, but I expect you are calling malloc() or something?
p = dictionary;
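To make the ordering concrete, here is a minimal sketch of "open, find the size, allocate, read". The helper name read_whole_file is made up for illustration, and it uses fread rather than your fgets loop, so treat it as one possible way to do it rather than a fix of your exact code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a freshly
   allocated, NUL-terminated buffer. Returns NULL on any failure.
   The caller must free() the result. */
char *read_whole_file(const char *path, long *out_size)
{
    FILE *fp = fopen(path, "r");   /* open FIRST, then seek */
    if (fp == NULL)
        return NULL;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);          /* offset at end of file = size */
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }

    char *buf = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    /* In text mode fread may return fewer bytes than `size` on some
       systems, so terminate at what was actually read. */
    size_t got = fread(buf, 1, (size_t)size, fp);
    buf[got] = '\0';
    fclose(fp);

    if (out_size != NULL)
        *out_size = (long)got;
    return buf;
}
```

With this, something like dictionary = read_whole_file("MyDictionary.txt", &size); gives you the buffer that p can then walk over.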
And while you are fixing the above errors, you may want to replace this:
while (*p != '\0')
{
p += 1;
}
with:
p += strlen(p)-1;
[You may want to remove the -1 if you actually want a '\0' between each string.]
Now, having said that, I would probably take the approach of having an array of pointers to each string, rather than storing everything in one humongous single string. That way, you could simply move from string to string. You can still use your long string as above, but have a secondary variable with the pointers to the start of each string [and keep the zero, so remove the -1 from the above].
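A rough sketch of that idea, assuming the whole file already sits in one NUL-terminated buffer with one word per line. The name split_lines and the fixed-capacity words array are my own choices for illustration, not something from your code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `buf` in place at line endings by
   overwriting each '\r' or '\n' with '\0', and stores a pointer to the
   start of every non-empty line in words[]. Returns the number of
   pointers stored, which is at most max_words. */
size_t split_lines(char *buf, char **words, size_t max_words)
{
    size_t count = 0;
    char *p = buf;

    while (*p != '\0' && count < max_words) {
        size_t len = strcspn(p, "\r\n");   /* length of this line */
        char *next = p + len;

        if (*next != '\0')
            *next++ = '\0';                /* terminate line, step past it */
        if (len > 0)
            words[count++] = p;            /* remember where the word starts */
        p = next;
    }
    return count;
}
```

After the call, words[0] through words[count - 1] are ordinary C strings that all live inside the one big buffer, so there is still only one allocation to free.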
I would then write one function that answers "does this string consist of these letters?" and another that answers "does this string end with these letters?". Both should be relatively trivial if you have some idea of how to do string handling in general.
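As a sketch of what those two functions might look like using only <string.h>: the names made_of and ends_with_any are made up, and I am assuming "ends in {d,o,u,s}" means the last character is one of those four letters.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `word` is non-empty and every character of it appears in
   `letters`. strspn returns the length of the leading run of allowed
   characters, so the word qualifies if that run reaches the terminator. */
bool made_of(const char *word, const char *letters)
{
    return word[0] != '\0' && word[strspn(word, letters)] == '\0';
}

/* True if the last character of `word` is one of `letters`. */
bool ends_with_any(const char *word, const char *letters)
{
    size_t len = strlen(word);
    return len > 0 && strchr(letters, word[len - 1]) != NULL;
}
```

You would then loop over your words and call, say, made_of(word, "qazwsxedcrfvtgb") or ends_with_any(word, "dous"). Both are case-sensitive as written, so lowercase the input first if the dictionary has capitalized entries.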
Upvotes: 1