Reputation: 135
I am building a program that is supposed to look for words in a file that have two vowels in a row and end with either ly or ing. I'm currently having some issues with how I'm supposed to read words from the file. My current code looks a little like this:
fgets(string, BUFF_SIZE, file);
char *ptr = strtok(string, delim);
reti = regcomp(&regex, "[aoueiyAOUEIY]+[aoueiyAOUEIY].{0,}(ly|ing|LY|ING)$", REG_EXTENDED);
if (reti){
    fprintf(stderr, "Could not compile regex\n");
    exit(1);
}
/* Execute regular expression */
reti = regexec(&regex, ptr , 0, NULL, 0);
if (!reti) {
    puts("Match");
    printf(" %s\n", string);
}
else if (reti == REG_NOMATCH) {
    puts("No match");
    printf(" %s\n", string);
}
else {
    regerror(reti, &regex, msgbuf, sizeof(msgbuf));
    fprintf(stderr, "Regex match failed: %s\n", msgbuf);
    exit(1);
}
I'm aware that I need some sort of loop so that I can check more than one word. I wanted to try how strtok would work, but realised that I still face the same problem. If I have, for example, the line fairly standing. jumping? hoping! there are just too many characters that a word can end with. How do I make my delim understand that it's at the end of a word? I'm thinking of using a second regex that matches only letters and comparing until I get a REG_NOMATCH. But the issue with that is that the buffer will fill up very quickly.
Upvotes: 0
Views: 74
Reputation: 44329
For a task like this it's important to define what a "word" is.
For instance, consider "bad!idea this!is". Is that the 4 words "bad", "idea", "this", "is", or the 4 words "bad!", "idea", "this!", "is", or just the two words "bad!idea" and "this!is"?
And what if the input is "bad3idea this9is"?
Sometimes the standard functions (e.g. strtok, fscanf) will fit your needs, and in such cases you should use them.
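As a rough illustration of the strtok route, here is a minimal sketch that splits a line on whitespace and a handful of punctuation characters. The helper name split_words and the delimiter set used in the usage note are assumptions for illustration only; which characters end a word depends on how you define "word".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` in place on any character in `delims`
 * and stores pointers to at most `max` tokens in `out`.
 * Returns the number of tokens stored. strtok modifies `line`. */
static size_t split_words(char *line, const char *delims, char *out[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(line, delims);

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, delims);   /* continue in the same string */
    }
    return n;
}
```

Calling it on a writable copy of the line from the question with the delimiter string " \t\r\n.,;:!?" yields the four tokens fairly, standing, jumping and hoping. Any character you forget to list (a digit, a quote, a bracket) stays glued to the word, which is the weakness the question describes.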
In case the standard functions do not fit, you can use fgetc to implement something that fits your needs.
The example below treats anything that is not a letter (i.e. not a-z or A-Z) as a word delimiter.
int end_of_file = 0;
while(!end_of_file)
{
    int index = 0;
    int c = fgetc(file);
    if (c == EOF) break; // Done with the file
    while (isalpha(c))
    {
        string[index] = c;
        ++index;
        if (index == BUFF_SIZE)
        {
            // oh dear, the buffer is too small
            //
            // Just end the program...
            exit(1);
        }
        c = fgetc(file);
        if (c == EOF)
        {
            end_of_file = 1;
            break;
        }
    }
    string[index] = '\0';
    if (index >= 4) // We need at least 4 chars for a match
    {
        // do the regex stuff
    }
}
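To show how the "do the regex stuff" part might plug in, here is a hedged sketch that combines an fgetc-based word reader with a POSIX regexec call. The names next_word, count_matching_words, WORD_MAX and WORD_PATTERN are made up for this example, and the pattern is a shortened form of the one in the question (it still counts y as a vowel). Unlike the loop above, this sketch truncates overlong words instead of exiting; that is a design assumption, not a requirement.

```c
#include <ctype.h>
#include <regex.h>
#include <stdio.h>

#define WORD_MAX 256  /* assumed upper bound on word length, terminator included */

/* Assumed pattern: two vowels in a row somewhere, then ly or ing at the end. */
#define WORD_PATTERN "[aeiouyAEIOUY]{2}.*(ly|ing|LY|ING)$"

/* Reads the next run of letters from `file` into `word` (size >= 2).
 * Returns the word length, or 0 once the end of the file is reached.
 * Letters beyond `size - 1` are read but dropped. */
static size_t next_word(FILE *file, char *word, size_t size)
{
    int c;
    size_t len = 0;

    /* Skip everything that is not a letter. */
    while ((c = fgetc(file)) != EOF && !isalpha(c))
        ;

    /* Collect letters until a non-letter or EOF. */
    while (c != EOF && isalpha(c)) {
        if (len + 1 < size)
            word[len++] = (char)c;
        c = fgetc(file);
    }
    word[len] = '\0';
    return len;
}

/* Prints every word in `file` that matches the compiled regex `re`
 * and returns how many matched. */
static int count_matching_words(FILE *file, const regex_t *re)
{
    char word[WORD_MAX];
    int count = 0;

    while (next_word(file, word, sizeof word) > 0) {
        if (regexec(re, word, 0, NULL, 0) == 0) {
            printf("Match: %s\n", word);
            ++count;
        }
    }
    return count;
}
```

The regex would be compiled once before the loop, e.g. regcomp(&regex, WORD_PATTERN, REG_EXTENDED | REG_NOSUB), and released with regfree afterwards. Because each word is handed to regexec on its own, the trailing $ anchors at the end of the word rather than at the end of the line.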
Upvotes: 1