Reputation: 25
I am trying to use a regular expression to pull every string of characters enclosed between "<" and ">" out of a file. The regex that I have come up with is
[ <(.*?)>]
However, when I use this regex with fscanf, the only output I get for each bracketed string is a "<" or a ">".
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>

int next_word(FILE* filename,char word[254])
{
    if (fscanf(filename, "%254[<(.*?)>]", word) == 1)
    {
        printf("%s\n",word);
        return 1;
    }
    else if (fscanf(filename, "%[^a-zA-Z]", word) == 1) { return 1; }
    else if (fscanf(filename, "%254[a-zA-Z]", word) == 1) {return 1; }
    return 0;
}

int main(int argc, char * argv[])
{
    char word[254];
    FILE *infile;
    infile = fopen(argv[2],"r");
    while(1)
    {
        if(next_word(infile,word) == 0)
        {
            break;
        }
    }
}
My input file is as follows:
<test> this is a line <end>
Which gives the output:
<
>
<
>
but should give
<test>
<end>
Upvotes: 0
Views: 56
Reputation: 2506
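The format strings of the scanf family are not regular expressions. In a scanf format, %254[<(.*?)>] is a scanset: it matches a run of characters drawn from the set "<", "(", ".", "*", "?", ")", ">", which is why you only get "<" or ">" back. I also don't think your regex would work in a real regex engine, because the outer square brackets turn it into a character class that matches a single character (you can check it with an online regex tester).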
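You can try the following, which matches a literal "<", reads up to 253 characters other than ">" into word (without the brackets), and then matches the closing ">". The width is 253 rather than 254 because word holds 254 bytes and fscanf also writes a terminating null byte:
fscanf(filename, "<%253[^>]>", word) == 1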
Upvotes: 2