Confused_Engineer

Reputation: 25

Regular expression <(.*?)> returning < or >

I am attempting to run a regular expression to pull a string of any characters from a file that is contained between "<" and ">". The regex that I have come up with is

[ <(.*?)>]

However, when I run this regex using fscanf I only get a "<" or ">" as my output for anything contained within the signs.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>

int next_word(FILE* filename,char word[254])
{
    if (fscanf(filename, "%254[<(.*?)>]", word) == 1)
    {
        printf("%s\n",word);
        return 1;
    }
    else if (fscanf(filename, "%[^a-zA-Z]", word) == 1) { return 1; }
    else if (fscanf(filename, "%254[a-zA-Z]", word) == 1) {return 1; }
    return 0;
}

int main(int argc, char * argv[])
{
    char word[254];
    FILE *infile;

    infile = fopen(argv[2],"r");
    while(1)
    {
        if(next_word(infile,word) == 0)
        {
            break;
        }
    }
}

My input file is as follows:

<test> this is a line <end>

Which gives the output:

<
>

 <
>

but should give

<test>
<end>

Upvotes: 0

Views: 56

Answers (1)

Tom's

Reputation: 2506

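The format strings of the scanf family are not regular expressions. In a conversion such as %254[<(.*?)>], the brackets delimit a scanset: it matches a run of characters drawn from the set <, (, ., *, ?, ) and >, which is why you only ever get a "<" or a ">" back. I also don't think your regex would work as written (you can check it with an online regex tester).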

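You can try the following instead. The width is 253 rather than 254 so that the terminating null byte still fits in your 254-byte buffer:

fscanf(filename, "<%253[^>]>", word) == 1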
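A conversion of this shape only matches when the next character in the stream is "<", so the text between the tags has to be skipped separately. Below is a rough sketch of one way to do that. The function next_tag is hypothetical (it is not part of the question's code), and it assumes the caller supplies a 254-byte buffer as in the question, hence the width of 253.

```c
#include <stdio.h>

/* Hypothetical helper: reads the next <...> token from fp and stores the
   text between the brackets in tag (at least 254 bytes).
   Returns 1 if a complete token was read, 0 otherwise. */
int next_tag(FILE *fp, char tag[254])
{
    int c;

    /* Discard everything up to and including the next '<'. */
    while ((c = fgetc(fp)) != EOF && c != '<')
        ;
    if (c == EOF)
        return 0;

    /* Read at most 253 characters that are not '>'; with the terminating
       null byte this fits the 254-byte buffer. */
    if (fscanf(fp, "%253[^>]", tag) != 1)
        tag[0] = '\0';          /* nothing between the brackets */

    /* Consume the closing '>' (and any overflow of an over-long token). */
    while ((c = fgetc(fp)) != EOF && c != '>')
        ;
    return c == '>';
}
```

Calling next_tag in a loop and printing each result with printf("<%s>\n", tag) should give the two lines the question expects for the sample input.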

Upvotes: 2
