Yousef Hadder

Reputation: 63

scanf skip all until a string

Is it possible to use scanf to skip all characters until I reach a specific string?

I have an HTML file and I want to skip all characters up to and including this string: "<h2><a href=" and then read the HTTP link between the two quotes.

Upvotes: 3

Views: 2858

Answers (2)

Yogeshwar Bagul

Reputation: 31

You can always search for the string href=" and set a pointer just past it. Then copy or scan the characters until you encounter a " again.

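while (*p != '\0' && *p != '"') {
    *buf++ = *p++;   // copy to a buffer (buf is an assumed destination pointer)
}
*buf = '\0';         // terminate the copied string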
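
The search-then-copy idea above can be sketched with strstr from the standard library. The helper name extract_href and its buffer-size parameter are illustrative assumptions, not part of the answer, and the sketch assumes the HTML is already in memory as a string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the first href="..." value found in html into
   out (at most out_size - 1 characters). Returns 1 on success, 0 otherwise. */
int extract_href(const char *html, char *out, size_t out_size)
{
    static const char marker[] = "href=\"";
    const char *p;
    size_t n = 0;

    if (out_size == 0)
        return 0;

    p = strstr(html, marker);
    if (p == NULL) {
        out[0] = '\0';
        return 0;
    }
    p += sizeof marker - 1;            /* step past the opening quote */

    /* Copy until the closing quote, end of input, or a full buffer. */
    while (*p != '\0' && *p != '"' && n + 1 < out_size)
        out[n++] = *p++;
    out[n] = '\0';

    return *p == '"';                  /* succeed only if the closing quote was reached */
}
```

Called as extract_href(html, buf, sizeof buf), it leaves the link in buf. A link that does not fit, or that has no closing quote, is reported as a failure rather than silently truncated.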

Upvotes: 0

Eduardo Pacheco

Reputation: 629

What an old question I've stumbled upon. Still, it's here and I think I have a good answer, so why not leave it for future readers?

You've been told scanf can't do it. Well, I disagree, and here is why:

scanf can ignore everything until it finds the first character of the sought string:

scanf ("%*[^<]");

Then it can try to match the string you are looking for, character by character, discarding it as it goes:

found = scanf ("<h2><a href=\"%[^\"]", str_link) == 1;

If the input at that point is not the sought string, the literal match fails and scanf stops there, never reaching the %[^\"] conversion, which reads and stores everything up to the next " character.

In that case it returns 0, or EOF if the input ran out first, because the return value is the number of variables it was able to fill.

If it does find the string, it goes on to read the link and returns 1.
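
These three outcomes can be observed with sscanf, which follows the same matching rules as scanf but reads from a string. This is only a sketch: the helper name try_match and the 255-character width limit are assumptions added for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: applies the answer's format to a string and returns
   sscanf's result. link must have room for at least 256 chars. */
int try_match(const char *input, char *link)
{
    /* %255[^\"] stores up to 255 non-quote characters plus a terminator. */
    return sscanf(input, "<h2><a href=\"%255[^\"]", link);
}
```

With the input <h2><a href="http://example.com"> the call returns 1 and fills link. With <p> it returns 0, because the literal text stops matching before the conversion is reached. With an empty string it returns EOF, because the input ends before anything can be matched.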

  • note: you should check the documentation for precise information, which can be found at cplusplus.com

    while ( !found && !feof(stdin) )
    {
        scanf ("%*[^<]");
        found = scanf ("<h2><a href=\"%[^\"]", str_link) == 1;
    }
    

I suppose the rest of the file could just be ignored. Up to you.
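
For reference, the loop above could be wrapped into a function along these lines. It is a sketch under a few assumptions that are not in the original: it reads from any FILE * via fscanf rather than stdin, it limits the link to 255 characters to protect the buffer, and it stops on an EOF return value instead of calling feof. The name find_link is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: scans `in` for the first <h2><a href="..."> and stores
   the URL in link (room for at least 256 chars). Returns 1 if found, else 0. */
int find_link(FILE *in, char *link)
{
    int rc;

    for (;;) {
        /* Skip everything up to, but not including, the next '<'. */
        if (fscanf(in, "%*[^<]") == EOF)
            return 0;

        rc = fscanf(in, "<h2><a href=\"%255[^\"]", link);
        if (rc == 1)
            return 1;   /* link captured */
        if (rc == EOF)
            return 0;   /* input ended or a read error occurred */
        /* rc == 0: not the sought tag; the next pass skips to the following '<'. */
    }
}
```

Calling find_link(stdin, str_link) reproduces the original loop. One limitation to keep in mind: characters consumed by a partial match are not re-scanned, so an input such as <h2><h2><a href="..."> would be missed.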

This is a good method, I suppose, because it lets scanf do the scanning in a single pass over the input, and it doesn't require you to store the whole file. The idea can be applied to many other tasks.

scanf is a very powerful tool, though a bit tricky.

Upvotes: 5
