pasadinhas

Reputation: 373

scanf regex - C

I needed to read a string until the following sequence is written: \nx\n :

(.....)\n
x\n

\n is the new line character and (.....) can be any characters that may include other \n characters.

scanf allows regular expressions as far as I know, but I can't make it read a string until this pattern. Can you help me with the scanf format string?


I was trying something like:

char input[50000];
scanf(" %[^(\nx\n)]", input);

but it doesn't work.

Upvotes: 12

Views: 36336

Answers (2)

zwol

Reputation: 140836

scanf does not support regular expressions. It has limited support for character classes but that's not at all the same thing.

Never use scanf, fscanf, or sscanf, because:

  1. Numeric overflow triggers undefined behavior. The C runtime is allowed to crash your program just because someone typed too many digits.
  2. Some format specifiers (notably %s without a field width) are unsafe in exactly the same way gets is unsafe, i.e. they will cheerfully write past the end of the provided buffer and crash your program.
  3. They make it extremely difficult to handle malformed input robustly.

You don't need regular expressions for this case; read a line at a time with getline (a POSIX function) and stop when the line read is just "x". If you do need real regular expressions, the standard (POSIX, not ISO C) library routines are called regcomp and regexec.
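As a rough illustration of the line-at-a-time approach, here is a minimal sketch. The function name read_until_x_line and the choice to return one malloc'd string are assumptions made for this example, not part of the answer; it also treats a line "x" at the very start of the input as the terminator, which is slightly looser than the exact \nx\n sequence in the question.

```c
#define _POSIX_C_SOURCE 200809L  /* for getline */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads lines from `in` until a line that is exactly "x"
   (or until end of input). Returns a malloc'd, NUL-terminated string holding
   every line read before that one, or NULL if allocation fails.
   The caller frees the result. */
char *read_until_x_line(FILE *in)
{
    assert(in != NULL);

    char *line = NULL;   /* buffer owned and grown by getline */
    size_t cap = 0;
    ssize_t n;
    size_t len = 0;
    char *text = malloc(1);

    if (text == NULL)
        return NULL;
    text[0] = '\0';

    while ((n = getline(&line, &cap, in)) != -1) {
        /* The terminator line may or may not end with a newline (EOF case). */
        if (strcmp(line, "x\n") == 0 || strcmp(line, "x") == 0)
            break;

        char *grown = realloc(text, len + (size_t)n + 1);
        if (grown == NULL) {
            free(text);
            free(line);
            return NULL;
        }
        text = grown;
        memcpy(text + len, line, (size_t)n);
        len += (size_t)n;
        text[len] = '\0';
    }

    free(line);
    return text;
}
```

Calling read_until_x_line(stdin) would collect everything typed before the lone "x" line, including the newline that ends the last collected line.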

Upvotes: 13

Sergey Kalinichenko

Reputation: 726987

"scanf allows regular expressions as far as I know"

Unfortunately, it does not allow regular expressions: the syntax is misleadingly close, but there is nothing even remotely similar to a regex engine in the implementation of scanf. All it supports is regex-style character classes, so %[<something>] is treated roughly as [<something>]+ (at least one character must match). That's why your call to scanf translates into "read a string consisting of characters other than '(', ')', 'x', and '\n'".
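To see the character-class behavior concretely, here is a small sketch that applies the question's format string to a string with sscanf. The wrapper name scan_with_question_format and the 64-byte buffer (with a matching field width of 63) are assumptions added for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical wrapper: runs the question's format string against `src`.
   `out` must have room for 64 bytes; the width 63 keeps sscanf inside it.
   Returns the number of conversions sscanf performed (0 or 1 here). */
int scan_with_question_format(const char *src, char out[64])
{
    assert(src != NULL && out != NULL);

    out[0] = '\0';
    /* The set excludes '(', ')', 'x' and '\n' individually; it is not a
       pattern for the three-character sequence "\nx\n". */
    return sscanf(src, " %63[^(\nx\n)]", out);
}
```

With the input "some text\nx\n" this stores "some te": reading stops at the first 'x', which happens to be inside the word "text", long before the intended terminator.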

To solve your problem at hand, you can set up a loop that reads the input character by character. Every time you get a '\n', check that

  • You have seen at least three characters of input so far,
  • The character immediately before the '\n' is an 'x', and
  • The character before the 'x' is another '\n'

If all of the above are true, you have reached the end of your anticipated input sequence; otherwise, your loop should continue.
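The three checks above could be sketched as follows. The function name read_until_marker, the fixed caller-supplied buffer, and the decision to strip the trailing "x\n" from the result are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads characters from `in` into `buf` (capacity `cap`)
   until the last three characters read are "\nx\n", the input ends, or the
   buffer is full. The "x\n" terminator is dropped from the result, but the
   newline before it is kept. Returns the length of the stored string. */
size_t read_until_marker(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap > 0);

    size_t len = 0;
    int c;

    while (len + 1 < cap && (c = getc(in)) != EOF) {
        buf[len++] = (char)c;

        if (c == '\n'                 /* just read a newline            */
            && len >= 3               /* at least three characters seen */
            && buf[len - 2] == 'x'    /* preceded by 'x'                */
            && buf[len - 3] == '\n')  /* which is preceded by '\n'      */
        {
            len -= 2;                 /* discard the "x\n" terminator   */
            break;
        }
    }

    buf[len] = '\0';
    return len;
}
```

With the question's buffer this would be called as read_until_marker(stdin, input, sizeof input). Note that an "x" line at the very start of the input does not end the loop, because the three-character check requires a preceding newline.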

Upvotes: 26
