Reputation: 7225
A comment (which should probably be submitted as an answer) has the code
sscanf(string, "<title>%[^<]</title>", extracted_string);
Running the code seems to copy the text between the <title> tags to extracted_string, but I cannot find any references to a caret in the printf family, either in the man pages or elsewhere online.
Can someone point me to a resource that explains the use of %[^<], and other similar syntax, in the sscanf() family?
Upvotes: 2
Views: 981
Reputation: 16540
This link explains the use of [ and ^ in the scanf family of functions (emphasis mine):
http://www.cdf.toronto.edu/~ajr/209/notes/printf.html
[
Matches a nonempty sequence of characters from the specified set of accepted characters; the next pointer must be a pointer to char, and there must be enough room for all the characters in the string, plus a terminating null byte. The usual skip of leading white space is suppressed. The string is to be made up of characters in (or not in) a particular set; the set is defined by the characters between the open bracket [ character and a close bracket ] character. The set excludes those characters if the first character after the open bracket is a circumflex (^). To include a close bracket in the set, make it the first character after the open bracket or the circumflex; any other position will end the set. The hyphen character - is also special; when placed between two other characters, it adds all intervening characters to the set. To include a hyphen, make it the last character before the final close bracket. For instance, [^]0-9-] means the set "everything except close bracket, zero through nine, and hyphen". The string ends with the appearance of a character not in the (or, with a circumflex, in) set or when the field width runs out.
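The rules quoted above can be tried out with a small sketch. The two helper names below are hypothetical and exist only for illustration. The sketch assumes an implementation such as glibc that treats 0-9 as a range; the C standard leaves the meaning of a hyphen between two characters implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: copies the leading run of digits and hyphens
   from input into out. Returns 1 on a match, 0 otherwise. */
int leading_digits_or_hyphens(const char *input, char out[32])
{
    /* 0-9 is a range (assumed supported); the final '-' just before
       the closing bracket is a literal hyphen. The width 31 leaves
       room for the terminating null byte. */
    return sscanf(input, "%31[0-9-]", out) == 1;
}

/* Hypothetical helper: copies everything up to the first ']' or digit.
   Returns 1 on a match, 0 otherwise. */
int up_to_bracket_or_digit(const char *input, char out[32])
{
    /* '^' negates the set; ']' placed right after '^' is a member of
       the scanlist rather than the end of it. */
    return sscanf(input, "%31[^]0-9]", out) == 1;
}
```

Because %[ does not skip leading white space and needs at least one matching character, calling leading_digits_or_hyphens(" 42", buf) reports no match.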
Upvotes: 1
Reputation: 134286
From the C11 standard document, chapter §7.21.6.2, paragraph 12, conversion specifiers (emphasis mine):
[
Matches a nonempty sequence of characters from a set of expected characters (the scanset).
....
The conversion specifier includes all subsequent characters in the format string, up to and including the matching right bracket (]). The characters between the brackets (the scanlist) compose the scanset, unless the character after the left bracket is a circumflex (^), in which case the scanset contains all characters that do not appear in the scanlist between the circumflex and the right bracket.
A draft version of the standard, found online.
Upvotes: 6
Reputation: 53006
It means match anything that is not a <. It is not a good idea to do that without specifying the maximum destination buffer length. If your destination buffer can hold, say, 100 characters, then
char extracted_string[100];
sscanf(string, "<title>%99[^<]</title>", extracted_string);
would be a better solution.
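It is also worth checking the return value. A minimal sketch, with read_title being a hypothetical wrapper name:

```c
#include <stdio.h>

/* Hypothetical wrapper: returns 1 if a non-empty title was stored in
   extracted_string, 0 otherwise. */
int read_title(const char *string, char extracted_string[100])
{
    /* sscanf returns the number of conversions that were assigned.
       There is only one conversion here, so 1 means the scanset matched.
       The trailing "</title>" comes after the last conversion, so the
       return value does not tell you whether it matched. */
    return sscanf(string, "<title>%99[^<]</title>", extracted_string) == 1;
}
```

Note that the string must begin with <title> for this to match, and that an empty title counts as a failure because %[ requires at least one character.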
Using strstr() for this purpose instead lets you allocate extracted_string dynamically, sized to the actual title.
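A rough sketch of the strstr() approach, assuming the tags appear literally as <title> and </title> (no attributes, no case differences). The function name extract_title is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the text between the
   first <title> and the following </title>, or NULL if either tag is
   missing or allocation fails. The caller must free() the result. */
char *extract_title(const char *html)
{
    static const char open_tag[] = "<title>";
    static const char close_tag[] = "</title>";

    const char *start = strstr(html, open_tag);
    if (start == NULL)
        return NULL;
    start += sizeof open_tag - 1;      /* skip past "<title>" */

    const char *end = strstr(start, close_tag);
    if (end == NULL)
        return NULL;

    size_t len = (size_t)(end - start);
    char *title = malloc(len + 1);     /* +1 for the terminating null */
    if (title == NULL)
        return NULL;

    memcpy(title, start, len);
    title[len] = '\0';
    return title;
}
```

Unlike the sscanf() version, this finds the tag anywhere in the string and returns an empty string for an empty title instead of failing.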
Upvotes: 3