Reputation: 1903
Can we format a std::regex string with whitespace and line breaks that get ignored, just for better readability? Is there an option available like Python's re.VERBOSE?
Without verbose:
charref = re.compile("&#(0[0-7]+"
"|[0-9]+"
"|x[0-9a-fA-F]+);")
With verbose:
charref = re.compile(r"""
&[#] # Start of a numeric entity reference
(
0[0-7]+ # Octal form
| [0-9]+ # Decimal form
| x[0-9a-fA-F]+ # Hexadecimal form
)
; # Trailing semicolon
""", re.VERBOSE)
Upvotes: 2
Views: 1212
Reputation: 2765
Simply split the string into multiple literals and use C++ comments like so:
std::regex rgx(
"&[#]" // Start of a numeric entity reference
"("
"0[0-7]+" // Octal form
"|[0-9]+" // Decimal form
"|x[0-9a-fA-F]+" // Hexadecimal form
")"
";" // Trailing semicolon
);
The compiler then concatenates them into "&[#](0[0-7]+|[0-9]+|x[0-9a-fA-F]+);". This also lets you include significant whitespace in the regex: anything inside the quotes is kept, while whitespace between the literals is ignored. However, the additional quotation marks can make this a little laborious to write.
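As a minimal sketch of how this could be checked, the concatenated pattern can be wrapped in small helpers. The names `charref_pattern` and `is_charref` are hypothetical and only serve the illustration:

```cpp
#include <regex>
#include <string>

// Adjacent string literals are merged by the compiler into one pattern;
// the C++ comments between them never reach std::regex.
inline const char* charref_pattern() {
    return
        "&[#]"            // Start of a numeric entity reference
        "("
        "0[0-7]+"         // Octal form
        "|[0-9]+"         // Decimal form
        "|x[0-9a-fA-F]+"  // Hexadecimal form
        ")"
        ";";              // Trailing semicolon
}

// Hypothetical helper: true if the whole text is a numeric entity reference.
inline bool is_charref(const std::string& text) {
    static const std::regex rgx(charref_pattern());
    return std::regex_match(text, rgx);
}
```

With these helpers, `is_charref("&#x1F;")` should hold, while `is_charref("&#;")` should not.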
Upvotes: 8
Reputation: 275750
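#include <algorithm>
#include <cctype>
#include <string>

inline std::string remove_ws(std::string in) {
  // A lambda is used instead of a bare std::isspace: that name is overloaded,
  // and taking unsigned char avoids undefined behaviour for negative char values.
  in.erase(std::remove_if(in.begin(), in.end(),
                          [](unsigned char c) { return std::isspace(c) != 0; }),
           in.end());
  return in;
}
inline std::string operator""_nows(const char* str, std::size_t length) {
  return remove_ws({str, str+length});
}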
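A usage sketch of the `_nows` literal, assuming a made-up key/value pattern (`key_value_pattern` is a hypothetical name). One thing to keep in mind: because every whitespace character is stripped, a space the pattern is supposed to match has to be written as an escape such as `\x20` or `\s`.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Strip every whitespace character from the string.
inline std::string remove_ws(std::string in) {
    in.erase(std::remove_if(in.begin(), in.end(),
                            [](unsigned char c) { return std::isspace(c) != 0; }),
             in.end());
    return in;
}

inline std::string operator""_nows(const char* str, std::size_t length) {
    return remove_ws(std::string(str, length));
}

// Hypothetical example: a lowercase key, '=', and a decimal value.
// Literal spaces are spelled \x20 so they survive the stripping.
inline std::string key_value_pattern() {
    return R"re(
        ([a-z]+)
        \x20* = \x20*
        ([0-9]+)
    )re"_nows;
}
```

The resulting string is `([a-z]+)\x20*=\x20*([0-9]+)`, which can be passed straight to the `std::regex` constructor.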
This doesn't support # comments yet, but adding that should be easy. Simply create a function that strips them from a string, and do this:
std::string remove_comments(std::string const& s)
{
// Require whitespace before '#', so the '#' inside "[#]" is not treated as a comment start.
std::regex comment_re("\\s#[^\n]*");
return std::regex_replace(s, comment_re, "");
}
// above remove_comments not tested, but you get the idea
std::string operator""_verbose(const char* str, std::size_t length) {
return remove_ws( remove_comments( {str, str+length} ) );
}
Once finished, we get:
std::regex charref(R"---(
&[#] # Start of a numeric entity reference
(
0[0-7]+ # Octal form
| [0-9]+ # Decimal form
| x[0-9a-fA-F]+ # Hexadecimal form
)
; # Trailing semicolon
)---"_verbose);
and done.
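Putting the pieces together, a self-contained sketch could look like the following. It rests on one assumption that is not part of the idea as originally stated: a comment only starts at a `#` preceded by whitespace, so that the `#` inside `[#]` survives. The helpers `charref_pattern` and `charref` are hypothetical names for the illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>

// Assumption: a comment is whitespace, then '#', then the rest of the line.
// A '#' directly after a non-space character (as in "[#]") is left alone.
inline std::string remove_comments(const std::string& s) {
    static const std::regex comment_re("\\s#[^\n]*");
    return std::regex_replace(s, comment_re, "");
}

// Strip every whitespace character, including the newlines left behind.
inline std::string remove_ws(std::string in) {
    in.erase(std::remove_if(in.begin(), in.end(),
                            [](unsigned char c) { return std::isspace(c) != 0; }),
             in.end());
    return in;
}

inline std::string operator""_verbose(const char* str, std::size_t length) {
    return remove_ws(remove_comments(std::string(str, length)));
}

// Hypothetical helper returning the cleaned-up pattern text.
inline std::string charref_pattern() {
    return R"---(
        &[#]                # Start of a numeric entity reference
        (
            0[0-7]+         # Octal form
          | [0-9]+          # Decimal form
          | x[0-9a-fA-F]+   # Hexadecimal form
        )
        ;                   # Trailing semicolon
    )---"_verbose;
}

// Hypothetical helper: the compiled regex, built once on first use.
inline const std::regex& charref() {
    static const std::regex re(charref_pattern());
    return re;
}
```

Note that the stripping happens at run time on every evaluation of the literal, so caching the compiled `std::regex`, as `charref()` does here, keeps that cost to a single call.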
Upvotes: 5