Reputation: 2499
I am trying to use questions like this one to devise a regexp that matches a call in a very simplified Python-like syntax and yields the function name and all parameters, for example:
mycall(x, y, hello)
with the desired results:
mycall
x
y
hello
Of course it should also match noparams(), and any number of parameters. As for my simplifications: I only need parameter names, and I don't allow default parameters or anything other than a list of comma-separated names.
My attempts with variants of "(\\s*)([A-Za-z0-9_])+\\(\\)", just to match a function name with spaces at the beginning, are failing with this code:
// s holds the pattern string, q holds the line to be matched
std::regex fnregexp(s);
std::smatch pieces_match;
if (std::regex_match(q, pieces_match, fnregexp))
{
    std::cout << ">>>> '" << q << "'" << std::endl;
    // index 0 is the whole match, the following indices are the capture groups
    for (size_t i = 0; i < pieces_match.size(); ++i)
    {
        std::ssub_match sub_match = pieces_match[i];
        std::string piece = sub_match.str();
        std::cout << " submatch " << i << ": '" << piece << "'" << std::endl;
    }
}
For " hello()" I get the following output:
>>>> ' hello()'
submatch 0: ' hello()'
submatch 1: ' '
submatch 2: 'o'
With this very basic syntax, is it possible to find the name of the function and its parameters?
Cheers!
Upvotes: 0
Views: 129
Reputation: 7880
Use this for the conformance check:
^\\s*[A-Za-z_]\\w* *\\( *(?:[A-Za-z_]\\w* *(?:, *[A-Za-z_]\\w* *)*)?\\)$
If the input passes that check, use this to extract the parts of the signature:
\\w+
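Iterating over all matches of this pattern (for example with std::sregex_iterator), the first match is the function name and the remaining ones are the parameters.
EDIT: The correct syntax for a Python identifier is [A-Za-z_][A-Za-z0-9_]*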
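The two steps above could be combined roughly as follows. This is only a sketch: parse_call is a hypothetical helper name that does not appear in the question, and returning an empty vector for non-conforming input is an assumption made for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns {name, param1, param2, ...},
// or an empty vector if the input fails the conformance check.
std::vector<std::string> parse_call(const std::string& input)
{
    // Step 1: conformance check with the full pattern.
    static const std::regex shape(
        "^\\s*[A-Za-z_]\\w* *\\( *(?:[A-Za-z_]\\w* *(?:, *[A-Za-z_]\\w* *)*)?\\)$");
    // Step 2: every run of word characters is either the name or a parameter.
    static const std::regex word("\\w+");

    std::vector<std::string> parts;
    if (!std::regex_match(input, shape))
        return parts;

    // Walk over all non-overlapping matches of \w+ in order of appearance.
    for (std::sregex_iterator it(input.begin(), input.end(), word), end;
         it != end; ++it)
    {
        parts.push_back(it->str());
    }
    return parts;
}
```

With this sketch, parse_call("mycall(x, y, hello)") should give mycall, x, y, hello in that order, and parse_call("noparams()") should give just noparams.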
Upvotes: 1
Reputation: 2647
Matching simple function declarations with a regex is feasible. For more complicated things you have exactly the right idea in going with a real parser like Boost Spirit.
The bug in your question is a misplaced closing parenthesis in the regex. Compare:
"(\\s*)([A-Za-z0-9_])+\\(\\)" // yours
"(\\s*)([A-Za-z0-9_]+)\\(\\)" // correct
The capture group in your version matches only a single character at a time, and the + repeats the whole group. A repeated capture group keeps only the text from its last iteration, which here is the o. The correct version puts the + inside the group, so it captures hello as expected.
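To see the difference in isolation, here is a small sketch that runs both patterns against the same input and returns what the second capture group holds. second_group is a hypothetical helper introduced only for this comparison.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns the contents of capture group 2 after a
// full match, or an empty string if the pattern does not match the input.
std::string second_group(const std::string& pattern, const std::string& input)
{
    const std::regex re(pattern);
    std::smatch m;
    if (std::regex_match(input, m, re))
        return m[2].str();
    return "";
}
```

Called with " hello()" as input, the pattern from the question should yield o and the corrected pattern should yield hello.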
Upvotes: 1