Reputation: 3164
I am a beginner with regular expressions, although I know how to use them for searching and replacing.
I want to write a program that detects valid C++ identifiers, e.g.:
_ _f _8 my_name age_ var55 x_ a
And so on...
So I've tried this:
std::string str = "9_var 57age my_marks cat33 fit*ell +bin set_";
std::string pat = "[_a-z]+[[:alnum:]]*";
std::regex reg(pat, std::regex::icase);
std::smatch sm;
if (std::regex_search(str, sm, reg))
    std::cout << sm.str() << '\n';
else
    std::cout << "no valid C++ identifier found!\n";
The output:
_var
But as we know, a C++ identifier must not start with a digit, so 9_var mustn't be a candidate for a match. What I see here is that the regex engine takes only the substring _var from 9_var and treats it as a match. I want to discard the whole word, such as "9_var". I need some way to get only the matches that start with an alphabetic character or an underscore.
So how can I write a program that detects valid identifiers? Thank you!
Upvotes: -1
Views: 792
Reputation: 7838
Your pattern isn't checking for word boundaries, so it's able to match parts of a string. An updated regex looks like this:
std::string pat = "\\b[_a-z]+[[:alnum:]]*\\b";
With only that updated, the match is the first valid identifier in your string.
$ ./a.out
my_marks
If you want to find all the valid identifiers, you'll need to loop. You'll also need to filter out reserved words, but regex isn't a good solution for that.
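As a rough sketch of that loop, something like the following could work. The function name find_identifiers and the short keyword list are my own assumptions for illustration; the list is deliberately incomplete. I've also swapped in a slightly different pattern: the one above uses [[:alnum:]], which leaves out the underscore, so a name like a1_b would not match at all. The sketch uses explicit character classes instead of std::regex::icase.

```cpp
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the posts above): returns every
// identifier-shaped word in `text`, skipping a few reserved words.
std::vector<std::string> find_identifiers(const std::string& text) {
    // First character: a letter or underscore.
    // Remaining characters: letters, digits, or underscores.
    // \b on both sides stops matches that begin or end mid-word.
    static const std::regex ident(R"(\b[A-Za-z_][A-Za-z0-9_]*\b)");

    // Illustrative subset only; a real filter needs the full keyword list.
    static const std::unordered_set<std::string> reserved = {
        "int", "char", "if", "else", "for", "while", "return", "class"
    };

    std::vector<std::string> found;
    // std::sregex_iterator walks over all non-overlapping matches in order.
    for (std::sregex_iterator it(text.begin(), text.end(), ident), end;
         it != end; ++it) {
        const std::string word = it->str();
        if (reserved.count(word) == 0)
            found.push_back(word);
    }
    return found;
}
```

Called on the string from the question, this should return my_marks, cat33, fit, ell, bin and set_. Note that fit*ell yields two separate identifiers, because * is not a word character and so creates a boundary on each side.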
Upvotes: 1