Reputation: 694
I am in the process of learning Tcl and regular expressions. I have a task that I need a
Tcl script to perform, and I need some help.
I have a C++ header file that I want to parse into a table. It's a file that defines constants.
There are 2 forms that I need to parse:
const int a = 0x00000001; //Comment for this variable
const int b = 0x00000003; //Comment for this variable
and definitions in an enum like:
CONSTNAMEA = MACROA | MACROB | 0x000A, //Comment for this variable
CONSTNAMEB = MACROA | MACROB | 0x00C1, //Comment for this variable
In the first group, I needed to replace '=' and ';' with '|'. This was easily done with regsub. However, the second group is a bit more complicated and I can't seem to get it right.
What I want to be able to do is pull out 'CONSTNAMEA', '0x000A', and the comments into separate variables.
My thought is that I need three regexes: one to parse out the name, another for the number, and a third for the comment.
The name regex would be "Start at beginning of string and stop at '='"
The number would be '|' {anything} ','
And the comment would be "//" {anything} '\n'
Correct? I would appreciate any help with constructing these regular expressions!
Upvotes: 1
Views: 226
Reputation: 154876
A single regular expression should suffice to capture all three substrings from the line:
^\s*([a-zA-Z_]+)\s*=(?:\s*[a-zA-Z_]+\s*\|)*\s*([0-9a-fA-Fx]+),\s*\/\/(.*)$
The name will be available as the first group, the number as the second one, and the comment as the third one.
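As a rough sketch of how this could be used from Tcl, the expression can be passed to regexp with one variable per capture group. The helper name parseEnumLine and the sample line are only illustrative assumptions, not part of the original script. Note that the character class [a-zA-Z_] does not include digits, so a name such as CONST_1 would need the class widened.

```tcl
# Pattern from the answer, kept in braces so Tcl passes the
# backslashes and brackets to regexp unchanged.
set pattern {^\s*([a-zA-Z_]+)\s*=(?:\s*[a-zA-Z_]+\s*\|)*\s*([0-9a-fA-Fx]+),\s*\/\/(.*)$}

# Hypothetical helper: returns a list {name number comment} for a
# matching enum line, or an empty list when the line does not match.
proc parseEnumLine {line} {
    global pattern
    # "->" is just a throwaway variable name that receives the whole match.
    if {[regexp -- $pattern $line -> name number comment]} {
        return [list $name $number [string trim $comment]]
    }
    return {}
}

set line {CONSTNAMEA = MACROA | MACROB | 0x000A, //Comment for this variable}
set result [parseEnumLine $line]
puts [join $result "|"]
# prints: CONSTNAMEA|0x000A|Comment for this variable
```

Lines of the first form (const int a = ...;) do not match this pattern, so the helper returns an empty list for them and they can still be handled separately with regsub.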
To debug expressions like this one, I recommend a tool such as regexper, which converts a regular expression like the one above into an easy-to-follow railroad diagram.
Upvotes: 1