Reputation: 16994
I'd like to parse simple C++ typedef instructions such as
typedef Class NewNameForClass;
typedef Class::InsideTypedef NewNameForTypedef;
typedef TemplateClass<Arg1,Arg2> AliasForObject;
I have written the corresponding grammar, which I'd like to use for parsing:
Name <- ('_'|letter)('_'|letter|digit)*
Type <- Name
Type <- Type::Name
Type <- Name Templates
Templates <- '<' Type (',' Type)* '>'
Instruction <- "typedef" Type Name ';'
Once this is parsed, all I want to do is generate XML with the same information (but laid out differently).
What is the most effective language for writing such a program? How can this be achieved?
EDIT: Here is what I have come up with using Boost Spirit (it's not perfect, but it's good enough for me, at least for now):
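rule<> sep_p = space_p;
rule<> name_p = (ch_p('_') | alpha_p) >> *(ch_p('_') | alpha_p | digit_p);
// A (possibly qualified) name with optional '*', "const" and '&' suffixes.
rule<> type_p = name_p
    >> !(*sep_p >> str_p("::") >> *sep_p >> name_p)
    >> *(*sep_p >> ch_p('*'))
    >> !(*sep_p >> str_p("const"))
    >> !(*sep_p >> ch_p('&'));
// No trailing *sep_p here: typedef_p requires +sep_p right after the type.
rule<> templated_type_p = name_p >> *sep_p
    >> ch_p('<') >> *sep_p
    >> (*sep_p >> type_p >> *sep_p) % ch_p(',')
    >> ch_p('>');
// templated_type_p goes first: '|' takes the first alternative that matches,
// and type_p alone would match just the name in front of '<'.
rule<> typedef_p = *sep_p
    >> str_p("typedef")
    >> +sep_p >> (templated_type_p | type_p)
    >> +sep_p >> name_p
    >> *sep_p >> ch_p(';') >> *sep_p;
rule<> typedef_list_p = *typedef_p;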
Upvotes: 1
Views: 869
Reputation: 754853
I would alter the grammar slightly:
ShortName <- ('_'|letter)('_'|letter|digit)*
Name <- ShortName
Name <- Name::ShortName
Type <- Name
Type <- Name Templates
Templates <- '<' Type (',' Type)* '>'
Instruction <- "typedef" Type Name ';'
Also your grammar leaves out the following cases
Parsing a grammar (I love the irony) is a fairly straightforward operation. If you wanted to actually use the grammar in a functional way, I would say the best bet is a lex/yacc combination.
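For a grammar this small, a hand-written recursive-descent parser is another option besides lex/yacc. The following is only a minimal sketch of the altered grammar above, using the C++ standard library alone. The names `Type`, `Typedef`, `Parser` and `parseTypedef` are made up for illustration, and the left-recursive `Name <- Name::ShortName` rule is rewritten as a loop, because recursive descent cannot handle left recursion directly.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Type <- Name | Name Templates
struct Type {
    std::string name;        // qualified name, e.g. "Class::InsideTypedef"
    std::vector<Type> args;  // template arguments, empty when there are none
};

// Instruction <- "typedef" Type Name ';'
struct Typedef {
    Type type;
    std::string alias;
};

class Parser {
public:
    explicit Parser(std::string text) : src_(std::move(text)), pos_(0) {}

    bool instruction(Typedef& out) {
        out = Typedef();
        return keyword("typedef") && type(out.type) && name(out.alias) && symbol(";");
    }

private:
    static bool isNameStart(char c) {
        return c == '_' || std::isalpha(static_cast<unsigned char>(c));
    }
    static bool isNameChar(char c) {
        return c == '_' || std::isalnum(static_cast<unsigned char>(c));
    }
    void skipSpace() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    // Consumes the literal s (after optional whitespace) if it comes next.
    bool symbol(const std::string& s) {
        skipSpace();
        if (src_.compare(pos_, s.size(), s) != 0) return false;
        pos_ += s.size();
        return true;
    }
    // Like symbol(), but the word must not run on into a longer identifier.
    bool keyword(const std::string& k) {
        std::size_t save = pos_;
        if (symbol(k) && (pos_ == src_.size() || !isNameChar(src_[pos_]))) return true;
        pos_ = save;
        return false;
    }
    // ShortName <- ('_'|letter)('_'|letter|digit)*   (appended to out)
    bool shortName(std::string& out) {
        skipSpace();
        if (pos_ >= src_.size() || !isNameStart(src_[pos_])) return false;
        std::size_t start = pos_;
        while (pos_ < src_.size() && isNameChar(src_[pos_])) ++pos_;
        out += src_.substr(start, pos_ - start);
        return true;
    }
    // Name <- ShortName ('::' ShortName)*
    bool name(std::string& out) {
        if (!shortName(out)) return false;
        while (symbol("::")) {
            out += "::";
            if (!shortName(out)) return false;
        }
        return true;
    }
    // Type <- Name ('<' Type (',' Type)* '>')?
    bool type(Type& out) {
        if (!name(out.name)) return false;
        if (!symbol("<")) return true;
        do {
            Type arg;
            if (!type(arg)) return false;
            out.args.push_back(std::move(arg));
        } while (symbol(","));
        return symbol(">");
    }

    std::string src_;
    std::size_t pos_;
};

// Parses a single typedef instruction; returns false on a syntax error.
bool parseTypedef(const std::string& text, Typedef& out) {
    Parser parser(text);
    return parser.instruction(out);
}
```

Calling `parseTypedef("typedef TemplateClass<Arg1,Arg2> AliasForObject;", td)` fills `td.type.name`, `td.type.args` and `td.alias`. Pointers, `const`, references and multiple typedef targets are deliberately not handled here.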
But from your question it appears that you want to spit it out to another format. There really isn't a language designed for this, so I would say use whatever language you're most comfortable with.
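Since the question already uses C++, the output step could look roughly like the sketch below. The question does not say what the XML should look like, so the `<typedefs>`/`<typedef>` layout, the `TypedefInfo` record and the function names are assumptions made purely for illustration. The one detail that matters in any layout is escaping `<` and `>`, because template types contain them.

```cpp
#include <string>
#include <vector>

// Hypothetical record that a parser would produce for each typedef.
struct TypedefInfo {
    std::string type;   // e.g. "TemplateClass<Arg1,Arg2>"
    std::string alias;  // e.g. "AliasForObject"
};

// Escapes the characters that are special inside XML attribute values.
std::string escapeXml(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '>': out += "&gt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;
        }
    }
    return out;
}

// Emits one <typedef/> element per record (assumed layout, not a standard).
std::string toXml(const std::vector<TypedefInfo>& items) {
    std::string xml = "<typedefs>\n";
    for (const TypedefInfo& t : items) {
        xml += "  <typedef name=\"" + escapeXml(t.alias) +
               "\" type=\"" + escapeXml(t.type) + "\"/>\n";
    }
    xml += "</typedefs>\n";
    return xml;
}
```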
Edit
The OP asked about multiple typedef targets. It's perfectly legal for a typedef declaration to have more than one target. For example:
typedef _SomeStruct SomeStruct, *PSomeStruct;
This creates 2 typedef names.
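A small compile-time check makes the two names concrete. This is only a sketch: the struct is called `SomeStructTag` here instead of `_SomeStruct`, because identifiers that begin with an underscore followed by an uppercase letter are reserved in C++, and `readThrough` is a made-up helper.

```cpp
#include <type_traits>

struct SomeStructTag { int value; };

// One typedef declaration, two declarators: a plain alias and a pointer alias.
typedef SomeStructTag SomeStruct, *PSomeStruct;

static_assert(std::is_same<SomeStruct, SomeStructTag>::value,
              "SomeStruct names the struct itself");
static_assert(std::is_same<PSomeStruct, SomeStructTag*>::value,
              "PSomeStruct names a pointer to the struct");

// Uses the pointer alias like any other pointer type.
int readThrough(PSomeStruct p) { return p->value; }
```

A parser that wants to cover this case has to accept a comma-separated list of declarators after the type, each with its own optional `*`.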
Upvotes: 1
Reputation: 545628
Well, since you're apparently already working with/on C++, have you considered using Boost.Spirit? This allows you to hard-code the grammar inline in C++ as a domain-specific language and program against it in normal C++ code.
Upvotes: 1