Reputation: 5750
I am trying to write a handler to extract the parameters from a function call. The parameters sit between ( and ) and are delimited by a comma ','. A parameter may also be an array, which is a comma-delimited list wrapped in [].
Examples of what I'm trying to decode:
testA(aaaa, [bbbb,cccc,dddd], eeee)
or
testB([aaaa,bbbb,cccc], dddd, [eeee,ffff])
Any combination and any number of parameters is possible. What I want from these is a list containing:
for testA:
0 : aaaa
1 : [bbbb,cccc,dddd]
2 : eeee
for testB:
0 : [aaaa,bbbb,cccc]
1 : dddd
2 : [eeee,ffff]
I'm trying to write a parser that will give me this result, but I would prefer a regular expression that does the same.
This is my coded solution, which works, written in C++ for Qt 5.6:
int intOpSB, intPStart;
//Analyse and count the parameters
intOpSB = intPStart = 0;
for( int p=0; p<strParameters.length(); p++ ) {
    const QChar qc = strParameters.at(p);
    if ( qc == clsXMLnode::mcucOpenSquareBracket ) {
        intOpSB++;
        continue;
    } else if ( qc == clsXMLnode::mcucCloseSquareBracket ) {
        intOpSB--;
        continue;
    }
    if ( (intOpSB == 0 && qc == clsXMLnode::mcucArrayDelimiter)
      || p == strParameters.length() - 1 ) {
        if ( strParameters.at(intPStart) == clsXMLnode::mcucArrayDelimiter ) {
            //Skip over the opening bracket or array delimiter
            intPStart++;
        }
        if ( intPStart > p ) {
            continue;
        }
        int intEnd = p;
        while( true ) {
            if ( intEnd > 0 && (strParameters.at(intEnd) == clsXMLnode::mcucArrayDelimiter) ) {
                //We don't want the delimiter or the closing square bracket in the parameter
                intEnd--;
            } else {
                break;
            }
        }
        if ( intEnd > intPStart ) {
            QString strParameter = strParameters.mid(intPStart, intEnd - intPStart + 1);
            //Update remaining parameters, skipping the parameter and any delimiter
            strParameters = strParameters.mid(strParameter.length() + 1);
            //Remove any quotes
            strParameter = strParameter.replace("\"", "");
            strParameter = strParameter.replace("\'", "");
            //Add the parameter
            mslstParameters.append(strParameter);
            //Reset parameter start
            intPStart = 0;
            p = -1;
        }
    }
}
References:
mcucOpenSquareBracket is a constant defined as '['
mcucCloseSquareBracket is a constant defined as ']'
mcucArrayDelimiter is a constant defined as ','
mslstParameters is a member defined as QStringList
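For comparison, the same bracket-counting idea can be sketched in standard C++ without Qt. This is only an illustration, not the code above: splitParameters is a hypothetical helper name, it takes the text between ( and ) as a std::string, trims spaces around each parameter, and silently drops empty parameters.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the Qt code above): splits a parameter
// list on top-level commas and leaves [ ... ] arrays intact.
std::vector<std::string> splitParameters(const std::string& params) {
    std::vector<std::string> result;
    std::string current;
    int depth = 0;  // how many '[' are currently open

    // Stores the text collected so far, trimmed of surrounding spaces.
    auto flush = [&]() {
        const auto first = current.find_first_not_of(' ');
        const auto last = current.find_last_not_of(' ');
        if (first != std::string::npos) {
            result.push_back(current.substr(first, last - first + 1));
        }
        current.clear();
    };

    for (char c : params) {
        if (c == '[') {
            ++depth;
        } else if (c == ']') {
            --depth;
        }
        if (c == ',' && depth == 0) {
            flush();          // a comma outside any array ends the parameter
        } else {
            current += c;     // commas inside an array stay in the parameter
        }
    }
    flush();                  // the last parameter has no trailing comma
    return result;
}
```

Calling splitParameters("aaaa, [bbbb,cccc,dddd], eeee") yields the three entries listed for testA, with the array kept as a single string.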
Upvotes: 0
Views: 325
Reputation: 275310
auto term = "(?:[^,<]*)"s;
auto chain = "(?:(?:"+term+",)*"+term+")"s;
auto clause = "(?:(?:"+term+")|(?:<" + chain + ">))"s;
auto re_str = "^(?:("+term+")|(?:<("+chain+")>))" "(?:|,((?:"+clause+",)*"+clause+"))";
re_str takes your string and splits the first term or chain off from the tail.
It returns up to 3 sub-matches. The first is a lone term. The second is a comma-delimited chain of terms. The third is the rest of the string after the ',' that follows the first term or chain.
The tail is going to be empty, or another string that can be parsed using the above regular expression.
Chains of terms can be parsed by the same regular expression.
I matched <>-delimited chains of terms, not []-delimited ones, because I got bored of typing \\ escapes.
You also want to discard whitespace around clauses. I omitted that, but it should be easy to stitch in.
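A minimal sketch of how re_str could be applied repeatedly to peel one clause at a time off the front of the string. The function name peelClauses and the loop are assumptions for illustration only. The sketch also assumes std::regex_match (a full match) is the right call, because re_str has no trailing anchor and a plain search could stop before the tail. Input is expected in the <> form with no whitespace.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical driver: applies re_str in a loop, collecting the first
// term or chain on each pass and continuing with the tail.
std::vector<std::string> peelClauses(std::string input) {
    using namespace std::string_literals;
    const auto term = "(?:[^,<]*)"s;
    const auto chain = "(?:(?:" + term + ",)*" + term + ")"s;
    const auto clause = "(?:(?:" + term + ")|(?:<" + chain + ">))"s;
    const auto re_str = "^(?:(" + term + ")|(?:<(" + chain + ")>))"
                        "(?:|,((?:" + clause + ",)*" + clause + "))";
    const std::regex re(re_str);

    std::vector<std::string> out;
    while (true) {
        std::smatch m;
        if (!std::regex_match(input, m, re)) {
            break;                                  // malformed input: stop
        }
        if (m[2].matched) {
            out.push_back("<" + m[2].str() + ">");  // sub-match 2: a chain
        } else {
            out.push_back(m[1].str());              // sub-match 1: a lone term
        }
        if (!m[3].matched) {
            break;                                  // no tail left
        }
        input = m[3].str();                         // sub-match 3: the tail
    }
    return out;
}
```

For "aaaa,<bbbb,cccc,dddd>,eeee" this produces aaaa, <bbbb,cccc,dddd> and eeee. The content of a chain (sub-match 2) could be passed through the same function to split it into its terms.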
Upvotes: 2