Reputation: 2706
I'm trying to parse a string with the following structure in PHP:
a,b,c(d,e,f(g),h,i(j,k)),l,m,n(o),p
For example, a "real" string will be:
id,topic,member(name,email,group(id,name)),message(id,title,body)
My end result should be an array:
[
    id => null,
    topic => null,
    member => [
        name => null,
        email => null,
        group => [
            id => null,
            name => null
        ]
    ],
    message => [
        id => null,
        title => null,
        body => null
    ]
]
I've tried a recursive regex, but got totally lost. I've had some success iterating over the string's characters, but that seems a bit "over complicated", and I'm sure this is something a regex can handle; I just don't know how.
The purpose is to parse a fields query parameter for a REST API, so that the client can select the fields they want from a complex object collection, and I don't want to limit the "depth" of the field selection.
Upvotes: 4
Views: 2192
Reputation: 43169
As Wiktor pointed out, this can be achieved with the help of a lexer. The following answer uses a class originally from Nikita Popov, which can be found here.
It skims through the string and searches for matches as defined in the $tokenMap. These are defined as T_FIELD, T_SEPARATOR, T_OPEN and T_CLOSE. The values found are put in an array called $structure.
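To make the shape of that token array concrete, here is a minimal sketch of the tokenizing step. It is not the linked Lexer class: tokenize() is a hypothetical stand-in built on preg_match_all, and the constant values are arbitrary choices for this example. Each token is a [value, type] pair.

```php
<?php
// Token types; the values are arbitrary and only need to be distinct.
// The guards avoid redefinition if the constants already exist.
defined('T_FIELD')     || define('T_FIELD', 1);
defined('T_SEPARATOR') || define('T_SEPARATOR', 2);
defined('T_OPEN')      || define('T_OPEN', 3);
defined('T_CLOSE')     || define('T_CLOSE', 4);

// Hypothetical stand-in for the lexer: returns a list of [value, type] pairs.
function tokenize(string $input): array
{
    // A token is either a run of characters that are not "," "(" ")"
    // or exactly one of those three punctuation characters.
    preg_match_all('/[^,()]+|[,()]/', $input, $matches);

    $punctuation = [',' => T_SEPARATOR, '(' => T_OPEN, ')' => T_CLOSE];
    $tokens = [];
    foreach ($matches[0] as $value) {
        // Anything that is not punctuation is a field name.
        $tokens[] = [$value, $punctuation[$value] ?? T_FIELD];
    }
    return $tokens;
}

print_r(tokenize('a,b(c)'));
```

The two regex alternatives together cover every possible character, so no part of the input is silently skipped.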
Afterwards we need to loop over this array and build the structure out of it. As there can be multiple nestings, I chose a recursive approach (generate()).
A demo can be found on ideone.com.
The actual code with explanations:
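// token types used in the $tokenMap (any distinct values work)
define('T_FIELD', 1);
define('T_SEPARATOR', 2);
define('T_OPEN', 3);
define('T_CLOSE', 4);

// this is our $tokenMap
$tokenMap = array(
    '[^,()]+' => T_FIELD,     # not comma or parentheses
    ','       => T_SEPARATOR, # a comma
    '\('      => T_OPEN,      # an opening parenthesis
    '\)'      => T_CLOSE      # a closing parenthesis
);

// this is your string
$string = "id,topic,member(name,email,group(id,name)),message(id,title,body)";

// a recursive function to actually build the structure;
// $i is passed by reference so the caller resumes after the nested group
// instead of walking over the nested tokens a second time
function generate(array $arr, &$i = 0) {
    $output = array();
    $current = null;
    for (; $i < count($arr); $i++) {
        list($element, $type) = $arr[$i];
        if ($type == T_OPEN) {
            $i++; // step past the opening parenthesis
            $output[$current] = generate($arr, $i);
        } elseif ($type == T_CLOSE) {
            return $output; // $i stays on ")"; the caller's $i++ skips it
        } elseif ($type == T_FIELD) {
            $output[$element] = null;
            $current = $element;
        }
    }
    return $output;
}

// the Lexer class mentioned above must be loaded before this point
$lex = new Lexer($tokenMap);
$structure = $lex->lex($string);
print_r(generate($structure));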
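Since the question's goal is field selection for a REST API, here is one possible way to consume the resulting array. This is only a sketch: selectFields() and the sample record are hypothetical and not part of the original answer. A null entry keeps the value as it is, and an array entry recurses into the nested data.

```php
<?php
// Hypothetical helper: keeps only the keys of $data that are named in $fields.
function selectFields(array $data, array $fields): array
{
    $result = [];
    foreach ($fields as $name => $sub) {
        if (!array_key_exists($name, $data)) {
            continue; // requested field is absent in the data, skip it
        }
        // Recurse only when both the selection and the data are nested.
        $result[$name] = (is_array($sub) && is_array($data[$name]))
            ? selectFields($data[$name], $sub)
            : $data[$name];
    }
    return $result;
}

// Shape as produced for "id,member(name,group(id))"
$fields = ['id' => null, 'member' => ['name' => null, 'group' => ['id' => null]]];

// Made-up sample record
$record = [
    'id'     => 7,
    'topic'  => 'Lexers',
    'member' => [
        'name'  => 'Ann',
        'email' => 'ann@example.com',
        'group' => ['id' => 2, 'name' => 'admins'],
    ],
];

print_r(selectFields($record, $fields));
```

Because the helper recurses wherever the selection is nested, it places no limit on depth, which matches the requirement in the question.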
Upvotes: 3