Reputation:
I have a question: I have a file like this, and I want to read it with PHP.
NAME
{
    title
    (
        A_STRING
    );
    settings
    {
        SetA( 15, 15 );
        SetB( "test" );
    }
    desc
    {
        Desc
        (
            A_STRING
        );
        Cond
        (
            A_STRING
        );
    }
}
I want:
$arr['NAME']['title'] = "A_STRING";
$arr['NAME']['settings']['SetA'] = "15, 15";
$arr['NAME']['settings']['SetB'] = "test";
$arr['NAME']['desc']['Desc'] = "A_STRING";
$arr['NAME']['desc']['Cond'] = "A_STRING";
I don't know where to start :/. The variables aren't always the same. Can someone give me a hint on how to parse such a file?
Thx
Upvotes: 3
Views: 1642
Reputation: 3200
If the files are this simple, rolling your own homegrown parser is probably a lot easier. Even with a lexer generator you would end up writing regexes anyway. Here's a quick hack example (in.txt should contain the input you provided above):
<pre>
<?php
$input_str = file_get_contents("in.txt");
print_r(parse_lualike($input_str));
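function parse_lualike($str){
    // Strip newlines and semicolons; they carry no meaning in this format.
    $str = preg_replace('/[\r\n;]/', '', $str);
    // Tokens: identifiers, "( ... )" values (captured in group 1), and braces.
    // The lazy [^)]*? keeps trailing whitespace out of the captured value.
    preg_match_all('/[a-zA-Z][a-zA-Z0-9_]*|[(]\s*([^)]*?)\s*[)]|[{]|[}]/', $str, $matches);
    $tree = array();
    $stack = array();   // references to the currently open nodes
    $pos = 0;
    $ident = '';        // most recently seen identifier
    $stack[$pos] = &$tree;
    foreach($matches[0] as $index => $token){
        if($token == '{'){
            // Open a nested block under the last identifier.
            $node = &$stack[$pos];
            $node[$ident] = array();
            $pos++;
            $stack[$pos] = &$node[$ident];
        }elseif($token == '}'){
            // Close the current block.
            unset($stack[$pos]);
            $pos--;
        }elseif($token[0] == '('){
            // Assign the parenthesized value to the last identifier.
            $stack[$pos][$ident] = $matches[1][$index];
        }else{
            $ident = $token;
        }
    }
    return $tree;
}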
?>
Quick explanation: The first preg_replace removes all newlines and semicolons, as they seem superfluous. The next part divides the input string into 'tokens': names, braces, and whatever sits between parentheses. Add a print_r($matches); there to see what it does.
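To make the tokenizing step concrete, here is a small sketch that runs a pattern like the one in the answer on a single line of input. The `$sample` string is an assumption made up for illustration: it is the `settings` block from the question with newlines and semicolons already removed.

```php
<?php
// Token pattern modelled on the one used in parse_lualike().
$pattern = '/[a-zA-Z][a-zA-Z0-9_]*|[(]\s*([^)]*)\s*[)]|[{]|[}]/';

// Hypothetical sample: the "settings" block, already stripped of newlines and semicolons.
$sample = 'settings{SetA( 15, 15 )SetB( "test" )}';

preg_match_all($pattern, $sample, $matches);

// $matches[0] holds the full tokens in order of appearance:
// an identifier, a brace, or a complete "( ... )" group.
foreach ($matches[0] as $token) {
    echo $token, "\n";
}
```

`$matches[1]` runs parallel to `$matches[0]` and holds the text captured between the parentheses; it is an empty string for identifier and brace tokens.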
Then there's just a really hackish state machine (read: a for loop) that goes through the tokens and adds them to a tree. It also keeps a stack so it can build nested trees.
Please note that this algorithm is in no way tested. It will probably break when presented with "real life" input. For instance, a parenthesis inside a value will cause trouble. Also note that it doesn't remove quotes from strings. I'll leave all that to someone else...
But, as you requested, it's a start :)
Cheers!
PS. Here's the output of the code above, for convenience:
Array
(
    [NAME] => Array
        (
            [title] => A_STRING
            [settings] => Array
                (
                    [SetA] => 15, 15
                    [SetB] => "test"
                )

            [desc] => Array
                (
                    [Desc] => A_STRING
                    [Cond] => A_STRING
                )

        )

)
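As the output shows, SetB still carries its double quotes, while the question asked for a plain test. One possible way to handle that is a post-processing pass over the parsed tree. This is only a sketch: `strip_quotes` is a hypothetical helper name, not part of the answer's code, and it assumes values are wrapped in at most one pair of double quotes.

```php
<?php
// Hypothetical helper: walks the parsed tree and removes one pair of
// surrounding double quotes from each string leaf.
function strip_quotes(array $tree): array {
    foreach ($tree as $key => $value) {
        if (is_array($value)) {
            // Nested block: recurse.
            $tree[$key] = strip_quotes($value);
        } else {
            $value = trim($value);
            if (strlen($value) >= 2 && $value[0] === '"' && substr($value, -1) === '"') {
                $value = substr($value, 1, -1);
            }
            $tree[$key] = $value;
        }
    }
    return $tree;
}

// Sample tree shaped like part of the output above.
$parsed = array('NAME' => array('settings' => array('SetA' => '15, 15', 'SetB' => '"test"')));
$clean = strip_quotes($parsed);
echo $clean['NAME']['settings']['SetB'], "\n"; // prints: test
```

Unquoted values such as 15, 15 pass through unchanged apart from whitespace trimming. Parentheses inside a value would still need a smarter tokenizer.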
Upvotes: 0
Reputation: 5743
This is not an answer, but a suggestion:
Maybe you can modify your input code to be compatible with JSON which has similar syntax. JSON parsers and generators are available for PHP.
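As a rough illustration of that suggestion, here is the data from the question rewritten by hand as JSON and read with PHP's built-in json_decode. The JSON layout is an assumption about how the file could be re-encoded, with SetA kept as the string "15, 15" to match the array the question asks for.

```php
<?php
// The question's data re-encoded as JSON (assumed layout, not the original file format).
$json = <<<'JSON'
{
  "NAME": {
    "title": "A_STRING",
    "settings": { "SetA": "15, 15", "SetB": "test" },
    "desc": { "Desc": "A_STRING", "Cond": "A_STRING" }
  }
}
JSON;

// Passing true as the second argument returns nested associative arrays
// instead of stdClass objects.
$arr = json_decode($json, true);

echo $arr['NAME']['settings']['SetB'], "\n"; // prints: test
```

This only helps if you control how the input files are produced. If they come from another program, a custom parser is still needed.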
Upvotes: 2
Reputation: 7615
This looks like a real grammar - you should use a parser generator. This discussion should get you started.
There are a few ready-made options for PHP: a lexer generator module and a parser generator module.
Upvotes: 5