Reputation: 532
This might be doable with a regular expression, but I have no idea how. I am trying to parse a string on a given delimiter, except that anything inside a set of brackets should be parsed differently. As I am a visual learner, let me show you an example of what I am attempting to achieve. (PS: the string is parsed from a URL.)
Given the string input:
String1,String2(data1,data2,data3),String3,String4
How can I "transform" this string into this array:
{
"String1": "String1",
"String2": [
"data1",
"data2",
"data3"
],
"String3": "String3",
"String4": "String4"
}
Formatting doesn't have to be this strict as I'm just attempting to make a simple API for my project.
Obviously, something like
array explode ( string $delimiter , string $string [, int $limit = PHP_INT_MAX ] )
wouldn't work, because there are commas inside the brackets as well. I've attempted manual parsing, looking at one character at a time, but I fear for the performance and it doesn't actually work anyway. I've pasted the gist of my attempt:
https://gist.github.com/Fudge0952/24cb4e6a4ec288a4c492
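To make the problem concrete, here is a small sketch of what explode() returns for the input above. The variable names are only for illustration.

```php
<?php
// Naive split: explode() knows nothing about brackets.
$input = 'String1,String2(data1,data2,data3),String3,String4';
$parts = explode(',', $input);

print_r($parts);
// The bracketed group is torn apart: 'String2(data1', 'data2' and 'data3)'
// come back as three separate elements, six elements in total.
```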
Upvotes: 2
Views: 117
Reputation: 11689
This is a solution with preg_match_all():
$string = 'String1,String2(data1,data2,data3),String3,String4,String5(data4,data5,data6)';
$pattern = '/([^,(]+)(\(([^)]+)\))?/';
preg_match_all( $pattern, $string, $matches );

$result = array();
foreach( $matches[1] as $key => $val )
{
    // Group 3 is an empty string when the name has no brackets.
    // Compare strictly so that a bracket content of "0" is not treated as missing.
    if( $matches[3][$key] !== '' )
    { $add = explode( ',', $matches[3][$key] ); }
    else
    { $add = $val; }
    $result[$val] = $add;
}
$json = json_encode( $result );
Pattern explanation:
([^,(]+)         group 1: any chars except ‘,’ and ‘(’
(\(([^)]+)\))?   group 2: zero or one occurrence of brackets wrapping:
   └──┬──┘
   ┌──┴──┐
   ([^)]+)       group 3: any chars except ‘)’
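As a rough illustration of how the groups line up, here is the same pattern applied to a shorter, made-up input. It assumes the default PREG_PATTERN_ORDER behaviour, where a group that did not take part in a match is reported as an empty string.

```php
<?php
// Same pattern as above, applied to a shorter input for illustration.
$pattern = '/([^,(]+)(\(([^)]+)\))?/';
preg_match_all($pattern, 'A,B(x,y),C', $matches);

// $matches[1] holds the names, $matches[3] the raw bracket contents.
// Names without brackets get an empty string in group 3.
$names    = $matches[1]; // ['A', 'B', 'C']
$contents = $matches[3]; // ['', 'x,y', '']
```

Because group 3 stops at the first ‘)’, this pattern only handles one level of brackets.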
Upvotes: 1
Reputation: 96159
You can either build an ad-hoc parser like (mostly untested):
<?php
$p = '!
    [^,\(\)]+  # token: String
  | ,          # token: comma
  | \(         # token: open
  | \)         # token: close
!x';
$input = 'String1,String2(data1,data2,data3,data4(a,b,c)),String3,String4';

preg_match_all($p, $input, $m);
// using a norewinditerator, so we can use nested foreach-loops on the same iterator
$it = new NoRewindIterator(
    new ArrayIterator($m[0])
);

var_export( foo( $it ) );

function foo($tokens, $level=0) {
    $result = [];
    $current = null;
    foreach( $tokens as $t ) {
        switch($t) {
            case ')':
                // a plain break would only leave the switch;
                // break 2 leaves the foreach loop and ends this nesting level
                break 2;
            case '(':
                if ( is_null($current) ) {
                    // an opening bracket must follow a name
                    throw new Exception('moo');
                }
                $tokens->next();
                $result[$current] = foo($tokens, $level+1);
                $current = null;
                break;
            case ',':
                if ( !is_null($current) ) {
                    $result[] = $current;
                    $current = null;
                }
                break;
            default:
                $current = $t;
                break;
        }
    }
    // flush a trailing plain value that was not followed by a comma
    if ( !is_null($current) ) {
        $result[] = $current;
    }
    return $result;
}
prints
array (
  0 => 'String1',
  'String2' =>
  array (
    0 => 'data1',
    1 => 'data2',
    2 => 'data3',
    'data4' =>
    array (
      0 => 'a',
      1 => 'b',
      2 => 'c',
    ),
  ),
  1 => 'String3',
  2 => 'String4',
)
(but will most certainly fail horribly for not-well-formed strings)
or take a look at a lexer/parser generator such as PHP_LexerGenerator and PHP_ParserGenerator.
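Since the ad-hoc parser above assumes well-formed input, one option is to reject obviously broken strings before parsing. The following is only a minimal sketch: hasBalancedBrackets() is a hypothetical helper, not part of the code above, and it checks bracket balance only (it does not catch, for example, an opening bracket without a preceding name).

```php
<?php
/**
 * Hypothetical helper: returns true when every '(' has a matching ')'
 * and no ')' appears before its '('.
 */
function hasBalancedBrackets(string $input): bool
{
    $depth = 0;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        if ($input[$i] === '(') {
            $depth++;
        } elseif ($input[$i] === ')') {
            $depth--;
            if ($depth < 0) {
                return false; // closing bracket without an opening one
            }
        }
    }
    return $depth === 0; // anything above zero means an unclosed bracket
}

var_dump(hasBalancedBrackets('a,b(c,d(e)),f')); // bool(true)
var_dump(hasBalancedBrackets('a,b(c,d'));       // bool(false)
var_dump(hasBalancedBrackets('a),b('));         // bool(false)
```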
Upvotes: 1
Reputation: 4849
While you could try to split your initial string on commas and ignore anything in parentheses for the first split, this necessarily makes assumptions about what those string values can actually be (possibly requiring escaping/unescaping values depending on what those strings have to contain).
If you have control over the data format, though, it would be far better to just start with JSON. It's well-defined and well-supported.
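For reference, the first approach mentioned above (splitting on commas while ignoring anything in parentheses) could look roughly like this. splitTopLevel() is a hypothetical helper, and the sketch bakes in exactly the kind of assumptions described: a single-character delimiter, well-formed brackets, and values that never contain a literal ',', '(' or ')'.

```php
<?php
/**
 * Hypothetical helper: splits on $delimiter only while outside parentheses.
 * Assumes well-formed input and no escaping inside values.
 */
function splitTopLevel(string $input, string $delimiter = ','): array
{
    $parts = [];
    $buffer = '';
    $depth = 0;
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '(') {
            $depth++;
        } elseif ($char === ')') {
            $depth--;
        }
        if ($char === $delimiter && $depth === 0) {
            // top-level delimiter: close the current part
            $parts[] = $buffer;
            $buffer = '';
        } else {
            $buffer .= $char;
        }
    }
    $parts[] = $buffer; // last part has no trailing delimiter
    return $parts;
}

$parts = splitTopLevel('String1,String2(data1,data2,data3),String3,String4');
// ['String1', 'String2(data1,data2,data3)', 'String3', 'String4']
```

Each returned part still has to be taken apart into a name and its bracket contents, which is where the escaping questions start to matter.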
Upvotes: 1